How to do computations with a specific column of a stream?
I'm trying to calculate the mean of a specific column of a huge dataset (given as a .csv file) using a stream in Mathematica.
I tried the following:
stream = OpenRead["Directory/data.csv"];
list = {};
Do[record = Read[stream, Record];
 AppendTo[list, ToExpression@First@StringSplit[record, ","]];, {10000001}]
Mean@Rest@list
Using this already gives me the mean of the first column of the dataset. How should I change the code to select the 4th column of the dataset with the heading "age" to calculate the mean of the age? The "10000001" in the code above refers to the number of rows (including the heading) that are contained in the csv-file.
Thanks a lot for any help!
Answers 1
You could simply change First to (#[[4]] &), a function which takes the fourth part of its argument (its argument being a particular row). I'm using parentheses around the function because I don't know exactly how the @s will be parsed.
Edit: there's no need for an implicitly defined function; it's of course possible to just take the fourth part of the list directly, as in StringSplit[record, ","][[4]].
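For a file of this size it may also be worth keeping a running sum instead of growing a list with AppendTo, since each AppendTo call copies the list. The following is only a sketch under some assumptions: the path is the placeholder from the question, the first line is the header, "age" is the fourth field, and no field contains a quoted comma or is empty at the start of a row (StringSplit drops leading empty strings by default, which would shift the columns). The names total and count are made up for illustration.

```mathematica
(* Sketch: mean of the 4th column ("age"), reading the file one record at a time. *)
stream = OpenRead["Directory/data.csv"];   (* placeholder path from the question *)

Read[stream, Record];                      (* read and discard the header row *)

total = 0; count = 0;                      (* hypothetical accumulator names *)

(* Read returns EndOfFile once the stream is exhausted, so no row count is needed. *)
While[(record = Read[stream, Record]) =!= EndOfFile,
  (* take the 4th field directly with Part and convert the string to a number *)
  total += ToExpression[StringSplit[record, ","][[4]]];
  count++
];

Close[stream];

N[total/count]                             (* mean of the "age" column *)
```

Looping until EndOfFile means the hard-coded 10000001 is no longer needed, and only one row is held in memory at a time.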