Converting Strings to Reals in a List
I am new to Mathematica and working with streamed data:
str = OpenRead["https://s3.amazonaws.com/nyc-tlc/trip+data/yellow_tripdata_2019-03.csv"];
memory = ReadList[str, Record, 500];
I'd now like to convert the records from the streamed file into a new list in which all numbers are reals. It seems that every entry is stored as a string by default. Is that correct? In particular, I need to convert the following columns to reals: VendorID, passenger_count, trip_distance, RatecodeID, PULocationID, DOLocationID, payment_type, fare_amount, extra, mta_tax, tip_amount, tolls_amount, improvement_surcharge, total_amount and congestion_surcharge.
Unfortunately, I am struggling to do this - any help would be greatly appreciated!
For a test, I only read the first 5 records:
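As a rough sketch of this step, assuming a small made-up sample in place of the remote file (the names sample and raw, the shortened column set and the values are illustrative only; with the real file, the stream would come from OpenRead as in the question):

```mathematica
(* Made-up sample standing in for the remote CSV, shortened to six columns *)
sample = StringRiffle[{
    "VendorID,tpep_pickup_datetime,passenger_count,trip_distance,store_and_fwd_flag,fare_amount",
    "1,2019-03-01 00:24:41,1,2.50,N,9.5",
    "2,2019-03-01 00:25:27,2,0.73,N,5.0",
    "1,2019-03-01 00:05:21,1,14.10,N,38.0",
    "2,2019-03-01 00:48:55,1,1.20,Y,6.5"}, "\n"];

(* Treat the string as a stream so that ReadList can be used on it *)
str = StringToStream[sample];

(* Record reads one line at a time; the third argument caps the number of records read *)
raw = ReadList[str, Record, 5];
Close[str];

(* Each record is a single string: the header line plus four data lines *)
Length[raw]
```

Every element of raw is a plain string at this point, which is why the later steps have to split and convert the fields.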
Now we extract the header and the data and split them into the corresponding table elements:
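One way this split could look, assuming the records are comma-separated strings (raw and header are hypothetical names, and the sample rows are made up):

```mathematica
(* Made-up records: the first one is the header line *)
raw = {
   "VendorID,tpep_pickup_datetime,passenger_count,fare_amount",
   "1,2019-03-01 00:24:41,1,9.5",
   "2,2019-03-01 00:25:27,2,5.0"};

(* Column names from the first record *)
header = StringSplit[First[raw], ","];

(* Remaining records, each split into its fields *)
dat = StringSplit[#, ","] & /@ Rest[raw];

(* Every field is still a string here *)
dat[[1]]
```

By default StringSplit drops empty strings at the start or end of a record; passing All as a third argument keeps them, which matters if the last column can be empty.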
The rows of dat consist of strings that represent a mixture of numbers and strings. To transform the strings representing numbers we need to pick out the corresponding columns:
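A minimal sketch of picking the numeric columns by name and converting them, using made-up data (header, numericNames, cols and converted are hypothetical names). ToExpression is used for parsing here as an assumption; it evaluates whatever the string contains, so it is only suitable for trusted input:

```mathematica
(* Made-up header and already-split rows *)
header = {"VendorID", "tpep_pickup_datetime", "passenger_count", "fare_amount"};
dat = {
   {"1", "2019-03-01 00:24:41", "1", "9.5"},
   {"2", "2019-03-01 00:25:27", "2", "5.0"}};

(* Names of the columns that hold numbers *)
numericNames = {"VendorID", "passenger_count", "fare_amount"};

(* Column positions of those names in the header *)
cols = Flatten[Position[header, #] & /@ numericNames];

(* Parse each selected field and force it to a machine real with N *)
converted = dat;
converted[[All, cols]] = Map[N[ToExpression[#]] &, dat[[All, cols]], {2}];

converted
```

The date column is left untouched as a string; only the positions listed in cols are rewritten.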
To get the final table we prepend the header to the data:
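For illustration, with a made-up two-column table (header, converted and table are hypothetical names):

```mathematica
(* Made-up header and converted rows *)
header = {"VendorID", "fare_amount"};
converted = {{1., 9.5}, {2., 5.}};

(* Put the header back as the first row *)
table = Prepend[converted, header];

(* Grid is only for display; table itself stays a nested list *)
Grid[table, Frame -> All]
```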
If in addition you want also to change the date strings into numeric lists, you may say:
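A possible sketch using DateList with an explicit format, assuming the timestamps look like "2019-03-01 00:24:41" (row, dateCols, fmt and withDates are hypothetical names, and the row is made up):

```mathematica
(* Made-up row with two timestamp fields in columns 2 and 3 *)
row = {"1", "2019-03-01 00:24:41", "2019-03-01 00:31:04", "1"};
dateCols = {2, 3};

(* Explicit description of the timestamp layout, separators included *)
fmt = {"Year", "-", "Month", "-", "Day", " ", "Hour", ":", "Minute", ":", "Second"};

(* Replace each date string by a list of the form {year, month, day, hour, minute, second} *)
withDates = MapAt[DateList[{#, fmt}] &, row, List /@ dateCols];

withDates
```

Giving the format explicitly avoids relying on automatic date parsing, which can be slow or ambiguous on large files.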
The third argument according to documentation is for:
This code reads 3 records (the first row holds the column names):
The output is a list of strings; use SemanticImportString to convert the strings to their data types (Reals, ...).
SemanticImportString needs a string with \n as the record separator.
Use StringRiffle to concatenate the records with \n.
You can also use the "Dataset" form instead.
The output is a list of records with different types (String, Reals, ...). See the SemanticImport documentation to read more about other output forms.
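The approach described above might be sketched as follows, on made-up records (raw, csv, rows and ds are hypothetical names). The choice of "Rows" as the first output form is an assumption for illustration, not necessarily the form used originally:

```mathematica
(* Made-up records as they would come back from ReadList[..., Record, 3] *)
raw = {
   "VendorID,tpep_pickup_datetime,passenger_count,fare_amount",
   "1,2019-03-01 00:24:41,1,9.5",
   "2,2019-03-01 00:25:27,2,5.0"};

(* Join the records into one string with \n as the record separator *)
csv = StringRiffle[raw, "\n"];

(* Let SemanticImportString infer a type for every column *)
rows = SemanticImportString[csv, Automatic, "Rows", Delimiters -> ","];

(* Same import, returned as a Dataset *)
ds = SemanticImportString[csv, Automatic, "Dataset", Delimiters -> ","];
```

Automatic type detection may keep whole numbers as integers and turn the timestamps into date objects, so applying N to the numeric columns afterwards may still be needed if reals are required throughout.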