Converting Strings to Reals in a List
I am new to Mathematica and working with streamed data:
str = OpenRead["https://s3.amazonaws.com/nyc-tlc/trip+data/yellow_tripdata_2019-03.csv"]
memory = ReadList[str, Record, 500];
I'd now like to convert the records from the streamed file into a new list in which all numbers are reals. It seems that every entry is stored as a string by default. Is that correct? In particular, I need to convert the following variables to reals: VendorID, passenger_count, trip_distance, RatecodeID, PULocationID, DOLocationID, payment_type, fare_amount, extra, mta_tax, tip_amount, tolls_amount, improvement_surcharge, total_amount and congestion_surcharge.
Unfortunately, I am struggling to do this - any help would be greatly appreciated!
2 Answers
For a test, I only read the first 5 records:
Now we extract the header and the data and split them into the corresponding table elements:
The rows of dat consist of strings that represent a mixture of numbers and strings. To transform the strings representing numbers, we need to pick out the corresponding columns:
To get the final table we prepend the header to the data:
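A minimal sketch of these steps, assuming a small made-up sample in place of the streamed file. The variable names records, table, header, numericCols and final, the shortened header and the chosen column positions are illustrative assumptions, not taken from the answer. ToExpression is one possible way to parse the numeric strings; it evaluates whatever it is given, so it should only be used on trusted data.

```mathematica
(* Hypothetical sample: a header line plus two records, in the form
   ReadList[str, Record, n] would return them (one string per line) *)
records = {
   "VendorID,tpep_pickup_datetime,passenger_count,trip_distance,store_and_fwd_flag,fare_amount",
   "1,2019-03-01 00:24:41,1,0.00,N,3",
   "2,2019-03-01 00:25:27,2,14.12,N,52.5"
   };

(* Split every record at the commas; StringSplit threads over the list *)
table = StringSplit[records, ","];
header = First[table];
dat = Rest[table];

(* Positions of the numeric columns, chosen by hand for this sample *)
numericCols = {1, 3, 4, 6};

(* Parse only those columns and force machine reals with N *)
dat[[All, numericCols]] =
  Map[N[ToExpression[#]] &, dat[[All, numericCols]], {2}];

(* Prepend the header to obtain the final table *)
final = Prepend[dat, header];
```

Note that StringSplit drops empty fields at the end of a record, so rows with a missing last value would come out shorter than the header and need separate handling.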
If you also want to convert the date strings into numeric lists, you can use:
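One way to do this, sketched under the assumption that the dates look like "2019-03-01 00:24:41" and sit in the second column, is DateList with an explicit format. The two-column sample below is made up for illustration.

```mathematica
(* Hypothetical two-column sample; the second column holds the date strings *)
dat = {{"1", "2019-03-01 00:24:41"}, {"2", "2019-03-01 00:25:27"}};

(* Explicit element list describing the layout of the date strings *)
fmt = {"Year", "-", "Month", "-", "Day", " ", "Hour", ":", "Minute", ":", "Second"};

(* Replace each date string by a list {year, month, day, hour, minute, second} *)
dat[[All, 2]] = DateList[{#, fmt}] & /@ dat[[All, 2]];
```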
Or with DateObject:
The third argument, according to the documentation, is for:
This code runs on 3 records (the first row holds the column names):
The output is a list of strings. Use SemanticImportString to convert the strings to their data types (Reals, ...). SemanticImportString needs a string with \n as the record separator. Use StringRiffle to concatenate the records with \n in between.
Output:
Using the "Dataset" form instead of "Rows":
Output:
The output is a list of records with different types (String, Reals, ...). See the SemanticImport documentation to read more about other output forms.
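The approach described in this answer can be sketched as follows. The sample records, the shortened header and the names csv, rows and ds are assumptions made for illustration. HeaderLines -> 1 is set explicitly here so that the first line is treated as column names rather than relying on automatic detection.

```mathematica
(* Hypothetical sample: a header line plus two records, as returned by
   ReadList[str, Record, n] *)
records = {
   "VendorID,passenger_count,trip_distance,store_and_fwd_flag,fare_amount",
   "1,1,0.00,N,3.0",
   "2,2,14.12,N,52.5"
   };

(* Join the records into one string with "\n" between them *)
csv = StringRiffle[records, "\n"];

(* Let SemanticImportString infer a type for each column *)
rows = SemanticImportString[csv, Automatic, "Rows", HeaderLines -> 1];

(* The same data as a Dataset with named columns *)
ds = SemanticImportString[csv, Automatic, "Dataset", HeaderLines -> 1];
```

With Automatic type detection, integer-valued columns such as VendorID will most likely be imported as integers rather than reals. If reals are required throughout, the numeric columns can be wrapped in N afterwards.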