# Converting Strings to Reals in a List

I am new to Mathematica and working with streamed data:

`````` str = OpenRead["https://s3.amazonaws.com/nyc-tlc/trip+data/yellow_tripdata_2019-03.csv"]
``````

I'd now like to convert the records from the streamed file into a new list, by converting all numbers to reals. It seems like the default storage method of the data is string for any entry. Correct? In particular, I would need to convert the following variables to reals: Vendor ID, passenger_count, trip_distance, RatecodeID, PULocationID, DOLocationID, payment_type, fare_amount, extra, mta_tax, tip_amount, tolls_amount, improvement_surcharge, total_amount and congestion_surcharge.

Unfortunately, I am struggling to do this - any help would be greatly appreciated!

• For a test, I only read the first 5 records:

``````str = OpenRead[
"https://s3.amazonaws.com/nyc-tlc/trip+data/yellow_tripdata_2019-03.csv"]
``````

Now we extract the header and the data and split them into the corresponding table elements:

``````header = StringSplit[mem[[1]], ","]
dat = StringSplit[Rest[mem], ","];
``````

The rows of `dat` consists of strings that represent a mixture of numbers and strings. To transform the strings representing numbers we need to to pick out the corresponding columns:

``````(dat[[All, #]] = ToExpression[dat[[All, #]]]) & /@ {1, 4 ;; 6, 8 ;;}
``````

To get the final table we prepend the header to the data:

``````PrependTo[dat, header]
``````

If in addition you want also to change the date strings into numeric lists, you may say:

``````dat[[All, 2]] = DateList /@ dat[[All, 2]]
dat[[All, 3]] = DateList /@ dat[[All, 3]]
``````

Or with `DateObject`:

``````  dat[[All, 2]] = DateObject /@ dat[[All, 2]]
``````

• The third argument according to documentation is for:

reads only the first n objects of the specified types.

This code will run on 3 record (first row is column):

``````records = ReadList[str, Record, 3];
``````

the output is a list of strings, use `SemanticImportString` to convert strings to their data types (Reals, ...).`SemanticImportString` needs a string with `\n` as the record separator.

Use `StringRiffle` to concatenate records with `\n` in between.

``````SemanticImportString[#, Automatic, "Rows", Delimiters -> ","] &@
[email protected][records, "\n"]
``````

Output:

Using `"Dataset"` form instead of `"Rows"`:

``````SemanticImportString[#, Automatic, "Dataset", Delimiters -> ","] &@
[email protected][records, "\n"]
``````

Output:

The output is a list of records with different types (String, Reals, ...). See `SemanticImport` documentation to read more about other output forms.