Tuesday, 15 September 2015

r - speeding up as.POSIXct with large data / issue with storing as POSIXct in data.table


I am trying to read a large .csv (~11m rows) containing a list of POSIXct login times, then use the cut function to count the logins per 15-minute interval.

Given the size of the dataset, I am using the data.table package. I have managed to achieve my goal, but I ran into some problems along the way, described below:

# selective fread
dt <- fread("foo.csv", colClasses = list(NULL = c(1:5, 8:14), "POSIXct" = c(5, 6)))

Problem: I tried to store the two relevant columns as class POSIXct, but instead they are stored as class character:

> class(dt$login_datetime)
[1] "character"

I was still able to run the rest of my code by converting with as.POSIXct:

timelog <- dt[, 1, with = FALSE]
timelog <- timelog[, login_datetime := as.POSIXct(login_datetime)]
# cut() needs the POSIXct vector itself, not the whole data.table
tablet <- data.frame(table(cut(timelog$login_datetime, breaks = "15 min")))

However, the second line takes approximately 12 minutes to run on my machine. I need to process more datasets of a similar kind, and while 12 minutes is not terribly slow, I'm curious whether I can speed up the process (short of a hardware upgrade).
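For reference, the binning step on its own can be sketched on a handful of timestamps. This is only an illustration in base R; the vector `logins` and its four values are made up for the example and are not taken from the real file.

```r
# Hypothetical login times (assumption: not from the original dataset)
logins <- as.POSIXct(
  c("2015-09-15 10:00:00", "2015-09-15 10:07:30",
    "2015-09-15 10:16:00", "2015-09-15 10:44:59"),
  tz = "UTC"
)

# Assign each timestamp to the 15-minute interval it falls into
bins <- cut(logins, breaks = "15 min")

# Count the logins per interval; columns are `bins` and `Freq`
counts <- data.frame(table(bins))
print(counts)
```

The conversion to POSIXct has to happen before this step, which is why the cost of as.POSIXct dominates the run time on a large file.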

Specifically, I tried to get fread to store the relevant columns as POSIXct directly, but I was unable to find anything about POSIXct in the fread documentation. Can anyone tell me 1) whether I am using fread and colClasses = "POSIXct" incorrectly, or 2) whether there is any other code / package that would speed up converting a data.table column to POSIXct?

Thank you.

I suggest two options.

I assume you wrote the file with write.csv or similar, which converts the column from POSIXct to character on the way out. This is slow for both writing and reading, because POSIXct objects are actually stored as numbers (more precisely, they are seconds since the "epoch"). You can convert the column to numeric, then write it out, and convert it back to POSIXct after reading (which will be super fast).
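A minimal sketch of that numeric round trip, assuming you control how the file is written. It uses base R write.csv and read.csv only to stay self-contained; the column name `login_secs`, the sample values and the temporary file are hypothetical, and the time zone is assumed to be UTC.

```r
# Hypothetical login times (assumption: UTC, made up for the example)
logins <- as.POSIXct(
  c("2015-09-15 10:00:00", "2015-09-15 10:07:30"),
  tz = "UTC"
)

# Store the timestamps as plain seconds since 1970-01-01 00:00:00 UTC
out <- data.frame(login_secs = as.numeric(logins))
path <- tempfile(fileext = ".csv")
write.csv(out, path, row.names = FALSE)

# Reading numbers back needs no date parsing at all
back <- read.csv(path)

# Rebuild POSIXct from the numbers; this is a cheap class change
restored <- as.POSIXct(back$login_secs, origin = "1970-01-01", tz = "UTC")

# The restored values match the originals
stopifnot(all(as.numeric(restored) == as.numeric(logins)))
unlink(path)
```

The same idea should carry over to a numeric column read with fread, since the expensive part being avoided is parsing date strings.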

Another option, if you prefer to keep writing character columns, is to use fastPOSIXct from the fasttime package to speed up the conversion to POSIXct.
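A rough sketch of the second option, assuming the fasttime package is installed and the strings are in "YYYY-MM-DD HH:MM:SS" order. Note that fastPOSIXct always interprets its input as UTC/GMT; its tz argument only sets the time zone attached to the result. The sample strings are hypothetical, and the code skips the fasttime call when the package is missing.

```r
# Hypothetical timestamp strings (assumption: recorded in UTC)
stamps <- c("2015-09-15 10:00:00", "2015-09-15 10:07:30")

# Baseline: the base R conversion used in the question
base_parsed <- as.POSIXct(stamps, tz = "UTC")

if (requireNamespace("fasttime", quietly = TRUE)) {
  # fastPOSIXct parses the strings as UTC without going through strptime
  fast_parsed <- fasttime::fastPOSIXct(stamps, tz = "UTC")

  # Both routes should give the same instants for UTC input
  stopifnot(isTRUE(all.equal(as.numeric(fast_parsed),
                             as.numeric(base_parsed))))
} else {
  message("fasttime is not installed; install.packages(\"fasttime\") to try it")
}
```

Inside a data.table this would take the place of as.POSIXct in the `:=` assignment, provided the login times really are in UTC or the offset is corrected afterwards.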

