I am trying to import a CSV file into R to do fraud analysis with linear / fast forward regression. What should have been simple has become complicated... The data set has 26 variables and more than 2 million rows. I used this command to import the CSV file:
data <- read.csv('C:/Users/amartinezsistac/OneDrive/PROYECTO/decla_cata_filtrados.csv', header = TRUE, sep = ";")
However, R imported the 2.3 million rows into only 1 variable. I checked this with View(data) after this step.
I have tried switching from sep = ";" to sep = ",", using:
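data <- read.csv('C:/Users/amartinezsistac/OneDrive/PROYECTO/decla_cata_filtrados.csv', header = TRUE, sep = ",")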
but got this error message:
Error in read.table(file = file, header = header, sep = sep, quote = quote, : more columns than column names
I have also tried read.csv2 (same result: 2.3 million rows and 1 variable) and the fill = TRUE option (same result), but the import is still not correct.
Thanks in advance for any tips or help to fix it.
Break the problem down. Initially I would try something like:
file <- 'C:/Users/amartinezsistac/OneDrive/PROYECTO/decla_cata_filtrados.csv'
read.csv(file, header = FALSE, skip = 1, sep = ',', nrows = 1)
If that gives a data frame with one row and 26 columns, you are in business. If not, go through the arguments of read.csv again. Then try:
read.csv(file, header = TRUE, skip = 0, sep = ',', nrows = 1)
This should give you the same one-row data frame, but with the correct column names. If it does not, check that the header line of the CSV file has the correct number of columns; otherwise keep skipping the first row (the header) and assign the column names afterwards.
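The fallback of skipping the header and naming the columns by hand could look roughly like this. It is only a sketch: the temporary file stands in for the real CSV, and the column names are made up for illustration.

```r
# Toy file standing in for the real CSV (hypothetical contents):
# the header line has fewer names than the data rows have fields
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,amount", "1,10.5,A", "2,20.0,B"), tmp)

# Skip the header line and read only the data rows
df <- read.csv(tmp, header = FALSE, skip = 1, sep = ",")

# Assign the column names by hand (names invented for this example)
names(df) <- c("id", "amount", "category")
str(df)
```

For the real file you would pass your own path and a vector of 26 names instead.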
Now increase nrows, initially to 10, then perhaps by factors of 10, until you either read the entire file or hit a problem. Once you hit a problem, narrow nrows down until you find the exact problem line.
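The search over nrows can be scripted instead of done by hand. The sketch below uses a toy file and a hypothetical helper, reads_ok, that only checks whether read.csv raises an error. Keep in mind that read.csv uses fill = TRUE by default, so some malformed rows may be read without any error; it is worth checking the number of columns of the result as well.

```r
# Toy file standing in for the real CSV; data row 3 has too many fields
tmp <- tempfile(fileext = ".csv")
writeLines(c("a,b,c", "1,2,3", "4,5,6", "7,8,9,10,11", "12,13,14"), tmp)

# Hypothetical helper: TRUE if the first n data rows are read without an error
reads_ok <- function(path, n) {
  res <- try(read.csv(path, header = TRUE, sep = ",", nrows = n), silent = TRUE)
  !inherits(res, "try-error")
}

# Try increasing values of nrows and stop at the first one that fails
first_bad <- NA
for (n in 1:4) {
  if (!reads_ok(tmp, n)) {
    first_bad <- n
    break
  }
}
first_bad
```

On a 2-million-row file you would grow n in bigger steps (10, 100, 1000, ...) and then search between the last good and the first bad value.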
Look at the CSV in Excel and see what is special about this line: a strange character, an unmatched quote, fewer entries... This will determine how you deal with the problem.
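As an alternative to opening the file by eye, count.fields from the utils package reports how many fields each line has, which may point at the odd lines directly. This is a sketch on a toy file; the separator and quote settings are assumptions and should match whatever the real file uses.

```r
# Toy file standing in for the real CSV; line 3 is missing a field
tmp <- tempfile(fileext = ".csv")
writeLines(c("a,b,c", "1,2,3", "4,5", "6,7,8"), tmp)

# Number of fields on each line, header included
# (comment.char = "" and blank.lines.skip = FALSE keep line numbers aligned with the file)
n_fields <- count.fields(tmp, sep = ",", quote = "\"",
                         blank.lines.skip = FALSE, comment.char = "")

# Lines whose field count differs from the header's, or is NA
# (count.fields gives NA for lines swallowed by a quoted field spanning several lines)
bad_lines <- which(is.na(n_fields) | n_fields != n_fields[1])
bad_lines

# Show the raw text of the suspicious lines
readLines(tmp)[bad_lines]
```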
Repeat until your entire file is read!