Friday 15 February 2013

string - Using grep, grepl and regexpr within loops in R


I want to automate the extraction of some information from text files using grep, grepl and regexpr. I have code that works when I run it on each file individually, but I cannot get a loop to work that automates the process for all the files in my working directory.

I am reading the files in as strings because of the structure of the data in the text files. The loop repeats multiple times through the first file, apparently governed by the number of files given by the length(txtfiles) command in the for statement:

  txtfiles = list.files(pattern = "*.txt")
  for (i in 1:length(txtfiles)) {
    all_data <- readLines(txtfiles[i])
    hours_op[i] <- all_data[hours_of_operation <- grep("Annual Hours of Operation:", all_data)]
    hours_op[i] <- regmatches(hours_op, regexpr("[0-9]{1,9}.[0-9]{1,9}", hours_op))
  }

I would be grateful for a pointer in the right direction on how to repeat this routine for each file, so that I end up with a list of file names and their corresponding hours_op values instead of the same file's value repeated many times.

You should either add an index ([i]) to each of your references to hours_op inside the for (i in 1:length(txtfiles)) loop, as in:

  for (i in 1:length(txtfiles)) {
    all_data <- readLines(txtfiles[i])
    hours_op[i] <- all_data[hours_of_operation <- grep("Annual Hours of Operation:", all_data)]
    # Index both references so that only the current file's line is searched
    hours_op[i] <- regmatches(hours_op[i], regexpr("[0-9]{1,9}.[0-9]{1,9}", hours_op[i]))
  }
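To illustrate why the index matters, here is a minimal sketch that replays the unindexed loop on two made-up strings instead of real files. The sample lines and the name `sample_lines` are assumptions for illustration only. `regmatches()` returns one match per matching element of the whole `hours_op` vector, and assigning that multi-element result to the single slot `hours_op[i]` keeps only its first element, which is the first file's value.

```r
# Made-up stand-ins for the matching line of two different files.
sample_lines <- c("Annual Hours of Operation: 8760.00",
                  "Annual Hours of Operation: 4380.50")
pattern <- "[0-9]{1,9}.[0-9]{1,9}"

hours_op <- character(0)
for (i in seq_along(sample_lines)) {
  hours_op[i] <- sample_lines[i]
  # Unindexed call: searches every element of hours_op, so from the second
  # iteration on it returns more than one match. R warns about the length
  # mismatch and stores only the first match (suppressed here for clarity).
  suppressWarnings(
    hours_op[i] <- regmatches(hours_op, regexpr(pattern, hours_op))
  )
}

hours_op
# Both slots hold "8760.00": the first value has been repeated.
```

With `hours_op[i]` in both places inside `regmatches()`, each iteration searches only the current line.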

Or better yet, use a temporary variable:

  for (i in 1:length(txtfiles)) {
    all_data <- readLines(txtfiles[i])
    # temp holds only the current file's matching line
    temp <- all_data[hours_of_operation <- grep("Annual Hours of Operation:", all_data)]
    hours_op[i] <- regmatches(temp, regexpr("[0-9]{1,9}.[0-9]{1,9}", temp))
  }
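For anyone who wants to try the temporary-variable approach end to end, the following is a self-contained sketch under a few stated assumptions. The two sample files, their contents and the directory `demo_dir` are invented for the demonstration. The result vector is pre-allocated and named by file so it can be read as a file-to-value lookup. Escaping the dot in the pattern and using `fixed = TRUE` are small tightenings not present in the original code.

```r
# Hypothetical sample files in a temporary directory, so the sketch runs
# without touching real data.
demo_dir <- tempfile("hours_demo")
dir.create(demo_dir)
writeLines(c("Site: A", "Annual Hours of Operation: 8760.00"),
           file.path(demo_dir, "site_a.txt"))
writeLines(c("Site: B", "Annual Hours of Operation: 4380.50"),
           file.path(demo_dir, "site_b.txt"))

# list.files() takes a regular expression, not a shell glob.
txtfiles <- list.files(demo_dir, pattern = "\\.txt$", full.names = TRUE)

# Pre-allocate the result and name each slot after its file.
hours_op <- rep(NA_character_, length(txtfiles))
names(hours_op) <- basename(txtfiles)

for (i in seq_along(txtfiles)) {
  all_data <- readLines(txtfiles[i])
  # value = TRUE returns the matching line itself rather than its position.
  temp <- grep("Annual Hours of Operation:", all_data,
               fixed = TRUE, value = TRUE)
  # "\\." matches a literal decimal point (assumed to be the intent).
  found <- regmatches(temp, regexpr("[0-9]{1,9}\\.[0-9]{1,9}", temp))
  # Leave NA in place when a file has no matching line.
  if (length(found) > 0) hours_op[i] <- found[1]
}

hours_op
```

Using `seq_along(txtfiles)` instead of `1:length(txtfiles)` avoids a spurious iteration when no files are found, since `1:0` counts down from 1 to 0.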
