
    <p>Below is a suggestion for speeding this up. I'm using a slightly more general example so that others can reproduce it easily.</p>
<pre><code>library(XLConnect)

# *** Generate some dummy files ***
for(i in 1:10) {
  data = as.data.frame(matrix(rnorm(10000), ncol = 10))
  names(data) = LETTERS[1:10]
  writeWorksheetToFile(file = sprintf("test%s.xls", i), data = data,
                       sheet = "data", header = TRUE)
}

# *** Process files ***
# Get files to process (the dot is escaped so it matches a literal ".")
files = list.files(pattern = "^test[0-9]+\\.xls$")

# Read chunks of data from files and subset
data.negative = lapply(files, function(fl) {
  # Read data from file
  data = readWorksheetFromFile(file = fl, sheet = "data", header = TRUE)
  # Which rows have all values &lt; 0
  idx = apply(data, 1, function(x) all(x &lt; 0))
  data[idx,]
})

# How many all-negative rows does each chunk contribute?
nrows = sapply(data.negative, nrow)

# Combine data.negative into one data.frame
data.negative = do.call(rbind, data.negative)

# For each row, record which file it came from
data.negative$File = rep(files, times = nrows)

# Write output file
write.table(data.negative, file = "neg_val.txt", sep = "\t", quote = FALSE)
</code></pre>
    <p>The idea is to avoid calling rbind on the data.frames one after another, which gets slow as the data.frames grow. In your case I would suggest doing the read and subset via lapply and then combining the subsets in a single step before writing to a file. Also note that you can easily switch the lapply to e.g. plyr's llply and hook a parallel backend to it to parallelize the task (however, your disk might become a bottleneck if you attempt many parallel reads).</p>
    <p>Hope that helps.</p>
    <p>Best regards, Martin</p>
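To make the "collect first, bind once" idea concrete without needing XLConnect or any files on disk, here is a minimal base-R sketch. The helpers `read_chunk` and `keep_negative`, the `Source` column, and the seed values are hypothetical stand-ins introduced only for illustration; `read_chunk` simulates reading one file by generating random data. It contrasts growing a data.frame inside a loop with collecting the subsets in a list and binding them in one call.

```r
# Hypothetical stand-in for reading one file: returns a data.frame of
# random values (assumption; the answer above reads .xls sheets instead).
read_chunk <- function(seed, n = 200, k = 3) {
  set.seed(seed)
  df <- as.data.frame(matrix(rnorm(n * k), ncol = k))
  names(df) <- LETTERS[seq_len(k)]
  df
}

# Keep only the rows in which every value is negative.
keep_negative <- function(df) {
  idx <- apply(df, 1, function(x) all(x < 0))
  df[idx, , drop = FALSE]
}

seeds <- 1:5

# Approach A: grow the result with rbind inside the loop.
# Each iteration copies everything accumulated so far.
grown <- NULL
for (s in seeds) {
  grown <- rbind(grown, keep_negative(read_chunk(s)))
}

# Approach B: collect the subsets in a list, then bind once.
chunks <- lapply(seeds, function(s) keep_negative(read_chunk(s)))
combined <- do.call(rbind, chunks)

# Record which source each row came from.
combined$Source <- rep(seeds, times = sapply(chunks, nrow))
```

Both approaches should yield the same rows in the same order; the second simply avoids re-copying the accumulated result on every iteration, which is where the loop version tends to lose time as the number of chunks grows.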