Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    primarykey
    data
    text
    <p>I don't have a full answer to why the times for lapply are not linear, but notice that you are using lapply as a glorified loop by applying over a list of variable names (and then using these names to index the actual values) rather than applying over a list of values. I'm not sure how to solve this using lapply, but I can offer an alternative that is (roughly) a liner function of the number of variables. </p> <p>Start by creating the example data:</p> <pre><code>Genedata &lt;- structure(list(time = c(120, 120, 28.96, 119.21, 59.53, 68.81, 82.29, 110.82, 65.88, 84.13, 16.47, 89.75, 76.07, 67.82, 52.24, 64.1, 55.13, 57.79, 50.1, 43.79, 30.27, 3.64, 36.59, 20.02, 118.58, 55.33, 120, 120, 120, 106.94, 11.7, 28.79, 26.82, 52.5, 24.03, 88.99, 102.44, 33.73, 85.28, 26.53, 37.31, 86.43, 55.92, 70.65, 76.04, 79.13, 83.99, 80.25, 40.6, 95.37, 31.06, 37.31, 22.02, 63.25, 34.09, 52.14, 66.04, 59.96, 47.86, 58.45, 45.99, 60.42, 37.67, 15.71, 39.25, 49.87, 25.24, 44.97, 35.9, 26.66, 36.78, 41.52, 22.22, 33.2, 39.68, 25.61, 83.99, 15.05, 8.38, 18.18, 27.15, 24.26, 105.17, 11.76, 70.45, 95.07, 112.33, 27.51, 45.92, 54.04, 103.98, 6.11, 99.51, 20.44, 76.6, 10.02, 41.45, 96.26, 85.61, 78.87, 22.25, 74.53, 59.07, 47.8, 5.68, 79.39, 74.07, 50.76, 67.82, 70.84, 50.59, 34.58, 38.72, 54.9, 67.92, 58.45, 59.34, 54.54, 19.03, 26.26, 52.86, 32.05, 55.95, 56.67, 50.43, 4.24, 49.11, 49.57, 50.49, 63.05, 49.24, 0.92, 31.36, 40.3, 116.64, 31.92, 19.98, 24.62, 18.54, 120, 17.62, 5.32, 2.36, 5.72, 15.28, 95.07, 4.96, 28.89, 3.25, 0.92, 18.77, 57.73, 14.39, 5.12, 4.99, 33.17, 50.53, 5.72, 8.02, 34.02, 1.44, 36.95, 60.75, 37.44, 23.07, 19.85, 31.85, 8.61, 42.27, 15.25, 14.56, 9.96, 8.94, 32.67, 2.07, 22.78, 22.35), cens = c(0L, 0L, NA, 0L, 0L, 1L, 0L, 0L, NA, 0L, NA, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, NA, 1L, 1L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, NA, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, NA, 1L, 0L, 1L, 1L, 0L, 0L, 1L, 1L, 1L, 0L, 1L, 0L, NA, 1L, 1L, 1L, 0L, 0L, 0L, NA, 0L, NA, NA, 1L, 0L, 0L, 1L, 1L, 0L, 0L, 1L, NA, NA, 0L, 0L, 0L, 0L, 1L, NA, 0L, 1L, 0L, 0L, 0L, NA, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 1L, 1L, 1L, 1L, 1L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, NA, 1L, 1L, 1L, 1L, NA, 0L, NA, 1L, 1L, 0L, 1L, 0L, 1L, 1L, NA, 1L, 0L, 1L, 1L, 0L)), .Names = c("time", "cens"), class = "data.frame", row.names = c(NA, -177L)) data &lt;- replicate(10000, abs(rnorm(177))) Genedata &lt;- data.frame(Genedata, data) </code></pre> <p>My approach is the reshape the data using the <code>melt</code> function in the reshape2 package, then use <code>dlply</code> from the plyr package to run the model on each subset.</p> <pre><code>ibrary(maxstat) library(plyr) library(reshape2) system.time({ dat.m &lt;- melt(Genedata, id.vars=1:3) maxstat.estimate &lt;- dlply(dat.m, .(variable), function(DF) { maxstat.test(Surv(time, cens) ~ value, smethod="LogRank", pmethod="Lau94", data=DF)})}) </code></pre> <p>Some timings from this approach are as follows:</p> <pre><code>## 1000 variables ## user system elapsed ## 11.613 0.090 11.780 ## 2000 variables ## user system elapsed ## 23.847 0.060 24.043 ## 3000 variables ## user system elapsed ## 35.316 0.093 35.905 ## 10000 variables ## user system elapsed ## 120.654 0.594 122.140 </code></pre> <p>showing that system time is roughly a linear function of the number of variables.</p> <p>The full 43000 variables takes just over 10 minutes, just enough time for a cup of coffee:</p> <pre><code>## 43000 ## user system elapsed ## 612.587 3.134 630.464 </code></pre>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. VO
      singulars
      1. This table or related slice is empty.
    2. VO
      singulars
      1. This table or related slice is empty.
    3. VO
      singulars
      1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload