
Paired t-test crashes apply-loop (edited)
<p><strong>In response to the helpful comments, I have edited the original question (where I had assumed that a for-loop and an apply-loop give different results).</strong></p>
<p>I am using R to run a large number of two-group t-tests, using input from a delimited table. Following recommendations from here and elsewhere, I tried both a 'for' loop and 'apply' to accomplish that. For a 'normal' t.test, both work nicely and give the same results. However, for a paired t-test, the for-loop appears to work while the apply-loop does not. Later, I found out that both loops suffer from the same problem (see below), but the for-loop deals more gracefully with the situation (only one cycle of the loop returns an invalid result) while the apply-loop fails altogether.</p>
<p>My input file looks like this (the first line is a header line; each data line has a name, 4 data points for group 1 and 4 data points for group 2):</p>
<pre><code>header g1.1 g1.2 g1.3 g1.4 g2.1 g2.2 g2.3 g2.4
name1 0 0.5 -0.2 -0.2 -0.1 0.4 -0.3 -0.3
name2 23.2 24.4 24.5 27.2 15.5 16.5 17.7 20.0
name3 .....
</code></pre>
<p>and so on (overall ~50000 lines). The first data line (starting with name1) turned out to be the culprit.</p>
<p>This is the for-loop version that works better (it fails on the problematic line but deals correctly with all other lines):</p>
<pre><code>table &lt;- read.table('ttest_in.txt', header=TRUE, sep='\t')
pv &lt;- numeric(nrow(table))  # one p-value per data line
for (i in 1:nrow(table)) {
  g1 &lt;- as.numeric(table[i, 2:5])  # group 1
  g2 &lt;- as.numeric(table[i, 6:9])  # group 2
  pv[i] &lt;- t.test(g1, g2, paired=TRUE)$p.value
}
</code></pre>
<p>This is the 'apply' version that causes problems:</p>
<pre><code>table &lt;- read.table('ttest_in.txt', header=TRUE, sep='\t')
pv.list &lt;- apply(table[, 2:9], 1, function(x) {
  t.test(x[1:4], x[5:8], paired=TRUE)$p.value
})
</code></pre>
<p>One of the ~50000 data lines is problematic in that the differences of all pairwise comparisons are identical. In a paired t-test this means the differences have zero variance, so the t statistic is undefined (the p-value would be essentially zero). The apply-loop crashes with the error 'data are essentially constant'. To me (as an R newbie) it does not seem to be a good idea to crash the entire script just because t.test doesn't like one piece of data. In the for-loop, this data line also results in an error message, but the loop continues and all the other t-tests give correct results.</p>
<p>Did I do something fundamentally wrong? This behaviour essentially prohibits the use of apply-loops for this kind of batch analysis. Or is there a standard way to circumvent this problem? Why doesn't the t-test just return something invalid for that particular p-value instead of bailing out?</p>
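<p>For illustration, one possible way to keep the apply-loop running is to catch the error per row and return NA for that row. This is only a minimal sketch: the helper name safe_paired_p and the small matrix dat are hypothetical stand-ins (the values are copied from the two example lines above), and it assumes that an NA p-value is an acceptable marker for a line that t.test rejects.</p>

```r
# Hypothetical helper (not part of the original script): returns NA instead of
# stopping when t.test() signals an error for a row.
safe_paired_p <- function(x) {
  tryCatch(
    t.test(x[1:4], x[5:8], paired = TRUE)$p.value,
    error = function(e) NA_real_  # e.g. "data are essentially constant"
  )
}

# Small stand-in for the real input table, using the two example data lines.
dat <- rbind(
  name1 = c(0, 0.5, -0.2, -0.2, -0.1, 0.4, -0.3, -0.3),
  name2 = c(23.2, 24.4, 24.5, 27.2, 15.5, 16.5, 17.7, 20.0)
)

# One p-value per row; rows for which t.test() fails yield NA.
pv.list <- apply(dat, 1, safe_paired_p)
print(pv.list)
```

<p>With the real data, the same helper could be passed to apply(table[, 2:9], 1, safe_paired_p); the NA entries then identify the lines that need a closer look.</p>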