Note that there are some explanatory texts on larger screens.

plurals
  1. POHow to do a basic left outer join with data.table in R?
    primarykey
    data
    text
    <p>I have a data.table of a and b that I've partitioned into <code>below</code> with b &lt; .5 and <code>above</code> with b > .5:</p> <pre><code>DT = data.table(a=as.integer(c(1,1,2,2,3,3)), b=c(0,0,0,1,1,1)) above = DT[DT$b &gt; .5] below = DT[DT$b &lt; .5, list(a=a)] </code></pre> <p>I'd like to do a left outer join between <code>above</code> and <code>below</code>: for each <code>a</code> in <code>above</code>, count the number of rows in <code>below</code>. This is equivalent to the following in SQL:</p> <pre><code>with dt as (select 1 as a, 0 as b union select 1, 0 union select 2, 0 union select 2, 1 union select 3, 1 union select 3, 1), above as (select a, b from dt where b &gt; .5), below as (select a, b from dt where b &lt; .5) select above.a, count(below.a) from above left outer join below on (above.a = below.a) group by above.a; a | count ---+------- 3 | 0 2 | 1 (2 rows) </code></pre> <p>How do I accomplish the same thing with data.tables? This is what I tried so far:</p> <pre><code>&gt; key(below) = 'a' &gt; below[above, list(count=length(b))] a count [1,] 2 1 [2,] 3 1 [3,] 3 1 &gt; below[above, list(count=length(b)), by=a] Error in eval(expr, envir, enclos) : object 'b' not found &gt; below[, list(count=length(a)), by=a][above] a count b [1,] 2 1 1 [2,] 3 NA 1 [3,] 3 NA 1 </code></pre> <p>I should also be more specific in that I already tried <code>merge</code> but that blows through the memory on my system (and the dataset takes only about 20% of my memory).</p>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload