<p>This join does indeed seem to be misspecified. In general, I think a self join of a table with a single-column key is probably always misspecified. Consider the following example:</p>
<pre><code>&gt; DT
   A B
1: 1 5
2: 1 6
3: 1 7
4: 2 8
5: 2 9
&gt; setkey(DT,A)
</code></pre>
<p>There are 2 unique values of A (1 and 2), but they are repeated in the A column. The key is a single column.</p>
<pre><code>&gt; DT[DT]  # the long error message

&gt; DT[DT, allow.cartesian=TRUE]  # **each row** of DT is self joined to DT
    A B B.1
 1: 1 5   5
 2: 1 6   5
 3: 1 7   5
 4: 1 5   6
 5: 1 6   6
 6: 1 7   6
 7: 1 5   7
 8: 1 6   7
 9: 1 7   7
10: 2 8   8
11: 2 9   8
12: 2 8   9
13: 2 9   9
</code></pre>
<p>Is this really the result you need? More likely, the query needs to be changed by adding more columns to the key, doing a <code>by</code> instead, not doing a self join, or following the hints in the error message.</p>
<p>More information about what you need to achieve after the merge (the bigger picture) is likely to help.</p>
<h2>example of "including j and dropping by (by-without-by) so that j runs for each group to avoid the large allocation" (see error message in question)</h2>
<p>The example now in the question (covariance) is normally done with matrices rather than data.table. But proceeding anyway to quickly demonstrate, just using it as example data ...</p>
<pre><code>require(data.table)

country = fread("
Country Product
1 5
1 6
1 7
2 6
2 7
2 8
2 9
")
prod = fread("
Prod1 Prod2 Covariance
5 5 .4
5 6 .5
5 7 .6
5 8 -.3
5 9 -.1
6 6 .3
6 7 .2
6 8 .4
6 9 -.2
7 7 .2
7 8 .1
7 9 .3
8 8 .1
8 9 .6
9 9 .01
")
</code></pre>
<p>.</p>
<pre><code>country
   Country Product
1:       1       5
2:       1       6
3:       1       7
4:       2       6
5:       2       7
6:       2       8
7:       2       9
prod
    Prod1 Prod2 Covariance
 1:     5     5       0.40
 2:     5     6       0.50
 3:     5     7       0.60
 4:     5     8      -0.30
 5:     5     9      -0.10
 6:     6     6       0.30
 7:     6     7       0.20
 8:     6     8       0.40
 9:     6     9      -0.20
10:     7     7       0.20
11:     7     8       0.10
12:     7     9       0.30
13:     8     8       0.10
14:     8     9       0.60
15:     9     9       0.01
</code></pre>
<p>.</p>
<pre><code>setkey(country,Country)
country[country,{print(.SD);print(i.Product)}]
# included j to demonstrate j running for each row of i. Just printing to demo.
   Product
1:       5
2:       6
3:       7
[1] 5
   Product
1:       5
2:       6
3:       7
[1] 6
   Product
1:       5
2:       6
3:       7
[1] 7
   Product
1:       6
2:       7
3:       8
4:       9
[1] 6
   Product
1:       6
2:       7
3:       8
4:       9
[1] 7
   Product
1:       6
2:       7
3:       8
4:       9
[1] 8
   Product
1:       6
2:       7
3:       8
4:       9
[1] 9
Empty data.table (0 rows) of 2 cols: Country,Product
</code></pre>
<p>.</p>
<pre><code>setkey(prod,Prod1,Prod2)
country[country,prod[J(i.Product,Product),Covariance,nomatch=0]]
    Country Prod1 Prod2 Covariance
 1:       1     5     5       0.40
 2:       1     5     6       0.50
 3:       1     5     7       0.60
 4:       1     6     6       0.30
 5:       1     6     7       0.20
 6:       1     7     7       0.20
 7:       2     6     6       0.30
 8:       2     6     7       0.20
 9:       2     6     8       0.40
10:       2     6     9      -0.20
11:       2     7     7       0.20
12:       2     7     8       0.10
13:       2     7     9       0.30
14:       2     8     8       0.10
15:       2     8     9       0.60
16:       2     9     9       0.01

country[country, prod[J(i.Product,Product),Covariance,nomatch=0]][ ,mean(Covariance),by=Country]
   Country        V1
1:       1 0.3666667
2:       2 0.2010000
</code></pre>
<p>This doesn't match the desired result because the off-diagonal entries aren't doubled. But hopefully this is enough to demonstrate that particular suggestion in the error message in the question, and you can take it from here. Or use <code>matrix</code> rather than <code>data.table</code> for covariance-type work.</p>
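<p>For the off-diagonal doubling, here is a minimal sketch rather than a definitive solution. It assumes the desired quantity is the mean over each country's full symmetric covariance matrix, so every stored upper-triangle pair with <code>Prod1 != Prod2</code> counts twice. The names <code>pairs</code>, <code>w</code> and <code>MeanCov</code> are made up for illustration, and it uses an <code>on=</code> join instead of by-without-by.</p>

```r
library(data.table)

# Same example data as above, built directly instead of via fread().
country <- data.table(Country = c(1, 1, 1, 2, 2, 2, 2),
                      Product = c(5, 6, 7, 6, 7, 8, 9))
prod <- data.table(
  Prod1      = c(5, 5, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7, 8, 8, 9),
  Prod2      = c(5, 6, 7, 8, 9, 6, 7, 8, 9, 7, 8, 9, 8, 9, 9),
  Covariance = c(.4, .5, .6, -.3, -.1, .3, .2, .4, -.2, .2, .1, .3, .1, .6, .01)
)

# All product pairs within each country, built group by group so the
# cross product never spans countries.
pairs <- country[, CJ(Prod1 = Product, Prod2 = Product), by = Country]

# prod stores only the upper triangle (Prod1 <= Prod2), so keep those pairs.
pairs <- pairs[Prod1 <= Prod2]

# Look up each pair's covariance; every pair matches at most one row of prod.
res <- prod[pairs, on = .(Prod1, Prod2), nomatch = 0]

# Assumed weighting: diagonal entries count once, off-diagonal entries twice.
res[, w := 1 + (Prod1 != Prod2)]

ans <- res[, .(MeanCov = sum(w * Covariance) / sum(w)), keyby = Country]
print(ans)
```

<p>Building the pairs with <code>by = Country</code> keeps the intermediate result at the size of the per-country blocks, which is the same idea as the by-without-by suggestion in the error message.</p>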