<p><code>merge()</code> is a suitable tool for this job. Here is an example:</p>
<pre><code>set.seed(1)
d1 &lt;- data.frame(lat = 1:10, lon = 1:10, V2 = runif(10), V4 = rnorm(10))
d2 &lt;- data.frame(lat = 1:10, lon = 1:10, V1 = runif(10), V3 = rnorm(10))
## merge the data using `lat` and `lon`
res &lt;- merge(d1, d2, sort = FALSE) ## `sort = FALSE` stops R reordering rows
## get columns in right order
res &lt;- res[, c(1:2, order(colnames(res[, -(1:2)])) + 2)]
</code></pre>
<p>Which gives:</p>
<pre><code>&gt; res
   lat lon        V1         V2          V3         V4
1    1   1 0.4820801 0.26550866  0.91897737 -0.8204684
2    2   2 0.5995658 0.37212390  0.78213630  0.4874291
3    3   3 0.4935413 0.57285336  0.07456498  0.7383247
4    4   4 0.1862176 0.90820779 -1.98935170  0.5757814
5    5   5 0.8273733 0.20168193  0.61982575 -0.3053884
6    6   6 0.6684667 0.89838968 -0.05612874  1.5117812
7    7   7 0.7942399 0.94467527 -0.15579551  0.3898432
8    8   8 0.1079436 0.66079779 -1.47075238 -0.6212406
9    9   9 0.7237109 0.62911404 -0.47815006 -2.2146999
10  10  10 0.4112744 0.06178627  0.41794156  1.1249309
</code></pre>
<p>Update based on revised Q:</p>
<pre><code>## dummy data
set.seed(1)
df3 &lt;- data.frame(matrix(runif(60), ncol = 6))
names(df3) &lt;- paste("df3Var", 1:6, sep = "")
df3 &lt;- cbind.data.frame(lat = 1:10, lon = 1:10, df3)
df4 &lt;- data.frame(matrix(runif(30), ncol = 3))
names(df4) &lt;- paste("df4Var", 1:3, sep = "")
df4 &lt;- cbind.data.frame(lat = 1:10, lon = 1:10, df4)
## merge
res2 &lt;- merge(df3, df4, sort = FALSE)
</code></pre>
<p>This gives:</p>
<pre><code>&gt; head(res2)
  lat lon   df3Var1   df3Var2   df3Var3   df3Var4   df3Var5    df3Var6
1   1   1 0.2655087 0.2059746 0.9347052 0.4820801 0.8209463 0.47761962
2   2   2 0.3721239 0.1765568 0.2121425 0.5995658 0.6470602 0.86120948
3   3   3 0.5728534 0.6870228 0.6516738 0.4935413 0.7829328 0.43809711
4   4   4 0.9082078 0.3841037 0.1255551 0.1862176 0.5530363 0.24479728
5   5   5 0.2016819 0.7698414 0.2672207 0.8273733 0.5297196 0.07067905
6   6   6 0.8983897 0.4976992 0.3861141 0.6684667 0.7893562 0.09946616
    df4Var1   df4Var2   df4Var3
1 0.9128759 0.3390729 0.4346595
2 0.2936034 0.8394404 0.7125147
3 0.4590657 0.3466835 0.3999944
4 0.3323947 0.3337749 0.3253522
5 0.6508705 0.4763512 0.7570871
6 0.2580168 0.8921983 0.2026923
&gt; names(res2)
 [1] "lat"     "lon"     "df3Var1" "df3Var2" "df3Var3" "df3Var4" "df3Var5"
 [8] "df3Var6" "df4Var1" "df4Var2" "df4Var3"
</code></pre>
<p>Now note the column ordering. Suppose we want to interleave the variables, taking them in groups of 2 from <code>df3</code> followed by 1 variable from <code>df4</code>, and that the variables within each of <code>df3</code> and <code>df4</code> are already in the correct order among themselves. For this we need to create an index vector <code>ord</code> that is:</p>
<pre><code>&gt; ord
[1] 1 2 7 3 4 8 5 6 9
</code></pre>
<p>to which we then add <code>2</code> (to skip over the <code>lat</code> and <code>lon</code> columns in the merged data frame):</p>
<pre><code>&gt; ord + 2
[1]  3  4  9  5  6 10  7  8 11
</code></pre>
<p>To produce that sequence, we just need R's vectorised tools and a tiny bit of arithmetic.
I build the index up in two stages: i) first I work out where the columns <code>(1:6) + 2</code> of the merged data frame should be in <code>ord</code>, and then ii) I fill in the remaining spaces with the indexes in the merged data frame of the columns from the second data frame.</p>
<pre><code>ord &lt;- numeric(length = sum(ncol(df3), ncol(df4)) - 4)
ngrps &lt;- 3    ## number of groups
ningrps &lt;- 2  ## number of `df3` variables in each group
## i) positions in `ord` of the columns from `df3`;
##    each group spans `ningrps` columns from `df3` plus 1 from `df4`
want &lt;- rep(seq_len(ningrps), ngrps) +
    rep(seq(from = 0, by = ningrps + 1, length.out = ngrps), each = ningrps)
ord[want] &lt;- seq_len(prod(ngrps, ningrps))
## ii) positions in `ord` of the columns from `df4`
want &lt;- (ningrps + 1) * seq_len(ngrps)
ord[want] &lt;- seq(to = sum(ncol(df3), ncol(df4)) - 4, by = 1, length.out = ngrps)
res3 &lt;- res2[, c(1:2, ord+2)]
</code></pre>
<p>That gives:</p>
<pre><code>&gt; head(res3)
  lat lon   df3Var1   df3Var2   df4Var1   df3Var3   df3Var4   df4Var2   df3Var5
1   1   1 0.2655087 0.2059746 0.9128759 0.9347052 0.4820801 0.3390729 0.8209463
2   2   2 0.3721239 0.1765568 0.2936034 0.2121425 0.5995658 0.8394404 0.6470602
3   3   3 0.5728534 0.6870228 0.4590657 0.6516738 0.4935413 0.3466835 0.7829328
4   4   4 0.9082078 0.3841037 0.3323947 0.1255551 0.1862176 0.3337749 0.5530363
5   5   5 0.2016819 0.7698414 0.6508705 0.2672207 0.8273733 0.4763512 0.5297196
6   6   6 0.8983897 0.4976992 0.2580168 0.3861141 0.6684667 0.8921983 0.7893562
     df3Var6   df4Var3
1 0.47761962 0.4346595
2 0.86120948 0.7125147
3 0.43809711 0.3999944
4 0.24479728 0.3253522
5 0.07067905 0.7570871
6 0.09946616 0.2026923
</code></pre>
<p>which is the ordering you wanted. Now we can cook that into a little function:</p>
<pre><code>myMerge &lt;- function(x, y, ngrps, ningrps, ...) {
    out &lt;- merge(x, y, ...)
    ncols &lt;- ncol(out) - 2  ## columns other than `lat` and `lon`
    ord &lt;- numeric(length = ncols)
    ## positions of the columns from `x`; each group spans
    ## `ningrps` columns from `x` plus 1 column from `y`
    want &lt;- rep(seq_len(ningrps), ngrps) +
        rep(seq(from = 0, by = ningrps + 1, length.out = ngrps), each = ningrps)
    ord[want] &lt;- seq_len(prod(ngrps, ningrps))
    ## positions of the columns from `y`
    want &lt;- (ningrps + 1) * seq_len(ngrps)
    ord[want] &lt;- seq(to = ncols, by = 1, length.out = ngrps)
    out &lt;- out[, c(1:2, ord+2)]
    out
}
</code></pre>
<p>Which when used on <code>df3</code> and <code>df4</code> above gives:</p>
<pre><code>&gt; myMerge(df3, df4, ngrps = 3, ningrps = 2, sort = FALSE)
   lat lon    df3Var1   df3Var2    df4Var1    df3Var3   df3Var4   df4Var2
1    1   1 0.26550866 0.2059746 0.91287592 0.93470523 0.4820801 0.3390729
2    2   2 0.37212390 0.1765568 0.29360337 0.21214252 0.5995658 0.8394404
3    3   3 0.57285336 0.6870228 0.45906573 0.65167377 0.4935413 0.3466835
4    4   4 0.90820779 0.3841037 0.33239467 0.12555510 0.1862176 0.3337749
5    5   5 0.20168193 0.7698414 0.65087047 0.26722067 0.8273733 0.4763512
6    6   6 0.89838968 0.4976992 0.25801678 0.38611409 0.6684667 0.8921983
7    7   7 0.94467527 0.7176185 0.47854525 0.01339033 0.7942399 0.8643395
8    8   8 0.66079779 0.9919061 0.76631067 0.38238796 0.1079436 0.3899895
9    9   9 0.62911404 0.3800352 0.08424691 0.86969085 0.7237109 0.7773207
10  10  10 0.06178627 0.7774452 0.87532133 0.34034900 0.4112744 0.9606180
     df3Var5    df3Var6   df4Var3
1  0.8209463 0.47761962 0.4346595
2  0.6470602 0.86120948 0.7125147
3  0.7829328 0.43809711 0.3999944
4  0.5530363 0.24479728 0.3253522
5  0.5297196 0.07067905 0.7570871
6  0.7893562 0.09946616 0.2026923
7  0.0233312 0.31627171 0.7111212
8  0.4772301 0.51863426 0.1216919
9  0.7323137 0.66200508 0.2454885
10 0.6927316 0.40683019 0.1433044
</code></pre>
<p>Which is again what you wanted. You could fiddle with the function definition so that you don't need to specify both <code>ngrps</code> and <code>ningrps</code>, because one can be worked out from the other plus the number of non-key columns in <code>df3</code> (that is, <code>ncol(df3) - 2</code>). But I'll leave that as an exercise for the reader.</p>
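<p>For readers who want a starting point for that exercise, here is a minimal sketch rather than a definitive implementation. The name <code>myMerge2</code> and its <code>nkeys</code> argument are hypothetical (they are not part of the function above). The sketch assumes that the key columns are the only columns shared by <code>x</code> and <code>y</code>, so that <code>merge()</code> places them first, and that <code>y</code> supplies exactly one column per group. The number of groups is derived from <code>ningrps</code> and the number of non-key columns in <code>x</code>.</p>
<pre><code>## Hypothetical variant of myMerge(): the number of groups is derived,
## so only `ningrps` has to be supplied.
## Assumption: the `nkeys` key columns are the only columns shared by
## `x` and `y`, and `y` contributes one column per group.
myMerge2 &lt;- function(x, y, ningrps, nkeys = 2, ...) {
    nx &lt;- ncol(x) - nkeys          ## non-key columns coming from `x`
    ny &lt;- ncol(y) - nkeys          ## non-key columns coming from `y`
    stopifnot(nx %% ningrps == 0)  ## `x` must split into whole groups
    ngrps &lt;- nx %/% ningrps        ## derived number of groups
    stopifnot(ny == ngrps)         ## one `y` column per group
    out &lt;- merge(x, y, ...)
    step &lt;- ningrps + 1            ## width of one group in the result
    ord &lt;- numeric(length = nx + ny)
    ## positions of the columns from `x`
    want &lt;- rep(seq_len(ningrps), ngrps) +
        rep(seq(from = 0, by = step, length.out = ngrps), each = ningrps)
    ord[want] &lt;- seq_len(nx)
    ## positions of the columns from `y`
    ord[step * seq_len(ngrps)] &lt;- nx + seq_len(ny)
    out[, c(seq_len(nkeys), ord + nkeys)]
}
</code></pre>
<p>Under those assumptions, <code>myMerge2(df3, df4, ningrps = 2, sort = FALSE)</code> should produce the same column order as the <code>myMerge()</code> call above, because the index vector it builds is again <code>1 2 7 3 4 8 5 6 9</code>.</p>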
 
