<p>Ben's approach works great, but here is another approach using <code>strapplyc</code>, <code>gsubfn</code> or <code>strapply</code> from the <a href="http://gsubfn.googlecode.com" rel="nofollow">gsubfn package</a>.</p>
<p>First read in the data and set <code>col.names</code>, the separator and the pattern to use:</p>
<pre><code>r &lt;- readLines(textConnection(
"Ant farm 45 67 89
Cookie 5 43 21
Mouse hole 5 87 32
Ferret 3 56 87"))

library(gsubfn)
col.names &lt;- c("group", "x1", "x2", "x3")
sep &lt;- ","  # if comma can appear in fields use something else
pat &lt;- "^(.*) +(\\d+) +(\\d+) +(\\d+) *$"
</code></pre>
<p><strong>1) strapplyc</strong></p>
<pre><code>tmp &lt;- sapply(strapplyc(r, pat), paste, collapse = sep)
read.table(text = tmp, col.names = col.names, as.is = TRUE, sep = sep)
</code></pre>
<p><strong>2) gsubfn</strong> Alternatively, use the same code but replace the last two statements with:</p>
<pre><code>tmp &lt;- gsubfn(pat, ... ~ paste(..., sep = sep), r)
read.table(text = tmp, col.names = col.names, as.is = TRUE, sep = sep)
</code></pre>
<p><strong>3) strapply</strong> This one and the variation that follows do not require that <code>sep</code> be defined.</p>
<pre><code>library(data.table)
tmp &lt;- strapply(r, pat,
  ~ data.table(
      group = group,
      x1 = as.numeric(x1),
      x2 = as.numeric(x2),
      x3 = as.numeric(x3)
  ))
rbindlist(tmp)
</code></pre>
<p><strong>3a)</strong> This one involves some extra manipulation, so we might favor one of the other solutions instead, but here it is for completeness. The <code>combine=list</code> prevents the individual outputs from being munged and the <code>simplify=c</code> removes the extra layer that <code>combine=list</code> added. Finally we <code>rbind</code> everything together.</p>
<pre><code>tmp &lt;- strapply(r, pat,
  ~ data.frame(
      group = group,
      x1 = as.numeric(x1),
      x2 = as.numeric(x2),
      x3 = as.numeric(x3),
      stringsAsFactors = FALSE
  ), combine = list, simplify = c)
do.call(rbind, tmp)
</code></pre>
<p><strong>4) read.pattern</strong> The development version of the gsubfn package has a new function <a href="https://gsubfn.googlecode.com/svn/trunk/R/read.pattern.R" rel="nofollow">read.pattern</a> that is particularly direct for this type of problem:</p>
<pre><code>library(devtools)  # source_url
source_url("https://gsubfn.googlecode.com/svn/trunk/R/read.pattern.R")  # from dev repo
read.pattern(text = r, pattern = pat, col.names = col.names, as.is = TRUE)
</code></pre>
<p><strong>Note:</strong> These approaches have a couple of advantages (though Ben's approach could be modified for these cases as well). They take anything before the last 3 numbers and use it as the first field, so if the first field has 3 or more words, or one of the "words" is a set of digits (e.g. "17 inch ant farm"), they will still work.</p>
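<p>To check the claim in the note without installing anything, here is a minimal sketch that applies the same pattern with base R's <code>strcapture</code> (from <code>utils</code>, R 3.4.0 or later) in place of the gsubfn functions. The input vector <code>r2</code> and the prototype <code>proto</code> are made up for illustration and are not part of the original data; the "17 inch ant farm" line is the case the note describes.</p>

```r
# Same pattern as above: everything before the last three integers is the group.
pat <- "^(.*) +(\\d+) +(\\d+) +(\\d+) *$"

# Hypothetical input lines, including a group name that begins with digits.
r2 <- c("Ant farm 45 67 89",
        "17 inch ant farm 1 2 3",
        "Mouse hole 5 87 32")

# The prototype supplies the column names and types of the result.
proto <- data.frame(group = character(),
                    x1 = integer(),
                    x2 = integer(),
                    x3 = integer(),
                    stringsAsFactors = FALSE)

# strcapture() extracts one column per capture group in the pattern.
DF <- strcapture(pat, r2, proto, perl = TRUE)
print(DF)
```

<p>Because <code>(.*)</code> is greedy and the pattern is anchored at the end of the line, the leading "17" stays in the <code>group</code> field and only the final three numbers go into <code>x1</code>, <code>x2</code> and <code>x3</code>.</p>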
 
