Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    text
    copied!<p>If your spreadsheet is full of formulas, you might need to convert those to values to get them to be read in easily. Otherwise, I would suggest using a <a href="http://cwestblog.com/2013/04/12/batch-excel-to-csv-converter-application/" rel="nofollow">tool like this one</a> (among others out there) to convert all the sheets in a workbook to CSV files and work from there.</p> <p>If you've gotten that far, here's something that can be tried for the "reshaping" part of your question. Here, we'll assume that "A" actually represents a CSV file, the contents of which are the six lines shown as sample data in your question:</p> <pre><code>## Create some sample data A &lt;- tempfile() writeLines(sep="\n", con = A, text = c("time, USA, USA, USA", ", MD, CA, PX", ", pork, peas, nuts", "jan-11, 4, 2, 2", "feb-11, 4, 9, 3", "mar-11, 8, 8, 3")) </code></pre> <p>The first thing I would do is read in the headers and the data separately. To read the headers separately, use <code>nrows</code> to specify the number of rows that contain the header information. To read the data separately, specify <code>skip</code> to skip the header rows.</p> <pre><code>B &lt;- read.csv(A, header = FALSE, skip = 3, strip.white = TRUE) Bnames &lt;- read.csv(A, header = FALSE, nrows = 3, strip.white = TRUE) </code></pre> <p>Use <code>apply</code> to <code>paste</code> the header rows together to form the names for the resulting <code>data.frame</code>:</p> <pre><code>names(B) &lt;- apply(Bnames, 2, function(x) paste(x[x != ""], collapse = "_")) B # time USA_MD_pork USA_CA_peas USA_PX_nuts # 1 jan-11 4 2 2 # 2 feb-11 4 9 3 # 3 mar-11 8 8 3 </code></pre> <p>Now comes the part of converting the data from a "wide" to a "long" format. There are many ways to do this, some using base R too, but the most direct is to use <code>melt</code> and <code>colsplit</code> from the "reshape2" package:</p> <pre><code>library(reshape2) BL &lt;- melt(B, id.vars="time") cbind(BL[c("time", "value")], colsplit(BL$variable, "_", c("country", "state", "product"))) # time value country state product # 1 jan-11 4 USA MD pork # 2 feb-11 4 USA MD pork # 3 mar-11 8 USA MD pork # 4 jan-11 2 USA CA peas # 5 feb-11 9 USA CA peas # 6 mar-11 8 USA CA peas # 7 jan-11 2 USA PX nuts # 8 feb-11 3 USA PX nuts # 9 mar-11 3 USA PX nuts </code></pre>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload