strsplit into data.frame with incomplete input
<p>I am trying to split a vector of strings into a <code>data.frame</code>. For a fixed column order this isn't a problem (e.g. as described <a href="https://stackoverflow.com/questions/7069076/split-column-at-delimiter-in-data-frame">here</a>), but in my particular case not every string contains all the columns of the future data frame. This is what the output should look like for a toy input:</p>
<pre><code>input &lt;- c("an=1;bn=3;cn=45", "bn=3.5;cn=76", "an=2;dn=5")
res &lt;- do.something(input)
&gt; res
     an  bn cn dn
[1,]  1   3 45 NA
[2,] NA 3.5 76 NA
[3,]  2  NA NA  5
</code></pre>
<p>I am now looking for a function <code>do.something</code> that can do this in an efficient way. My naive solution at the moment would be to loop over the input objects, <code>strsplit</code> them on <code>;</code>, then <code>strsplit</code> the pieces again on <code>=</code>, and then fill the <code>data.frame</code> result by result. Is there a more R-like way to do this? I am afraid that going element by element would take quite a long time for a long vector <code>input</code>.</p>
<p>EDIT: Just for completeness, my naive solution looks like this:</p>
<pre><code>do.something &lt;- function(x){
  temp &lt;- strsplit(x,";")
  temp2 &lt;- sapply(temp,strsplit,"=")
  ul.temp2 &lt;- unlist(temp2)
  label &lt;- sort(unique(ul.temp2[seq(1,length(ul.temp2),2)]))
  res &lt;- data.frame(matrix(NA, nrow = length(x), ncol = length(label)))
  colnames(res) &lt;- label
  for(i in 1:length(temp)){
    for(j in 1:length(label)){
      curInfo &lt;- unlist(temp2[[i]])
      if(sum(is.element(curInfo,label[j]))&gt;0){
        res[i,j] &lt;- curInfo[which(curInfo==label[j])+1]
      }
    }
  }
  res
}
</code></pre>
<p>EDIT2: Unfortunately my large input data looks like this (entries without '=' are possible):</p>
<pre><code>input &lt;- c("an=1;bn=3;cn=45", "an;bn=3.5;cn=76", "an=2;dn=5")
</code></pre>
<p>so I cannot compare the given answers on my problem at hand. My naive solution for that case is</p>
<pre><code>do.something &lt;- function(x){
  temp &lt;- strsplit(x,";")
  tempNames &lt;- sort(unique(sapply(strsplit(unlist(temp),"="),"[",1)))
  res &lt;- data.frame(matrix(NA, nrow = length(x), ncol = length(tempNames)))
  colnames(res) &lt;- tempNames
  for(i in 1:length(temp)){
    curSplit &lt;- strsplit(unlist(temp[[i]]),"=")
    curNames &lt;- sapply(curSplit,"[",1)
    curValues &lt;- sapply(curSplit,"[",2)
    for(j in 1:length(tempNames)){
      if(is.element(colnames(res)[j],curNames)){
        res[i,j] &lt;- curValues[curNames==colnames(res)[j]]
      }
    }
  }
  res
}
</code></pre>
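For illustration only, the split-on-`;`-then-`=` idea described in the question can be sketched without the nested loops by filling a matrix through a two-column (row, column) index. This is a minimal sketch, not a definitive answer: the function name `split_to_frame` and its arguments `pair_sep` and `kv_sep` are hypothetical, it assumes no `NA` entries in the input, and it assumes that an entry without `=` should produce an `NA` in its column, as in the EDIT2 example.

```r
# Hypothetical helper (not from the original post): a vectorized variant of
# the naive approach, using base R only.
split_to_frame <- function(x, pair_sep = ";", kv_sep = "=") {
  # Split every string into its "key=value" (or bare "key") pieces.
  pairs <- strsplit(x, pair_sep, fixed = TRUE)
  # Remember which input element each piece came from.
  row_id <- rep(seq_along(pairs), lengths(pairs))
  flat <- unlist(pairs)

  # Position of the first separator in each piece; -1 when there is none.
  pos <- as.integer(regexpr(kv_sep, flat, fixed = TRUE))
  has_value <- pos > 0
  keys <- ifelse(has_value, substr(flat, 1, pos - 1), flat)
  values <- ifelse(has_value, substr(flat, pos + 1, nchar(flat)), NA_character_)

  # One column per distinct key, pre-filled with NA.
  labels <- sort(unique(keys))
  res <- matrix(NA_character_, nrow = length(x), ncol = length(labels),
                dimnames = list(NULL, labels))
  # Fill all cells at once via a (row, column) index matrix.
  res[cbind(row_id, match(keys, labels))] <- values
  as.data.frame(res, stringsAsFactors = FALSE)
}

input <- c("an=1;bn=3;cn=45", "an;bn=3.5;cn=76", "an=2;dn=5")
res <- split_to_frame(input)
```

The columns come back as character vectors; if numeric columns are wanted, something like `res[] <- lapply(res, type.convert, as.is = TRUE)` could be applied afterwards. Whether this is actually faster than the loop on a given data set would need to be measured.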