Quantiles by factor levels in R
<p>I have a data frame and I'm trying to create a new variable in the data frame that has the quantiles of a continuous variable <code>var1</code>, for each level of a factor <code>strata</code>.</p>
<pre><code># some data
set.seed(472)
dat &lt;- data.frame(var1 = rnorm(50, 10, 3)^2,
                  strata = factor(sample(LETTERS[1:5], size = 50, replace = TRUE))
                  )

# function to get quantiles
qfun &lt;- function(x, q = 5) {
    quantile &lt;- cut(x, breaks = quantile(x, probs = 0:q/q),
                    include.lowest = TRUE, labels = 1:q)
    quantile
}
</code></pre>
<p>I tried two methods, neither of which produces a usable result. First, I tried using <code>aggregate</code> to apply <code>qfun</code> to each level of <code>strata</code>:</p>
<pre><code>qdat &lt;- with(dat, aggregate(var1, list(strata), FUN = qfun))
</code></pre>
<p>This returns the quantiles by factor level, but the output is hard to coerce back into a data frame (e.g., using <code>unlist</code> does not line the new variable values up with the correct rows in the data frame).</p>
<p>A second approach was to do this in steps:</p>
<pre><code>tmp1 &lt;- with(dat, split(var1, strata))
tmp2 &lt;- lapply(tmp1, qfun)
tmp3 &lt;- unlist(tmp2)
dat$quintiles &lt;- tmp3
</code></pre>
<p>Again, this calculates the quantiles correctly for each factor level, but, as with <code>aggregate</code>, the values are not in the same order as the rows of the data frame. We can check this by putting the quantile "bins" into the data frame.</p>
<pre><code># get quantile bins
qfun2 &lt;- function(x, q = 5) {
    quantile &lt;- cut(x, breaks = quantile(x, probs = 0:q/q),
                    include.lowest = TRUE)
    quantile
}

tmp11 &lt;- with(dat, split(var1, strata))
tmp22 &lt;- lapply(tmp11, qfun2)
tmp33 &lt;- unlist(tmp22)
dat$quintiles2 &lt;- tmp33
</code></pre>
<p>Many of the values of <code>var1</code> fall outside the bins in <code>quintiles2</code>. I feel like I'm missing something simple. Any suggestions would be greatly appreciated.</p>
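The misalignment described above arises because `split()` followed by `unlist()` returns values grouped by stratum, not in the original row order. One possible way around this, offered only as a sketch, is base R's `ave()`, which applies a function within each group and writes the results back to the original positions. The data below are an assumption made for illustration: the strata are interleaved deterministically (10 rows each) rather than sampled as in the question, so that every group is large enough for `quantile()` to give unique breaks. The wrapper `qfun_int` is a hypothetical helper that returns plain integers, because `ave()` fills a numeric vector rather than a factor.

```r
# Illustrative data (assumption): strata interleaved A, B, C, D, E, A, ...
set.seed(472)
dat <- data.frame(
  var1   = rnorm(50, 10, 3)^2,
  strata = factor(rep(LETTERS[1:5], times = 10))
)

# Same binning function as in the question
qfun <- function(x, q = 5) {
  cut(x, breaks = quantile(x, probs = 0:q/q),
      include.lowest = TRUE, labels = 1:q)
}

# Hypothetical wrapper: return the bin number as an integer so that
# ave() can store it in a numeric vector
qfun_int <- function(x, q = 5) as.integer(qfun(x, q))

# ave() computes qfun_int within each stratum and keeps row order
dat$quintiles <- ave(dat$var1, dat$strata, FUN = qfun_int)

head(dat)
```

Because the result stays aligned with the rows of `dat`, it can be assigned directly as a new column. If a factor is wanted rather than a number, the column can be converted afterwards with `factor(dat$quintiles)`.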
 
