Binning variable by set number of observations
<p>Quick question. I am binning a variable in a number of different ways for exploratory data analysis. Let's say I have a variable called <code>var</code> in data.frame <code>df</code>.</p>
<pre><code>df &lt;- data.frame(var = c(1,2,8,9,4,5,6,3,6,9,3,4,5,6,7,8,9,2,3,4,6,1,2,3,7,8,9,0))
</code></pre>
<p>So far, I've employed the following approaches (code below):</p>
<pre><code># Divide into quartiles
df$var_quartile &lt;- with(df, cut(var, breaks=quantile(var, probs=seq(0,1, by=.25)), include.lowest=TRUE))
# Values of var_quartile
# [0,3],[0,3],(7.25,9],(7.25,9],(3,5],(3,5],(5,7.25],[0,3],(5,7.25],(7.25,9],[0,3],(3,5],(3,5],(5,7.25],(5,7.25],(7.25,9],(7.25,9],[0,3],[0,3],(3,5],(5,7.25],[0,3],[0,3],[0,3]

# Bin into 2 equal-width intervals
df$var_bin &lt;- cut(df[['var']], 2, include.lowest=TRUE, labels=1:2)
# Values of var_bin
# 1 1 2 2 1 2 2 1 2 2 1 1 2 2 2 2 2 1 1 1 2 1 1 1 2 2 2 1
</code></pre>
<p>The last thing that I'd like to do is bin the variable into sections of 10 observations after it has been sorted in ascending order. This is the same idea as splitting at the median (counting up to the middle observation), only I want to count in 10-observation increments.</p>
<p>Using my example, this would split <code>var</code> into the following sections:</p>
<pre><code>0,1,1,2,2,2,3,3,3,3
4,4,4,5,5,6,6,6,6,7
7,8,8,8,9,9,9,9
</code></pre>
<p><strong>N.B. -- I need to run this operation on very large datasets (usually 3-6 million observations in wide form).</strong></p>
<p>How do I do this? Thanks!</p>
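One possible way to get the 10-observation bins described above, offered as a minimal sketch rather than a definitive answer: rank the observations by sorted position and divide the rank by the bin size. The names `bin_size`, `pos`, `var_obs_bin`, and `bins` are illustrative and not from the original post, and breaking ties by original row order is an assumption about the desired behaviour.

```r
# Example data from the question (28 observations).
df <- data.frame(var = c(1,2,8,9,4,5,6,3,6,9,3,4,5,6,7,8,9,2,3,4,6,1,2,3,7,8,9,0))

bin_size <- 10  # observations per bin (illustrative name)

# Sorted position of each observation, from 1 to n. Ties are broken by
# original row order (an assumption), so every position is unique.
pos <- rank(df$var, ties.method = "first")

# Positions 1-10 fall in bin 1, 11-20 in bin 2, and so on.
df$var_obs_bin <- ceiling(pos / bin_size)

# For inspection only: the sorted values grouped by bin.
bins <- split(sort(df$var), ceiling(seq_along(df$var) / bin_size))
```

With 28 observations this gives two full bins of 10 and a final bin holding the remaining 8. `rank()` sorts once and the rest is vectorised arithmetic, so the approach should be workable on a few million rows, though that has not been benchmarked here.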