<blockquote> <p>An explanation that uses the words 'error', 'summation', or 'permutated' would be less helpful than a simpler explanation that didn't involve any discussion of how random forests work.</p> <p>Like if I wanted someone to explain to me how to use a radio, I wouldn't expect the explanation to involve how a radio converts radio waves into sound.</p> </blockquote>
<p>How would you explain what the numbers in WKRP 100.5 FM "mean" without going into the pesky technical details of wave frequencies? Frankly, the parameters and related performance issues of Random Forests are difficult to get your head around even if you understand some of the technical terms.</p>
<p>Here's my shot at some answers:</p>
<blockquote> <p>-mean raw importance score of variable x for class 0</p> <p>-mean raw importance score of variable x for class 1</p> </blockquote>
<p>Simplifying from the Random Forest <a href="http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm#varimp" rel="noreferrer">web page</a>, the raw importance score measures how much more helpful than random a particular predictor variable is in successfully classifying data.</p>
<blockquote> <p>-MeanDecreaseAccuracy</p> </blockquote>
<p>I think this is only in the <a href="http://cran.r-project.org/web/packages/randomForest/index.html" rel="noreferrer">R module</a>, and I believe it measures how much including this predictor in the model reduces classification error.</p>
<blockquote> <p>-MeanDecreaseGini</p> </blockquote>
<p><a href="http://en.wikipedia.org/wiki/Gini_coefficient" rel="noreferrer">Gini</a> is defined as "inequality" when used to describe a society's distribution of income, or as a measure of "node impurity" in tree-based classification. A low Gini (i.e. a higher decrease in Gini) means that a particular predictor variable plays a greater role in partitioning the data into the defined classes. It's a hard one to describe without talking about the fact that data in classification trees are split at individual nodes based on the values of predictors. I'm not so clear on how this translates into better performance.</p>
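<p>To make the "decrease in Gini" idea a bit more concrete, here is a minimal sketch of what happens at a single split in a single tree. It is not the randomForest implementation: the function names (<code>gini_impurity</code>, <code>gini_decrease</code>), the toy data, and the threshold of 5 are all made up for illustration. As far as I understand it, MeanDecreaseGini adds up decreases like this over every split that uses a given predictor and averages them over the trees in the forest.</p>

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a node: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def gini_decrease(values, labels, threshold):
    """Drop in impurity when a node is split at `value <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    # Each child's impurity is weighted by the share of observations it receives.
    children = (len(left) / n) * gini_impurity(left) \
        + (len(right) / n) * gini_impurity(right)
    return gini_impurity(labels) - children


# Toy data (hypothetical): six observations, two classes.
labels = [0, 0, 0, 1, 1, 1]
informative = [1, 2, 3, 7, 8, 9]  # splitting at 5 separates the classes cleanly
noisy = [1, 8, 3, 2, 9, 7]        # splitting at 5 leaves both sides mixed

print("informative predictor:", gini_decrease(informative, labels, 5))
print("noisy predictor:      ", gini_decrease(noisy, labels, 5))
```

<p>The informative predictor produces the larger decrease, because both child nodes end up containing a single class. That is the sense in which a higher decrease in Gini points to a predictor that does more of the work of partitioning the data.</p>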