Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    primarykey
    data
    text
    <p>I am guessing that your problem is something like...</p> <p>"Given a list of unlabeled examples, sort the list by how much the predictive accuracy of the model would improve if the user labelled the example and added it to the training set."</p> <p>If this is the case, I don't think mutual information is the right thing to use because you can't calculate MI between two instances. The definition of MI is in terms of random variables and an individual instance isn't a random variable, it's just a value.</p> <p>The features and the class label can be though of as random variables. That is, they have a distribution of values over the whole data set. You can calculate the mutual information between two features, to see how 'redundant' one feature is given the other one, or between a feature and the class label, to get an idea of how much that feature might help prediction. This is how people usually use mutual information in a supervised learning problem. </p> <p>I think ferdystschenko's suggestion that you look at active learning methods is a good one.</p> <hr> <p>In response to Grundlefleck's comment, I'll go a bit deeper into terminology by using his idea of a Java object analogy...</p> <p>Collectively, we have used the term 'instance', 'thing', 'report' and 'example' to refer to the object being clasified. Let's think of these things as instances of a Java class (I've left out the boilerplate constructor):</p> <pre><code> class Example { String f1; String f2; } Example e1 = new Example("foo", "bar"); Example e2 = new Example("foo", "baz"); </code></pre> <p>The usual terminology in machine learning is that e1 is an <em>example</em>, that all examples have two <em>features</em> f1 and f2 and that for e1, f1 takes the value 'foo' and f2 takes the value 'bar'. A collection of examples is called a <em>data set</em>.</p> <p>Take all the values of f1 for all examples in the data set, this is a list of strings, it can also be thought of as a distribution. We can think of the feature as a <em>random variable</em> and that each value in the list is a sample taken from that random variable. So we can, for example, calculate the MI between f1 and f2. The pseudocode would be something like: <code><pre> mi = 0 for each value x taken by f1: { sum = 0 for each value y taken by f2: { p_xy = number of examples where f1=x and f2=y p_x = number of examples where f1=x p_y = number of examples where f2=y sum += p_xy * log(p_xy/(p_x*p_y)) } mi += sum } </code></pre></p> <p>However you can't calculate MI between e1 and e2, it's just not defined that way.</p>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. VO
      singulars
      1. This table or related slice is empty.
    2. VO
      singulars
      1. This table or related slice is empty.
    3. VO
      singulars
      1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload