  1. Mutual Information and Chi Square relationship
<p>I've used the following code to compute the Mutual Information and Chi Square values for feature selection in Sentiment Analysis.</p>
<pre><code>import math

N = N00 + N01 + N10 + N11  # total number of observations

MI = ((N11/N)*math.log((N*N11)/((N11+N10)*(N11+N01)), 2)
      + (N01/N)*math.log((N*N01)/((N01+N00)*(N11+N01)), 2)
      + (N10/N)*math.log((N*N10)/((N10+N11)*(N00+N10)), 2)
      + (N00/N)*math.log((N*N00)/((N10+N00)*(N01+N00)), 2))
</code></pre>
<p>where N11, N01, N10 and N00 are the observed frequencies of the two features in my data set.</p>
<p>Note: I am trying to calculate the Mutual Information and Chi Square values between two features, not the Mutual Information between a particular feature and a class. I'm doing this so I'll know whether the two features are related in any way.</p>
<p>The Chi Square formula I've used is:</p>
<pre><code>E00 = N*((N00+N10)/N)*((N00+N01)/N)
E01 = N*((N01+N11)/N)*((N01+N00)/N)
E10 = N*((N10+N11)/N)*((N10+N00)/N)
E11 = N*((N11+N10)/N)*((N11+N01)/N)

chi = (((N11-E11)**2)/E11 + ((N00-E00)**2)/E00
       + ((N01-E01)**2)/E01 + ((N10-E10)**2)/E10)
</code></pre>
<p>where E00, E01, E10 and E11 are the expected frequencies.</p>
<p>By the definition of Mutual Information, a low value should mean that one feature gives me no information about the other, and by the definition of Chi Square, a low value means that the two features must be independent.</p>
<p>But for a certain pair of features, I got a Mutual Information score of 0.00416 and a Chi Square value of 4373.9. This doesn't make sense to me: the Mutual Information score indicates the features aren't closely related, but the Chi Square value seems high enough to indicate they aren't independent either. I think my interpretation is going wrong somewhere.</p>
<p>The values I got for the observed frequencies are:</p>
<pre><code>N00 = 312412
N01 = 276116
N10 = 51120
N11 = 68846
</code></pre>
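<p>One way to see how the two numbers can coexist is to compute both from the same 2x2 table and compare Chi Square with the log-likelihood ratio statistic G = 2 * N * MI (with MI measured in nats). The following is a minimal sketch, not the asker's code: it assumes Python 3 (true division), and the function name <code>mi_and_chi2</code> is hypothetical. It suggests that Mutual Information is a per-observation average, while Chi Square grows with the sample size N, so a tiny MI can still correspond to a very large Chi Square when N is large.</p>

```python
import math


def mi_and_chi2(n00, n01, n10, n11):
    """Return (MI in bits, Pearson chi-square) for a 2x2 table of counts.

    Hypothetical helper: n_ab is the count of observations where the
    first feature takes value a and the second feature takes value b.
    """
    n = n00 + n01 + n10 + n11
    observed = {(0, 0): n00, (0, 1): n01, (1, 0): n10, (1, 1): n11}
    row = {0: n00 + n01, 1: n10 + n11}  # marginals of the first feature
    col = {0: n00 + n10, 1: n01 + n11}  # marginals of the second feature

    mi = 0.0
    chi2 = 0.0
    for (i, j), o in observed.items():
        e = row[i] * col[j] / n  # expected count under independence
        if o > 0:  # an empty cell contributes nothing to MI
            mi += (o / n) * math.log2(o / e)
        chi2 += (o - e) ** 2 / e
    return mi, chi2


# Observed frequencies taken from the question.
n00, n01, n10, n11 = 312412, 276116, 51120, 68846
n = n00 + n01 + n10 + n11

mi, chi2 = mi_and_chi2(n00, n01, n10, n11)

# G statistic: 2 * N * MI, with MI converted from bits to nats.
g = 2 * n * mi * math.log(2)

print(f"N = {n}")
print(f"MI (bits) = {mi:.5f}")
print(f"chi-square = {chi2:.1f}")
print(f"G = 2*N*MI = {g:.1f}")
```

<p>Under these assumptions, G and Chi Square should come out close to each other for this table, because both measure the same departure from independence scaled by N. Whether that departure is large enough to matter for feature selection is a separate question from whether it is statistically detectable.</p>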
 
