
NLTK classify interface using trained classifier
<p>I have this little chunk of code I found <a href="http://streamhacker.com/2010/05/10/text-classification-sentiment-analysis-naive-bayes-classifier/" rel="nofollow">here</a>:</p>
<pre><code>import nltk.classify.util
from nltk.classify import NaiveBayesClassifier
from nltk.corpus import movie_reviews
from nltk.corpus import stopwords


def word_feats(words):
    # Bag-of-words featureset: every word that occurs maps to True.
    return dict([(word, True) for word in words])


negids = movie_reviews.fileids('neg')
posids = movie_reviews.fileids('pos')

# One (featureset, label) pair per review.
negfeats = [(word_feats(movie_reviews.words(fileids=[f])), 'neg') for f in negids]
posfeats = [(word_feats(movie_reviews.words(fileids=[f])), 'pos') for f in posids]

# Train on the first 3/4 of each class and test on the rest.
# Integer division keeps the cutoffs usable as slice indices.
negcutoff = len(negfeats) * 3 // 4
poscutoff = len(posfeats) * 3 // 4

trainfeats = negfeats[:negcutoff] + posfeats[:poscutoff]
testfeats = negfeats[negcutoff:] + posfeats[poscutoff:]
print('train on %d instances, test on %d instances' % (len(trainfeats), len(testfeats)))

classifier = NaiveBayesClassifier.train(trainfeats)
print('accuracy:', nltk.classify.util.accuracy(classifier, testfeats))
classifier.show_most_informative_features()
</code></pre>
<p>But how can I classify a single word that might be in the corpus? This:</p>
<pre><code>classifier.classify('magnificent')
</code></pre>
<p>doesn't work. Does it need some kind of object?</p>
<p>Thank you very much.</p>
<p>EDIT: Thanks to @unutbu's feedback, some digging <a href="http://nltk.googlecode.com/svn/trunk/doc/api/nltk.probability.ProbDistI-class.html#samples" rel="nofollow">here</a>, and the comments on the original post, I found that the classifier expects a featureset built the same way as the training data, not a bare string. The following yields 'pos' or 'neg' (this one is a 'pos'):</p>
<pre><code>print(classifier.classify(word_feats(['magnificent'])))
</code></pre>
<p>and this yields the probability the classifier assigns to a given label for the word, here 'neg':</p>
<pre><code>print(classifier.prob_classify(word_feats(['magnificent'])).prob('neg'))
</code></pre>
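The same idea extends from one word to a whole sentence: build the featureset from the text, then hand it to the classifier. The sketch below is only an illustration under stated assumptions. `text_feats`, `classify_text` and `label_probs` are hypothetical helper names that do not come from NLTK or the original post, and the regex tokenizer is a simplification that assumes lowercase word tokens are a reasonable match for the training corpus. The only classifier methods it relies on are `classify` and `prob_classify`, with `samples()` and `prob()` on the returned distribution.

```python
import re


def word_feats(words):
    """Bag-of-words featureset in the form used above: {word: True}."""
    return {word: True for word in words}


def text_feats(text):
    """Hypothetical helper: turn raw text into a featureset.

    Lowercases the text and keeps runs of letters and apostrophes.
    This is a simplification, not NLTK's own tokenizer.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    return word_feats(tokens)


def classify_text(classifier, text):
    """Return the label for raw text, given any object with classify(featureset)."""
    return classifier.classify(text_feats(text))


def label_probs(classifier, text):
    """Return {label: probability} using the classifier's prob_classify(featureset)."""
    dist = classifier.prob_classify(text_feats(text))
    return {label: dist.prob(label) for label in dist.samples()}
```

With the trained `classifier` from the question, `classify_text(classifier, "A magnificent, moving film")` would return a label, and `label_probs(classifier, "magnificent")` would return the probability for each label. Words the classifier never saw during training carry no evidence, so very short inputs may fall back to the class priors.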
 
