  1. POFreqDist using NLTK
<p>I'm trying to get a frequency distribution of a set of documents using Python. My code isn't working for some reason and is producing this error:</p>
<pre><code>Traceback (most recent call last):
  File "C:\Documents and Settings\aschein\Desktop\freqdist", line 32, in &lt;module&gt;
    fd = FreqDist(corpus_text)
  File "C:\Python26\lib\site-packages\nltk\probability.py", line 104, in __init__
    self.update(samples)
  File "C:\Python26\lib\site-packages\nltk\probability.py", line 472, in update
    self.inc(sample, count=count)
  File "C:\Python26\lib\site-packages\nltk\probability.py", line 120, in inc
    self[sample] = self.get(sample,0) + count
TypeError: unhashable type: 'list'
</code></pre>
<p>Can you help?</p>
<p>This is the code so far:</p>
<pre><code>import os
import nltk
from nltk.probability import FreqDist

#The stop-words list
stopwords_doc = open("C:\\Documents and Settings\\aschein\\My Documents\\stopwords.txt").read()
stopwords_list = stopwords_doc.split()
stopwords = nltk.Text(stopwords_list)

corpus = []

#Directory of documents
directory = "C:\\Documents and Settings\\aschein\\My Documents\\comments"
listing = os.listdir(directory)

#Append all documents in directory into a single 'document' (list)
for doc in listing:
    doc_name = "C:\\Documents and Settings\\aschein\\My Documents\\comments\\" + doc
    input = open(doc_name).read()
    input = input.split()
    corpus.append(input)

#Turn list into Text form for NLTK
corpus_text = nltk.Text(corpus)

#Remove stop-words
for w in corpus_text:
    if w in stopwords:
        corpus_text.remove(w)

fd = FreqDist(corpus_text)
</code></pre>
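The traceback suggests that each sample handed to FreqDist is itself a list. That would follow from `corpus.append(input)`, which stores every document's token list as a single element, so `corpus` ends up as a list of lists, and lists cannot be used as dictionary keys. Below is a minimal sketch of one possible fix, not a definitive answer: flatten with `extend`, and filter stop words into a new list instead of removing items while iterating. To stay runnable without NLTK or the files from the question, it uses `collections.Counter` as a stand-in for `FreqDist`, and the `documents` and `stopwords` values are made-up sample data.

```python
from collections import Counter

# Hypothetical sample data standing in for the files read from disk.
documents = ["the cat sat on the mat", "the dog sat"]
stopwords = {"the", "on"}

# Reproduce the problem: append() stores each token list as one element.
nested = []
for doc in documents:
    nested.append(doc.split())
try:
    Counter(nested)
    error = None
except TypeError as exc:
    error = str(exc)  # lists are unhashable, so they cannot be counted

# Possible fix: extend() adds the individual tokens, giving a flat list.
corpus = []
for doc in documents:
    corpus.extend(doc.split())

# Build a new filtered list instead of removing items while iterating.
filtered = [w for w in corpus if w not in stopwords]

fd = Counter(filtered)  # stand-in for nltk.probability.FreqDist
print(fd.most_common(2))
```

With NLTK installed, passing the same flat `filtered` list to `FreqDist` should work in the same way, since `FreqDist` counts hashable samples such as strings. Keeping the stop words in a `set` also makes each membership test cheap.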