Python: Dictionary of list of lists
<pre><code>def makecounter():
    return collections.defaultdict(int)

class RankedIndex(object):
    def __init__(self):
        self._inverted_index = collections.defaultdict(list)
        self._documents = []
        self._inverted_index = collections.defaultdict(makecounter)

    def index_dir(self, base_path):
        num_files_indexed = 0
        allfiles = os.listdir(base_path)
        self._documents = os.listdir(base_path)
        num_files_indexed = len(allfiles)
        docnumber = 0
        self._inverted_index = collections.defaultdict(list)
        docnumlist = []
        for file in allfiles:
            self.documents = [base_path+file] #list of all text files
            f = open(base_path+file, 'r')
            lines = f.read()
            tokens = self.tokenize(lines)
            docnumber = docnumber + 1
            for term in tokens:
                if term not in sorted(self._inverted_index.keys()):
                    self._inverted_index[term] = [docnumber]
                    self._inverted_index[term][docnumber] +=1
                else:
                    if docnumber not in self._inverted_index.get(term):
                        docnumlist = self._inverted_index.get(term)
                        docnumlist = docnumlist.append(docnumber)
            f.close()
        print '\n \n'
        print 'Dictionary contents: \n'
        for term in sorted(self._inverted_index):
            print term, '-&gt;', self._inverted_index.get(term)
        return num_files_indexed
        return 0
</code></pre>
<p>I get an IndexError when executing this code: list index out of range.</p>
<p>The code above builds a dictionary index that stores each term as a key and, as its value, the list of document numbers in which the term occurs. For example, if the term 'cat' occurs in documents 1.txt, 5.txt and 7.txt, the dictionary will have: cat &lt;- [1,5,7]</p>
<p>Now I have to modify it to add the term frequency. So if the word cat occurs twice in document 1, three times in document 5 and once in document 7, the expected result is a list of lists in a dict: term &lt;- [[docnumber, term freq], [docnumber, term freq]], e.g. cat &lt;- [[1,2],[5,3],[7,1]]</p>
<p>I played around with the code, but nothing works. I have no idea how to modify this data structure to achieve the above.</p>
<p>Thanks in advance.</p>
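For reference, the IndexError most likely comes from the pair of lines that assign `[docnumber]` to a term and then index that one-element list at position `docnumber`, which is always 1 or higher. Below is a minimal sketch of one way to get the list-of-lists result described above. It is written for Python 3 (the code in the question is Python 2), and `build_index` and its `docs` argument are hypothetical names: `docs` stands in for the already-tokenized files that `index_dir` reads from disk. It counts into a nested `defaultdict`, much like the `makecounter` helper in the question suggests, and only converts to `[docnumber, term freq]` pairs at the end.

```python
import collections


def build_index(docs):
    """Map each term to a list of [docnumber, term_frequency] pairs.

    `docs` is a hypothetical list of token lists, one per document,
    standing in for the files that index_dir() reads and tokenizes.
    """
    # term -> {docnumber: count}; a missing count starts at 0
    counts = collections.defaultdict(lambda: collections.defaultdict(int))
    for docnumber, tokens in enumerate(docs, start=1):
        for term in tokens:
            counts[term][docnumber] += 1
    # Convert the nested counters into the requested list-of-lists form,
    # ordered by document number
    return {
        term: [[doc, freq] for doc, freq in sorted(per_doc.items())]
        for term, per_doc in counts.items()
    }


index = build_index([["cat", "cat", "dog"], ["dog"], ["cat"]])
print(index["cat"])  # [[1, 2], [3, 1]]
```

Counting in a dict keyed by document number avoids searching the list for an existing `[docnumber, freq]` pair on every token; the conversion to lists happens once, after all documents are processed.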