  1. Python: eliminate duplicate nested lists
    <p>I am building a tagging model for natural language processing. Initially, each word of a sentence is tagged with its part of speech (like NN for noun); rules are then applied that group the tagged words into trees, which are represented as nested lists. This process iterates many times until only one node remains at the top level. I have a master list of all candidate trees, and I need to eliminate duplicate trees or the whole thing blows up in memory. Here is a small sample of what the master list contains. I need to make sure that each list in the master list is unique, because each iteration creates many branches.</p> <pre><code>[[('NP', [('PRP', 'I')]), ('VBD', 'ate'), ('DT', 'a'), ('NN', 'steak'), ('IN', 'with'), ('DT', 'a'), ('NN', 'knife'), ('.', '.')], [('PRP', 'I'), ('VP', [('VBD', 'ate')]), ('DT', 'a'), ('NN', 'steak'), ('IN', 'with'), ('DT', 'a'), ('NN', 'knife'), ('.', '.')], [('PRP', 'I'), ('VBD', 'ate'), ('NP', [('DT', 'a')]), ('NN', 'steak'), ('IN', 'with'), ('DT', 'a'), ('NN', 'knife'), ('.', '.')], ...] </code></pre> <p>I thought of using a set, but lists aren't hashable. I tried using recursion, and it ran out of memory. I also thought about converting each list to a string, using the string as a dictionary key and the list as the value, then iterating over the dictionary and turning it back into a list (or keeping it as a dictionary?). Does anyone have a less hackish solution? I'm relatively new to Python, so please explain how your solution works.</p> <p>I should clarify: the nested lists can be arbitrarily deep. The tree structure is not known ahead of time but is built on the fly. I am trying to build something like this: <a href="http://jos.oxfordjournals.org/content/25/4/345/F23.large.jpg" rel="nofollow">http://jos.oxfordjournals.org/content/25/4/345/F23.large.jpg</a>, but in the form of a nested list.</p>
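One possible approach, sketched under assumptions rather than offered as the definitive answer: since lists are not hashable but tuples are, each tree can be converted into a nested tuple and that tuple used as a set key. The helper names `freeze` and `unique_trees` below are hypothetical, introduced only for illustration. The sketch assumes every leaf is a hashable value such as a string, and it treats a list and a tuple with the same contents as equal, which seems harmless for trees shaped like the sample above. The recursion depth equals the depth of one tree, not the number of trees, so it should only be a problem for extremely deep trees.

```python
def freeze(node):
    """Return a hashable copy of node: every list or tuple becomes a tuple."""
    if isinstance(node, (list, tuple)):
        return tuple(freeze(child) for child in node)
    # Anything else (e.g. a tag or word string) is assumed to be hashable already.
    return node


def unique_trees(trees):
    """Return the trees in their original order, keeping the first copy of each."""
    seen = set()      # frozen (tuple) forms of the trees encountered so far
    unique = []
    for tree in trees:
        key = freeze(tree)
        if key not in seen:
            seen.add(key)
            unique.append(tree)   # keep the original nested-list form
    return unique


# Small illustration using shortened versions of the sample trees.
trees = [
    [('NP', [('PRP', 'I')]), ('VBD', 'ate')],
    [('PRP', 'I'), ('VP', [('VBD', 'ate')])],
    [('NP', [('PRP', 'I')]), ('VBD', 'ate')],   # duplicate of the first tree
]
deduped = unique_trees(trees)
print(len(deduped))   # 2
```

If memory is the main concern, it may be cheaper to keep only the set of frozen tuples and skip the separate list of originals, at the cost of working with tuples instead of lists in later iterations.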