Joining a set of ordered-integer yielding Python iterators
<p>Here is a seemingly simple problem: given a list of iterators that yield sequences of integers in ascending order, write a concise generator that yields only the integers that appear in every sequence.</p>
<p>After reading a few papers last night, I decided to hack up a completely minimal full text indexer in Python, <a href="http://code.google.com/p/ghetto-fts/" rel="noreferrer">as seen here</a> (though that version is quite old now).</p>
<p>My problem is with the <code>search()</code> function, which must iterate over each posting list and yield only the document IDs that appear on every list. As you can see from the link above, my current non-recursive 'working' attempt is terrible.</p>
<p><b>Example</b>:</p>
<pre><code>postings = [[1, 100, 142, 322, 12312], [2, 100, 101, 322, 1221], [100, 142, 322, 956, 1222]]</code></pre>
<p>Should yield:</p>
<pre><code>[100, 322]</code></pre>
<p>There is at least one elegant recursive function solution to this, but I'd like to avoid that if possible. However, a solution involving nested generator expressions, <code>itertools</code> abuse, or any other kind of code golf is more than welcome. :-)</p>
<p>It should be possible to arrange for the function to require only as many steps as there are items in the smallest list, without sucking the entire set of integers into memory. In future, these lists may be read from disk and may be larger than available RAM.</p>
<p>For the past 30 minutes I've had an idea on the tip of my tongue, but I can't quite get it into code. Remember, this is just for fun!</p>
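<p>For concreteness, here is one possible non-recursive sketch of the behaviour described above, not a definitive answer. The name <code>intersect_sorted</code> is hypothetical, and the sketch assumes every input yields <em>strictly</em> ascending integers (no duplicate IDs within a single posting list). It keeps only one pending value per iterator in memory and stops as soon as any iterator is exhausted.</p>

```python
def intersect_sorted(iterables):
    """Lazily yield the integers common to every input.

    Assumption: each input yields strictly ascending integers.
    Only one pending value ("head") per input is held in memory.
    """
    iterators = [iter(it) for it in iterables]
    if not iterators:
        return
    try:
        # Read the first value from each input.
        heads = [next(it) for it in iterators]
        while True:
            # No value smaller than the largest head can be in every input.
            target = max(heads)
            if all(head == target for head in heads):
                yield target
                # Move every input past the value just emitted.
                heads = [next(it) for it in iterators]
                continue
            # Advance each lagging input until it reaches or passes target.
            for i, it in enumerate(iterators):
                while heads[i] < target:
                    heads[i] = next(it)
    except StopIteration:
        # One input ran out, so no further common values can exist.
        return
```

<p>With the example above, <code>list(intersect_sorted(postings))</code> gives <code>[100, 322]</code>. Because the inputs are consumed through <code>next()</code> only, they can equally be generators reading posting lists from disk.</p>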