<p>This probably isn't the best way to solve your problem, but if you want to know how to make it work:</p>
<p>The problem here is that your inner <code>for line in f:</code> loop consumes the whole rest of the file, so when you get back to the outer loop, there's nothing left to read. (There's a second problem: When I run your code on your data, <code>len(aline)</code> is always <em>12</em>, not <em>11</em>. But that's a trivial fix.)</p>
<p>This isn't specific to files; it's how all iterators work in Python. There are two general ways to deal with this for any iterator, plus one file-specific solution.</p>
<p>First, there's <a href="http://docs.python.org/2/library/itertools.html#itertools.tee" rel="nofollow"><code>itertools.tee</code></a>. This takes an iterator and returns <em>two</em> iterators, each of which can be advanced independently. Under the covers, it has to buffer every item that one iterator has consumed and the other hasn't, which is why the documentation says this:</p>
<blockquote> <p>In general, if one iterator uses most or all of the data before another iterator starts, it is faster to use list() instead of tee().</p> </blockquote>
<p>And that's the other option: Read the whole iterator into a <code>list</code>, so you can loop over slices.</p>
<p>This is clearly one of those cases where one iterator uses most of the data while the other one's sitting around waiting. For example, the first time through the inner loop, you're reading lines 1-20000 before the outer loop reads line 1. So, a <code>list</code> is a better option here. So:</p>
<pre><code>f = open('t.log','r')
contents = list(f)
f.close()

for idx, line in enumerate(contents):
    aline = line.replace(',','').split()
    if len(aline)==11:
        # Compare only against the lines that come after this one.
        for line in contents[idx+1:]:
            bline = line.replace(',','').split()
            if len(bline)==11 and aline[2]==bline[2]:
                print 'a: ', aline
                print 'b: ', bline
</code></pre>
<p>Finally, if you have a fancy iterator that can be checkpointed and resumed in some way, you can checkpoint it right before the inner loop, then resume it right after. And fortunately, files happen to have such a thing: <a href="http://docs.python.org/2/library/stdtypes.html#file.tell" rel="nofollow"><code>tell</code></a> returns the current file position, and <a href="http://docs.python.org/2/library/stdtypes.html#file.seek" rel="nofollow"><code>seek</code></a> jumps to a specified position. (There's a big warning saying that "If the file is opened in text mode (without <code>'b'</code>), only offsets returned by <code>tell()</code> are legal." But that's fine; you're only using offsets returned by <code>tell</code> here.)</p>
<p>One caveat: <code>for line in f:</code> reads ahead into a hidden buffer, so <code>tell</code> called inside such a loop doesn't report the position of the next line. Reading with <code>readline</code> avoids the read-ahead, so the loops below use <code>iter(f.readline, '')</code> instead.</p>
<p>So:</p>
<pre><code>f = open('t.log','r')
# iter(f.readline, '') yields lines until readline() returns '' at end of file.
for line in iter(f.readline, ''):
    aline = line.replace(',','').split()
    if len(aline)==11:
        pos = f.tell()   # checkpoint: start of the line after the current one
        for line in iter(f.readline, ''):
            bline = line.replace(',','').split()
            if len(bline)==11 and aline[2]==bline[2]:
                print 'a: ', aline
                print 'b: ', bline
        f.seek(pos)      # resume the outer loop from the checkpoint
f.close()
</code></pre>
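<p>To see the difference between the approaches side by side, here is a minimal sketch. It is not your code: <code>SAMPLE</code>, the three function names, and the three-field line layout are made up for illustration, and an in-memory <code>io.StringIO</code> stands in for <code>t.log</code>. It is written so that it should run on Python 2 and 3 alike.</p>

```python
import io
import itertools

# Hypothetical stand-in for t.log: four short lines, third field is the key.
SAMPLE = u"a 1 x\nb 2 y\nc 3 x\nd 4 y\n"


def pairs_naive(f):
    """Nested loops over the same iterator: the inner loop drains it."""
    pairs = []
    for line in f:
        a = line.split()
        for later in f:  # consumes every remaining line
            b = later.split()
            if a[2] == b[2]:
                pairs.append((a[0], b[0]))
    return pairs


def pairs_list(f):
    """Read everything into a list, then loop over slices."""
    contents = list(f)
    pairs = []
    for idx, line in enumerate(contents):
        a = line.split()
        for later in contents[idx + 1:]:
            b = later.split()
            if a[2] == b[2]:
                pairs.append((a[0], b[0]))
    return pairs


def pairs_tee(f):
    """Fork the iterator with itertools.tee before each inner scan."""
    pairs = []
    rest = iter(f)
    while True:
        line = next(rest, None)
        if line is None:
            break
        # 'rest' keeps the outer loop's place; 'inner' scans ahead.
        # tee buffers whatever 'inner' has read and 'rest' has not.
        rest, inner = itertools.tee(rest)
        a = line.split()
        for later in inner:
            b = later.split()
            if a[2] == b[2]:
                pairs.append((a[0], b[0]))
    return pairs


if __name__ == "__main__":
    print(pairs_naive(io.StringIO(SAMPLE)))
    print(pairs_list(io.StringIO(SAMPLE)))
    print(pairs_tee(io.StringIO(SAMPLE)))
```

<p>On this sample, <code>pairs_naive</code> finds only the first pair, because the outer loop has nothing left to read after the first inner pass. The other two find both pairs. The <code>tee</code> version ends up buffering nearly the whole input on each pass, which is the situation the documentation quote above warns about.</p>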