<h1>How I solved my problem</h1>
<h2>A Perl port of <code>pickle.py</code></h2>
<p>Following J.F. Sebastian's comment about how simple the <code>pickle</code> format is, I set out to port parts of <code>pickle.py</code> to Perl. A couple of quick regular expressions would have been a faster way to access my data, but I felt that the hack value and the opportunity to learn more about Python would be worth it. Plus, I still feel much more comfortable using (and debugging code in) Perl than Python.</p>
<p>Most of the porting effort (simple types, tuples, lists, dictionaries) was very straightforward. Perl's and Python's different notions of classes and objects have been the only issue so far where a bit more than simple translation of idioms was needed. The result is a module called <code>Pickle::Parse</code> which, after a bit of polishing, will be published on CPAN.</p>
<p>A module called <code>Python::Serialise::Pickle</code> already existed on CPAN, but I found its parsing capabilities lacking: it spews debugging output all over the place and doesn't seem to support classes/objects.</p>
<h2>Parsing, transforming data, detecting actual errors in the stream</h2>
<p>Using <code>Pickle::Parse</code>, I tried to parse the <code>feeds.dat</code> file. After a few iterations of fixing trivial bugs in my parsing code, I got an error message that was strikingly similar to <code>pickle.py</code>'s original <em>object not callable</em> error message:</p>
<pre><code>Can't use string ("sxOYAAuyzSx0WqN3BVPjE+6pgPU") as a subroutine ref while "strict refs" in use at lib/Pickle/Parse.pm line 489, &lt;STDIN&gt; line 187102.
</code></pre>
<p>Ha! Now we're at a point where it's quite likely that the actual data stream is broken. Plus, we get an idea <em>where</em> it is broken.</p>
<p>It turned out that the first line of the following sequence was wrong:</p>
<pre><code>g7724
((I2009
I3
I19
I1
I19
I31
I3
I78
I0
t(dtRp62457
</code></pre>
<p>Position 7724 in the "memo" pointed to the string <code>"sxOYAAuyzSx0WqN3BVPjE+6pgPU"</code>. From similar records earlier in the stream, it was clear that a <code>time.struct_time</code> object was needed instead. All later records shared this wrong pointer. A simple search/replace operation was enough to fix this.</p>
<p>I find it ironic that I found the source of the error by accident, through Perl's feature of reporting its position in the input data stream when it dies.</p>
<h2>Conclusion</h2>
<ol>
<li>I will move away from <code>rss2email</code> as soon as I find time to automatically transform its pickled configuration/state mess to another tool's format.</li>
<li><code>pickle.py</code> needs more meaningful error messages that tell the user the position in the data stream (not the position in its own code) where things go wrong.</li>
<li>Porting parts of <code>pickle.py</code> to Perl was fun and, in the end, rewarding.</li>
</ol>
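<p>On the Python side, the standard library's <code>pickletools.genops</code> yields every opcode together with its byte offset, which gives roughly the kind of stream position asked for in point 2. The following is only a minimal sketch, assuming a stream that uses explicit memo indices (as protocol 0 does). The helper name <code>memo_refs</code> and the sample data are made up for illustration and do not come from <code>rss2email</code> or <code>Pickle::Parse</code>.</p>

```python
import pickle
import pickletools

# Opcodes that carry an explicit memo index as their argument.
# (Protocol 4's MEMOIZE has no index argument and is not covered here.)
PUT_OPS = {"PUT", "BINPUT", "LONG_BINPUT"}
GET_OPS = {"GET", "BINGET", "LONG_BINGET"}


def memo_refs(data, memo_index):
    """Return (byte_offset, opcode_name) for each PUT/GET touching memo_index.

    Hypothetical helper for illustration; `data` is the raw pickle as bytes.
    """
    refs = []
    for opcode, arg, pos in pickletools.genops(data):
        if opcode.name in PUT_OPS | GET_OPS and arg == memo_index:
            refs.append((pos, opcode.name))
    return refs


# Made-up sample: the inner list is pickled once, then referenced via the memo.
inner = [1, 2]
sample = pickle.dumps([inner, inner], protocol=0)

for pos, name in memo_refs(sample, 1):
    print(f"{name} of memo entry 1 at byte offset {pos}")
```

<p>Applied to the raw bytes of a file like <code>feeds.dat</code>, a call such as <code>memo_refs(data, 7724)</code> would list where that memo entry is written and where it is read back, without executing any of the pickled constructors.</p>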
 
