Iterating multiple (parent,child) nodes using Python ElementTree
<p>The standard implementation of ElementTree for Python (2.6) does not provide pointers from child nodes to their parents. Therefore, if parents are needed, the suggested approach is to loop over parents rather than children.</p>
<p>Consider XML of the form:</p>
<pre><code>&lt;Content&gt;
  &lt;Para&gt;first&lt;/Para&gt;
  &lt;Table&gt;&lt;Para&gt;second&lt;/Para&gt;&lt;/Table&gt;
  &lt;Para&gt;third&lt;/Para&gt;
&lt;/Content&gt;
</code></pre>
<p>The following finds all "Para" nodes without considering parents:</p>
<pre><code># (1)
paras = [p for p in page.getiterator("Para")]
</code></pre>
<p>This (adapted from effbot) records the parent by looping over the parents instead of the child nodes:</p>
<pre><code># (2)
paras = [(c, p) for p in page.getiterator() for c in p]
</code></pre>
<p>This makes perfect sense, and can be extended with a conditional to achieve (supposedly) the same result as (1), but with parent information added:</p>
<pre><code># (3)
paras = [(c, p) for p in page.getiterator() for c in p if c.tag == "Para"]
</code></pre>
<p>The <a href="http://docs.python.org/release/2.6.4/library/xml.etree.elementtree.html" rel="nofollow">ElementTree documentation</a> suggests that the getiterator() method does a depth-first search. Running it without looking for the parent, as in (1), yields:</p>
<pre><code>first
second
third
</code></pre>
<p>However, extracting the text from paras in (3) yields:</p>
<pre><code>first, Content&gt;Para
third, Content&gt;Para
second, Table&gt;Para
</code></pre>
<p>This appears to be breadth-first.</p>
<p>This raises two questions:</p>
<ol>
<li>Is this correct and expected behaviour?</li>
<li>How do you extract (parent, child) tuples when the child must be of a certain type but the parent can be anything, <em>if document order must be maintained</em>? I do not think that running two loops and mapping the (parent, child) pairs generated by (3) onto the order generated by (1) is ideal.</li>
</ol>
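<p>To make the ordering difference concrete, here is a minimal sketch that reproduces both orders on the sample XML and shows one possible way to pair each "Para" with its parent in document order. It assumes a Python version that provides <code>Element.iter()</code> (the later replacement for <code>getiterator()</code>); the names <code>parent_child_pairs</code>, <code>parent_of</code>, <code>by_parent_loop</code> and <code>in_document_order</code> are hypothetical and exist only for illustration.</p>

```python
import xml.etree.ElementTree as ET

XML = """<Content>
<Para>first</Para>
<Table><Para>second</Para></Table>
<Para>third</Para>
</Content>"""


def parent_child_pairs(root, tag):
    """Return (child, parent) tuples for each `tag` element, in document order."""
    # One pass over the tree to remember the parent of every element.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    # root.iter(tag) visits matching elements depth-first (document order).
    # The root itself has no parent, so it is skipped if it matches the tag.
    return [(child, parent_of[child]) for child in root.iter(tag) if child in parent_of]


page = ET.fromstring(XML)

# Approach (3): parents are visited depth-first, but all direct children of a
# parent are emitted together, so "third" appears before "second".
by_parent_loop = [(c.text, p.tag) for p in page.iter() for c in p if c.tag == "Para"]

# Parent-map approach: children are visited in document order.
in_document_order = [(c.text, p.tag) for c, p in parent_child_pairs(page, "Para")]

print(by_parent_loop)
print(in_document_order)
```

<p>The sketch trades one extra pass over the tree (to build the parent map) for a single ordered pass over the children, which avoids reconciling the results of (1) and (3) afterwards.</p>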