How to extract from nested dictionaries using list comprehension
<p>I'm trying to extract some data from XML. I'm using <a href="https://github.com/martinblech/xmltodict" rel="nofollow">xmltodict</a> to load the data into a dictionary, then using list comprehensions to pull out individual parts into separate lists. I will later be plotting these using matplotlib.</p>

<p>XML:</p>

<pre><code>&lt;?xml version="1.0" ?&gt;
&lt;MYDATA&gt;
  &lt;SESSION ID="1234"&gt;
    &lt;INFO&gt;
      &lt;BEGIN LOAD="23"/&gt;
    &lt;/INFO&gt;
    &lt;TRANSACTION ID="2103645570"&gt;
      &lt;ANSWER&gt;Hello&lt;/ANSWER&gt;
    &lt;/TRANSACTION&gt;
    &lt;TRANSACTION ID="4315547431"&gt;
      &lt;ANSWER&gt;This is an answer&lt;/ANSWER&gt;
    &lt;/TRANSACTION&gt;
  &lt;/SESSION&gt;
  &lt;SESSION ID="5678"&gt;
    &lt;INFO&gt;
      &lt;BEGIN LOAD="28"/&gt;
    &lt;/INFO&gt;
    &lt;TRANSACTION ID="4099381642"&gt;
      &lt;ANSWER&gt;Hello&lt;/ANSWER&gt;
    &lt;/TRANSACTION&gt;
    &lt;TRANSACTION ID="1220404184"&gt;
      &lt;ANSWER&gt;A Different answer&lt;/ANSWER&gt;
    &lt;/TRANSACTION&gt;
    &lt;TRANSACTION ID="201506542"&gt;
      &lt;ANSWER&gt;Yet another one&lt;/ANSWER&gt;
    &lt;/TRANSACTION&gt;
  &lt;/SESSION&gt;
&lt;/MYDATA&gt;
</code></pre>

<p>My code:</p>

<pre><code>from collections import OrderedDict

# doc contains the xml exactly as loaded by xmltodict
doc = OrderedDict([(u'MYDATA', OrderedDict([(u'SESSION', [OrderedDict([(u'@ID', u'1234'), (u'INFO', OrderedDict([(u'BEGIN', OrderedDict([(u'@LOAD', u'23')]))])), (u'TRANSACTION', [OrderedDict([(u'@ID', u'2103645570'), (u'ANSWER', u'Hello')]), OrderedDict([(u'@ID', u'4315547431'), (u'ANSWER', u'This is an answer')])])]), OrderedDict([(u'@ID', u'5678'), (u'INFO', OrderedDict([(u'BEGIN', OrderedDict([(u'@LOAD', u'28')]))])), (u'TRANSACTION', [OrderedDict([(u'@ID', u'4099381642'), (u'ANSWER', u'Hello')]), OrderedDict([(u'@ID', u'1220404184'), (u'ANSWER', u'A Different answer')]), OrderedDict([(u'@ID', u'201506542'), (u'ANSWER', u'Yet another one')])])])])]))])

sess_ids = [i['@ID'] for i in doc['MYDATA']['SESSION']]
print sess_ids

sess_loads = [i['INFO']['BEGIN']['@LOAD'] for i in doc['MYDATA']['SESSION']]
print sess_loads

trans_ids = [[j['@ID'] for j in i['TRANSACTION']] for i in doc['MYDATA']['SESSION']]
print trans_ids
</code></pre>

<p>Output:</p>

<pre><code>sess_ids: [u'1234', u'5678']
sess_loads: [u'23', u'28']
trans_ids: [[u'2103645570', u'4315547431'], [u'4099381642', u'1220404184', u'201506542']]
</code></pre>

<p>You can see that I'm able to access the ID attributes from the SESSION elements and also the LOAD attributes from the BEGIN elements.</p>

<p>I need to get the ID attributes from the TRANSACTION elements as a single list. Currently I'm getting a list of lists in variable <code>trans_ids</code>.</p>

<p>How can I get just a flat list of the values?</p>

<p>I have tried:</p>

<pre><code>[j['@ID'] for j in i['TRANSACTION'] for i in doc['MYDATA']['SESSION']]
</code></pre>

<p>but that just repeats the second session twice, giving:</p>

<pre><code>[u'4099381642', u'4099381642', u'1220404184', u'1220404184', u'201506542', u'201506542']
</code></pre>
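One likely explanation for the repeated values: in a comprehension with two `for` clauses, the leftmost clause is the outer loop, exactly as if the loops were written as nested `for` statements. In the attempt above, `i['TRANSACTION']` is therefore evaluated before that comprehension binds `i`, so it picks up the `i` left over from the earlier comprehensions (Python 2 leaks the loop variable, and it still points at the last session; Python 3 would raise a `NameError` instead). Swapping the two clauses should give the flat list. The sketch below assumes a trimmed-down stand-in for the xmltodict output, using plain dicts and only the keys needed here; the names `s`, `t`, and `trans_ids_chain` are illustrative.

```python
from itertools import chain

# Trimmed-down stand-in for the xmltodict result (assumption: plain dicts
# instead of OrderedDict, and only the keys this example needs).
doc = {
    'MYDATA': {
        'SESSION': [
            {'@ID': '1234',
             'TRANSACTION': [{'@ID': '2103645570'},
                             {'@ID': '4315547431'}]},
            {'@ID': '5678',
             'TRANSACTION': [{'@ID': '4099381642'},
                             {'@ID': '1220404184'},
                             {'@ID': '201506542'}]},
        ]
    }
}

# Outer loop (sessions) first, inner loop (transactions) second:
# the clauses read in the same order as nested for statements.
trans_ids = [t['@ID']
             for s in doc['MYDATA']['SESSION']
             for t in s['TRANSACTION']]
print(trans_ids)

# Alternative: keep the nested list-of-lists shape and flatten it afterwards.
trans_ids_chain = list(chain.from_iterable(
    [t['@ID'] for t in s['TRANSACTION']]
    for s in doc['MYDATA']['SESSION']))
print(trans_ids_chain)
```

Both forms produce the transaction IDs in document order. One caveat worth checking against real data: xmltodict returns a single dict rather than a list when an element has only one child of a given name, so a session with exactly one TRANSACTION would break the inner loop unless the parse is told to always produce lists (the `force_list` argument of `xmltodict.parse` is meant for this).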