Parse xml with lxml - extract element value
<p>Suppose we have an XML file with the following structure.</p>
<pre><code>&lt;?xml version="1.0" ?&gt;
&lt;searchRetrieveResponse xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.loc.gov/zing/srw/ http://www.loc.gov/standards/sru/sru1-1archive/xml-files/srw-types.xsd" xmlns="http://www.loc.gov/zing/srw/"&gt;
  &lt;records xmlns:ns1="http://www.loc.gov/zing/srw/"&gt;
    &lt;record&gt;
      &lt;recordData&gt;
        &lt;record xmlns=""&gt;
          &lt;datafield tag="000"&gt;
            &lt;subfield code="a"&gt;123&lt;/subfield&gt;
            &lt;subfield code="b"&gt;456&lt;/subfield&gt;
          &lt;/datafield&gt;
          &lt;datafield tag="001"&gt;
            &lt;subfield code="a"&gt;789&lt;/subfield&gt;
            &lt;subfield code="b"&gt;987&lt;/subfield&gt;
          &lt;/datafield&gt;
        &lt;/record&gt;
      &lt;/recordData&gt;
    &lt;/record&gt;
    &lt;record&gt;
      &lt;recordData&gt;
        &lt;record xmlns=""&gt;
          &lt;datafield tag="000"&gt;
            &lt;subfield code="a"&gt;123&lt;/subfield&gt;
            &lt;subfield code="b"&gt;456&lt;/subfield&gt;
          &lt;/datafield&gt;
          &lt;datafield tag="001"&gt;
            &lt;subfield code="a"&gt;789&lt;/subfield&gt;
            &lt;subfield code="b"&gt;987&lt;/subfield&gt;
          &lt;/datafield&gt;
        &lt;/record&gt;
      &lt;/recordData&gt;
    &lt;/record&gt;
  &lt;/records&gt;
&lt;/searchRetrieveResponse&gt;
</code></pre>
<p>I need to parse out:</p>
<ul>
<li>The content of each "subfield" element (e.g. 123 in the example above), and</li>
<li>The attribute values (e.g. 000 or 001)</li>
</ul>
<p>I wonder how to do that using lxml and XPath. My initial code is pasted below; could someone explain to me how to parse out the values?</p>
<pre><code>import urllib, urllib2
from lxml import etree

url = "https://dl.dropbox.com/u/540963/short_test.xml"
fp = urllib2.urlopen(url)
doc = etree.parse(fp)
fp.close()

ns = {'xsi':'http://www.loc.gov/zing/srw/'}

for record in doc.xpath('//xsi:record', namespaces=ns):
    print record.xpath("xsi:recordData/record/datafield[@tag='000']", namespaces=ns)
</code></pre>
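One possible way to pull out both the attribute values and the subfield text is sketched below in Python 3. It is a sketch under stated assumptions, not a definitive answer. The `SAMPLE` document is a trimmed copy of the XML above (one outer record, with the `xsi` attributes omitted). The prefix `srw` and the helper name `extract_fields` are made up for illustration. The key observation is that the outer elements live in the `http://www.loc.gov/zing/srw/` namespace, while the inner `record` resets it with `xmlns=""`, so `record`, `datafield` and `subfield` are matched without a prefix. The sketch uses `findall`, which lxml and the standard library's `xml.etree.ElementTree` both provide, and falls back to the standard library if lxml is not installed. With lxml, the same path can also be passed to `.xpath(..., namespaces=NS)`.

```python
try:
    from lxml import etree
except ImportError:
    # Assumption for this sketch: fall back to the stdlib parser,
    # which supports the same findall()/get()/text calls used here.
    import xml.etree.ElementTree as etree

# Trimmed copy of the sample document (bytes, so the XML declaration is accepted).
SAMPLE = b"""<?xml version="1.0" ?>
<searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/">
  <records>
    <record>
      <recordData>
        <record xmlns="">
          <datafield tag="000">
            <subfield code="a">123</subfield>
            <subfield code="b">456</subfield>
          </datafield>
          <datafield tag="001">
            <subfield code="a">789</subfield>
            <subfield code="b">987</subfield>
          </datafield>
        </record>
      </recordData>
    </record>
  </records>
</searchRetrieveResponse>"""

# "srw" is an arbitrary prefix chosen for this example; only the URI matters.
NS = {"srw": "http://www.loc.gov/zing/srw/"}


def extract_fields(xml_bytes):
    """Return (tag, code, text) tuples for every subfield in the document."""
    root = etree.fromstring(xml_bytes)
    # Outer elements are namespaced; the inner record/datafield are not.
    path = "srw:records/srw:record/srw:recordData/record/datafield"
    rows = []
    for datafield in root.findall(path, namespaces=NS):
        tag = datafield.get("tag")            # attribute value, e.g. "000"
        for subfield in datafield.findall("subfield"):
            rows.append((tag, subfield.get("code"), subfield.text))
    return rows


if __name__ == "__main__":
    for row in extract_fields(SAMPLE):
        print(row)
```

Element attributes are read with `.get("name")` and element content with `.text`. Printing the result of an element query directly, as in the original loop, shows the element objects rather than their values.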