<p>Take a look at these two versions:</p>
<pre><code>import lxml.html, lxml.etree

url_cooking = 'http://cooking.stackexchange.com/feeds'

# lxml.etree version
data = lxml.etree.parse(url_cooking)
summary_nodes = data.xpath('.//feed/entry/summary')
print('Found ' + str(len(summary_nodes)) + ' summary nodes')

# lxml.html version
data = lxml.html.parse(url_cooking)
summary_nodes = data.xpath('.//feed/entry/summary')
print('Found ' + str(len(summary_nodes)) + ' summary nodes')
</code></pre>
<p>As you discovered, the first (<code>lxml.etree</code>) version returns no nodes, but the <code>lxml.html</code> version works fine. The <code>etree</code> version fails because the elements in the feed belong to a namespace, so the XPath expression must name that namespace. The <code>html</code> version works because the HTML parser ignores namespaces. Part way down <a href="http://lxml.de/lxmlhtml.html">http://lxml.de/lxmlhtml.html</a>, it says "The HTML parser notably ignores namespaces and some other XMLisms."</p>
<p>Note that when you print the root node of the etree version (<code>print(data.getroot())</code>), you get something like <code>&lt;Element {http://www.w3.org/2005/Atom}feed at 0x22d1620&gt;</code>. That means it is a <code>feed</code> element in the namespace <code>http://www.w3.org/2005/Atom</code>. Here is a corrected version of the etree code, which binds that namespace to a prefix and uses the prefix in every step of the XPath expression:</p>
<pre><code>import lxml.etree

url_cooking = 'http://cooking.stackexchange.com/feeds'

# Map an arbitrary prefix ('ns') to the Atom namespace URI
ns = 'http://www.w3.org/2005/Atom'
ns_map = {'ns': ns}

data = lxml.etree.parse(url_cooking)
summary_nodes = data.xpath('//ns:feed/ns:entry/ns:summary', namespaces=ns_map)
print('Found ' + str(len(summary_nodes)) + ' summary nodes')
</code></pre>
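<p>To see the namespace effect without fetching the live feed, here is a minimal offline sketch. It makes two assumptions that are not part of the code above: the two-entry Atom document is a made-up sample, and it uses the standard library's <code>xml.etree.ElementTree</code> (with <code>findall</code>) in place of <code>lxml.etree</code> (with <code>xpath</code>) so that it runs with no extra install. The prefix-to-URI mapping idea is the same in both libraries.</p>

```python
import xml.etree.ElementTree as ET

# Hypothetical two-entry Atom feed, standing in for the real feed URL.
ATOM_SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Sample feed</title>
  <entry><title>First</title><summary>Summary one</summary></entry>
  <entry><title>Second</title><summary>Summary two</summary></entry>
</feed>"""

# Arbitrary prefix 'ns' bound to the Atom namespace URI.
ATOM_NS = {'ns': 'http://www.w3.org/2005/Atom'}


def count_summaries(xml_text):
    """Return (matches without namespace, matches with namespace)."""
    root = ET.fromstring(xml_text)
    # Unqualified names look for elements in *no* namespace, so nothing matches.
    plain = root.findall('entry/summary')
    # Prefixed names are resolved through the mapping and match the Atom elements.
    qualified = root.findall('ns:entry/ns:summary', ATOM_NS)
    return len(plain), len(qualified)


root_tag = ET.fromstring(ATOM_SAMPLE).tag
plain_count, qualified_count = count_summaries(ATOM_SAMPLE)
print('Root tag:', root_tag)
print('Without namespace map:', plain_count)
print('With namespace map:', qualified_count)
```

<p>The root tag prints in the same <code>{namespace}name</code> form as the <code>lxml</code> root element shown earlier, which is a quick way to check whether a document needs a namespace map at all.</p>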