Python CGI script (using XML &amp; minidom) returns unexpected results
<p>I am trying to parse the XML returned by search engine APIs (Bing, Yahoo &amp; Blekko). The returned XML (for sample search query 'sushi') from Blekko takes the form:</p>
<pre><code>&lt;rss version="2.0"&gt;
  &lt;channel&gt;
    &lt;title&gt;blekko | rss for &amp;quot;sushi/rss /ps=100&amp;quot;&lt;/title&gt;
    &lt;link&gt;http://blekko.com/?q=sushi%2Frss+%2Fps%3D100&lt;/link&gt;
    &lt;description&gt;Blekko search for &amp;quot;sushi/rss /ps=100&amp;quot;&lt;/description&gt;
    &lt;language&gt;en-us&lt;/language&gt;
    &lt;copyright&gt;Copyright 2011 Blekko, Inc.&lt;/copyright&gt;
    &lt;docs&gt;http://cyber.law.harvard.edu/rss/rss.html&lt;/docs&gt;
    &lt;webMaster&gt;webmaster@blekko.com&lt;/webMaster&gt;
    &lt;rescount&gt;3M&lt;/rescount&gt;
    &lt;item&gt;
      &lt;title&gt;Sushi - Wikipedia&lt;/title&gt;
      &lt;link&gt;http://en.wikipedia.org/wiki/Sushi&lt;/link&gt;
      &lt;guid&gt;http://en.wikipedia.org/wiki/Sushi&lt;/guid&gt;
      &lt;description&gt;Article about sushi, a food made of vinegared rice combined with various toppings or fillings. 
Sushi ( &amp;#x3059;&amp;#x3057;&amp;#x3001;&amp;#x5bff;&amp;#x53f8;, &amp;#x9ba8;, &amp;#x9b93;, &amp;#x5bff;&amp;#x6597;, &amp;#x5bff;&amp;#x3057;, &amp;#x58fd;&amp;#x53f8;.&lt;/description&gt;
    &lt;/item&gt;
  &lt;/channel&gt;
&lt;/rss&gt;
</code></pre>
<p>The section of Python code to extract the required search result data is:</p>
<pre><code>for counter in range(100):
    try:
        for item in BlekkoSearchResultsXML.getElementsByTagName('item'):
            Blekko_PageTitle = item.getElementsByTagName('title')[counter].toxml(encoding="utf-8")
            Blekko_PageDesc = item.getElementsByTagName('description')[counter].toxml(encoding="utf-8")
            Blekko_DisplayURL = item.getElementsByTagName('guid')[counter].toxml(encoding="utf-8")
            Blekko_URL = item.getElementsByTagName('link')[counter].toxml(encoding="utf-8")
            print "&lt;h2&gt;" + Blekko_PageTitle + "&lt;/h2&gt;&lt;br /&gt;"
            print Blekko_PageDesc + "&lt;br /&gt;"
            print Blekko_DisplayURL + "&lt;br /&gt;"
            print Blekko_URL + "&lt;br /&gt;"
    except IndexError:
        break
</code></pre>
<p>The code will not extract the Page Title of each search result returned, but does extract the rest of the info.</p>
<p>Furthermore, if I do not have the code:</p>
<pre><code>print "&lt;title&gt;Page title to appear on browser tab&lt;/title&gt;"
</code></pre>
<p>somewhere in the script, the title from the first search result is taken as the page title (i.e. the page appears with the title 'Sushi - Wikipedia' in the browser). If I do have a page title, the code still does not extract the page title from the search result.</p>
<p>The same code (with different tag names etc.) has the same problem with the Yahoo search API, but works fine with the Bing search API.</p>
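<p>For illustration, here is a minimal sketch of one way to print the text of each result rather than its serialised markup. <code>toxml()</code> returns the whole element including its tags, so the output above would contain a literal <code>&lt;title&gt;</code> element; a plausible (but unconfirmed) reading of the symptoms is that the browser treats it as the document title instead of displaying it. The sketch is written for Python 3, whereas the question uses Python 2 <code>print</code> statements, and the names <code>SAMPLE</code>, <code>child_text</code> and <code>render_items</code> are hypothetical helpers that do not appear in the original script.</p>

```python
from html import escape
from xml.dom import minidom

# Trimmed-down stand-in for the Blekko response shown in the question.
SAMPLE = """<rss version="2.0">
  <channel>
    <title>blekko | rss for sushi</title>
    <item>
      <title>Sushi - Wikipedia</title>
      <link>http://en.wikipedia.org/wiki/Sushi</link>
      <guid>http://en.wikipedia.org/wiki/Sushi</guid>
      <description>Article about sushi.</description>
    </item>
  </channel>
</rss>"""


def child_text(element, tag):
    """Return the text inside the first <tag> descendant, or '' if absent."""
    matches = element.getElementsByTagName(tag)
    if not matches:
        return ""
    return "".join(
        node.data
        for node in matches[0].childNodes
        if node.nodeType in (node.TEXT_NODE, node.CDATA_SECTION_NODE)
    )


def render_items(xml_text):
    """Build one list of HTML lines covering every <item> in the feed."""
    doc = minidom.parseString(xml_text)
    lines = []
    for item in doc.getElementsByTagName("item"):
        # Emit the text only, escaped, so no stray <title> tag reaches the page.
        lines.append("<h2>" + escape(child_text(item, "title")) + "</h2><br />")
        for tag in ("description", "guid", "link"):
            lines.append(escape(child_text(item, tag)) + "<br />")
    return lines


if __name__ == "__main__":
    print("\n".join(render_items(SAMPLE)))
```

<p>Iterating over the <code>item</code> elements directly also removes the need for the outer <code>counter</code> loop and the <code>IndexError</code> handler, since each item is visited exactly once.</p>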