  1. Python Universal Feed Parser crashing on Unicode error
    <p>I am using OS X 10.6 and Python 2.7.1 with BeautifulSoup 3.0 and feedparser 5.01. I am trying to parse the New York Times RSS feed, which validates, and which BeautifulSoup on its own will parse happily.</p>
    <p>The minimum code to produce the error is:</p>
<pre><code>import feedparser
from BeautifulSoup import BeautifulSoup

feed = feedparser.parse("http://www.nytimes.com/services/xml/rss/nyt/GlobalHome.xml")
</code></pre>
    <ul>
    <li>It fails whether I pass the URL directly or fetch the contents with urllib2.urlopen.</li>
    <li>I have also tried the character set detector.</li>
    </ul>
    <p>The error block is:</p>
<pre><code>/Users/user/Source/python/feed/BeautifulSoup.py:1553: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  elif data[:3] == '\xef\xbb\xbf':
/Users/user/Source/python/feed/BeautifulSoup.py:1556: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  elif data[:4] == '\x00\x00\xfe\xff':
/Users/user/Source/python/feed/BeautifulSoup.py:1559: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  elif data[:4] == '\xff\xfe\x00\x00':
Traceback (most recent call last):
  File "parse.py", line 5, in &lt;module&gt;
    feed = feedparser.parse("http://www.nytimes.com/services/xml/rss/nyt/GlobalHome.xml")
  File "/Users/user/Source/python/feed/feedparser.py", line 3822, in parse
    feedparser.feed(data.decode('utf-8', 'replace'))
  File "/Users/user/Source/python/feed/feedparser.py", line 1851, in feed
    sgmllib.SGMLParser.feed(self, data)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/sgmllib.py", line 104, in feed
    self.goahead(0)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/sgmllib.py", line 143, in goahead
    k = self.parse_endtag(i)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/sgmllib.py", line 320, in parse_endtag
    self.finish_endtag(tag)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/sgmllib.py", line 360, in finish_endtag
    self.unknown_endtag(tag)
  File "/Users/user/Source/python/feed/feedparser.py", line 657, in unknown_endtag
    method()
  File "/Users/user/Source/python/feed/feedparser.py", line 1545, in _end_description
    value = self.popContent('description')
  File "/Users/user/Source/python/feed/feedparser.py", line 961, in popContent
    value = self.pop(tag)
  File "/Users/user/Source/python/feed/feedparser.py", line 868, in pop
    mfresults = _parseMicroformats(output, self.baseuri, self.encoding)
  File "/Users/user/Source/python/feed/feedparser.py", line 2420, in _parseMicroformats
    p = _MicroformatsParser(htmlSource, baseURI, encoding)
  File "/Users/user/Source/python/feed/feedparser.py", line 2024, in __init__
    self.document = BeautifulSoup.BeautifulSoup(data)
  File "/Users/user/Source/python/feed/BeautifulSoup.py", line 1228, in __init__
    BeautifulStoneSoup.__init__(self, *args, **kwargs)
  File "/Users/user/Source/python/feed/BeautifulSoup.py", line 892, in __init__
    self._feed()
  File "/Users/user/Source/python/feed/BeautifulSoup.py", line 917, in _feed
    SGMLParser.feed(self, markup)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/sgmllib.py", line 103, in feed
    self.rawdata = self.rawdata + data
TypeError: cannot concatenate 'str' and 'NoneType' objects
</code></pre>
    <p>I would appreciate any advice very much.</p>
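
For reference, the three UnicodeWarning lines in the error block come from comparisons of the first bytes of the input against the UTF-8, UTF-32 big-endian and UTF-32 little-endian byte-order marks. Python 2 emits that warning when a unicode object is compared with a non-ASCII byte string, which suggests the data reaching BeautifulSoup had already been decoded. The sketch below is not BeautifulSoup's code; `sniff_bom` is a hypothetical helper that only illustrates the same check when both sides are bytes.

```python
import codecs

# Byte-order marks matching the three prefixes quoted in the warnings.
_BOMS = (
    (codecs.BOM_UTF8, "utf-8"),          # '\xef\xbb\xbf'
    (codecs.BOM_UTF32_BE, "utf-32-be"),  # '\x00\x00\xfe\xff'
    (codecs.BOM_UTF32_LE, "utf-32-le"),  # '\xff\xfe\x00\x00'
)


def sniff_bom(data):
    """Return (encoding, payload) for raw bytes.

    Hypothetical helper: encoding is None when no known BOM is present,
    and payload is the input with any detected BOM removed.
    """
    if not isinstance(data, bytes):
        # Comparing decoded text against byte BOMs is the mismatch that
        # the UnicodeWarning lines point at, so reject it explicitly.
        raise TypeError("sniff_bom expects bytes, got %s" % type(data).__name__)
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            return encoding, data[len(bom):]
    return None, data
```

Calling `sniff_bom` on the raw response body before any decoding keeps the comparison between like types; passing already-decoded text raises `TypeError` instead of silently comparing unequal.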