UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 21: invalid start byte
<p>I am using BeautifulSoup to get an article off <a href="http://www.reuters.com/article/2012/04/01/net-us-foxconn-idUSBRE83004E20120401" rel="nofollow">http://www.reuters.com/article/2012/04/01/net-us-foxconn-idUSBRE83004E20120401</a></p>
<pre><code>import urllib2
from bs4 import BeautifulSoup

url = "http://www.reuters.com/article/2012/10/19/us-yahoo-korea-idUSBRE89I0EY20121019"
source = urllib2.urlopen(url)
soup = BeautifulSoup(source)
</code></pre>
<p>but I receive the error UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 107: invalid start byte</p>
<p>I have tried to use soup.encode(['windows-1252','ascii','iso-8859']), but the soup cannot even be created in the first place.</p>
<p>Does anyone have any tips to share?</p>
<p>The error traceback, if it helps:</p>
<pre><code>Traceback (most recent call last):
  File "&lt;pyshell#17&gt;", line 1, in &lt;module&gt;
    parseReuters()
  File "C:\Users\name\Desktop\test.py", line 39, in parseReuters
    soup = BeautifulSoup(source)
  File "C:\Python27\lib\site-packages\bs4\__init__.py", line 172, in __init__
    self._feed()
  File "C:\Python27\lib\site-packages\bs4\__init__.py", line 185, in _feed
    self.builder.feed(self.markup)
  File "C:\Python27\lib\site-packages\bs4\builder\_lxml.py", line 195, in feed
    self.parser.close()
  File "parser.pxi", line 1209, in lxml.etree._FeedParser.close (src\lxml\lxml.etree.c:90597)
  File "parsertarget.pxi", line 142, in lxml.etree._TargetParserContext._handleParseResult (src\lxml\lxml.etree.c:99984)
  File "parsertarget.pxi", line 130, in lxml.etree._TargetParserContext._handleParseResult (src\lxml\lxml.etree.c:99807)
  File "lxml.etree.pyx", line 294, in lxml.etree._ExceptionContext._raise_if_stored (src\lxml\lxml.etree.c:9383)
  File "saxparser.pxi", line 259, in lxml.etree._handleSaxData (src\lxml\lxml.etree.c:95945)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 0: invalid start byte
</code></pre>
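For context on the error message itself: 0x80 is a continuation byte in UTF-8 and can never start a character, whereas in windows-1252 it is the euro sign. The sketch below only illustrates that difference on a made-up byte string. It does not establish what the Reuters page actually contained. The helper name `decode_with_fallback` and the choice of fallback encodings are assumptions for illustration, and the code is written for Python 3 rather than the Python 2.7 setup shown in the traceback.

```python
def decode_with_fallback(raw, encodings=("utf-8", "windows-1252")):
    """Decode raw bytes with the first encoding that accepts them.

    Hypothetical helper: returns a (text, encoding_name) pair.
    """
    for name in encodings:
        try:
            return raw.decode(name), name
        except UnicodeDecodeError:
            continue  # this encoding rejects the bytes, try the next one
    # Last resort: decode with the first encoding and mark bad bytes as U+FFFD.
    return raw.decode(encodings[0], errors="replace"), encodings[0]


# Made-up sample: 0x80 is invalid as a UTF-8 start byte,
# but maps to the euro sign in windows-1252.
sample = b"price: \x80 5"
text, used = decode_with_fallback(sample)
print(used, repr(text))
```

Decoding the bytes to text before handing them to a parser is one way to keep the decoding step under your own control, assuming you have a plausible list of candidate encodings for the page.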