  1. PO<script> tag and HTMLParseError
<p>I'm trying to parse HTML with BeautifulSoup and got a strange error. Here's the minimal code that reproduces the problem (Windows 7 32-bit, ActivePython 2.7).</p>
<pre><code>from bs4 import BeautifulSoup

s = """
&lt;html&gt;
&lt;script&gt;
var pstr = "&lt;li&gt;&lt;font color='blue'&gt;1&lt;/font&gt;&lt;/li&gt;";
for(var lc=0;lc&lt;o.length;lc++){}
&lt;/script&gt;
&lt;/html&gt;
"""

p = BeautifulSoup(s)
</code></pre>
<p>Traceback:</p>
<pre><code>Traceback (most recent call last):
  File "&lt;pyshell#69&gt;", line 1, in &lt;module&gt;
    p = BeautifulSoup(s)
  File "C:\Python27\lib\site-packages\bs4\__init__.py", line 168, in __init__
    self._feed()
  File "C:\Python27\lib\site-packages\bs4\__init__.py", line 181, in _feed
    self.builder.feed(self.markup)
  File "C:\Python27\lib\site-packages\bs4\builder\_htmlparser.py", line 56, in feed
    super(HTMLParserTreeBuilder, self).feed(markup)
  File "C:\Python27\lib\HTMLParser.py", line 108, in feed
    self.goahead(0)
  File "C:\Python27\lib\HTMLParser.py", line 148, in goahead
    k = self.parse_starttag(i)
  File "C:\Python27\lib\HTMLParser.py", line 229, in parse_starttag
    endpos = self.check_for_whole_start_tag(i)
  File "C:\Python27\lib\HTMLParser.py", line 304, in check_for_whole_start_tag
    self.error("malformed start tag")
  File "C:\Python27\lib\HTMLParser.py", line 115, in error
    raise HTMLParseError(message, self.getpos())
HTMLParseError: malformed start tag, at line 5, column 25
</code></pre>
<p>If I remove the line starting with 'var pstr = ...', the parse works perfectly. Is there a way to parse HTML like this correctly?</p>
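The traceback shows the failure is raised inside the standard library's HTMLParser, which bs4 is using as its tree builder here. As a rough check, not a fix for the setup above, the sketch below feeds the same markup to `html.parser.HTMLParser` on a current Python 3, where the body of a script element is handed to `handle_data` as raw text. The `TagCollector` class and its attributes are names made up for this illustration. It uses only the standard library, so it runs without bs4 installed.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Hypothetical helper: records start tags and raw text inside <script>."""

    def __init__(self):
        super().__init__()
        self.start_tags = []
        self.script_text = []
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)
        if tag == "script":
            self._in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Script content may arrive in several chunks, so collect them all.
        if self._in_script:
            self.script_text.append(data)


# Same markup as in the question.
markup = """
<html>
<script>
var pstr = "<li><font color='blue'>1</font></li>";
for(var lc=0;lc<o.length;lc++){}
</script>
</html>
"""

collector = TagCollector()
collector.feed(markup)
collector.close()

script_source = "".join(collector.script_text)
print(collector.start_tags)
print(script_source)
```

If the script body comes through intact and only `html` and `script` are reported as start tags, the markup itself is parseable and the error is specific to the parser version in use. Beautiful Soup also accepts a parser name as its second argument, for example `BeautifulSoup(s, "html5lib")` or `BeautifulSoup(s, "lxml")`, provided that parser is installed; whether that avoids the error in the environment above is an assumption that would need checking.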