<p>You call <code>self.feed()</code>, which in turn calls <code>handle_data()</code> (judging by the traceback), before <code>self.intitle = ""</code> has run, so the attribute does not exist yet when the handler reads it.<br> Fix: assign the attributes first and call <code>feed()</code> last:</p>
<pre><code>    self.url = url
    self.data = urllib.urlopen(url).read()  # Perhaps there should be a decode() here?
    self.intitle = False
    self.mytitle = ""
    self.feed(self.data)
</code></pre>
<hr>
<p>Debugging is always the most important part. Run this code and see what it prints.</p>
<pre><code>from HTMLParser import HTMLParser
import urllib, sys

class MyHTMLParser(HTMLParser):

    def __init__(self, url):
        HTMLParser.__init__(self)
        self.url = url
        self.data = urllib.urlopen(url).read()
        # Set up all state before feed(), because feed() triggers the handlers
        self.in_title = False
        self.title = ''
        self.feed(self.data)

    def handle_starttag(self, tag, attrs):
        if tag == 'body':
            # Stop at the body: the output is much easier to look at
            sys.exit('Found &lt;body&gt;, quitting')
        self.in_title = (tag == 'title')
        print 'Handled start of', tag, ' in_title is', self.in_title

    def handle_endtag(self, tag):
        print 'Handled end of', tag

    def handle_data(self, data):
        print "Handling data:", repr(data)
        if self.in_title:
            print "Apparently, we are in a &lt;title&gt; tag. self.title is now", repr(data)
            self.title = data
            print data
            return self.title

parser = MyHTMLParser("http://www.york.ac.uk/teaching/cws/wws/webpage1.html")
</code></pre>
<p>For convenience, the HTML for the page in question:</p>
<pre><code>&lt;HMTL&gt;
&lt;HEAD&gt;
&lt;TITLE&gt;webpage1&lt;/TITLE&gt;
&lt;/HEAD&gt;
&lt;BODY BGCOLOR="FFFFFf" LINK="006666" ALINK="8B4513" VLINK="006666"&gt;
</code></pre>
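<p>The same ordering rule (initialise state, then feed) can be sketched for Python 3, where the module is <code>html.parser</code> rather than <code>HTMLParser</code>. This is only a minimal illustration under stated assumptions: the class name <code>TitleParser</code> and the inline <code>SAMPLE</code> string are hypothetical, and the string stands in for the fetched page so that the sketch runs without network access.</p>

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text found inside the <title> element (hypothetical helper)."""

    def __init__(self, markup):
        super().__init__()
        # Initialise all state *before* feed(): feed() invokes the handlers,
        # and the handlers read these attributes.
        self.in_title = False
        self.title = ""
        self.feed(markup)

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lower-cased, so <TITLE> is reported as "title".
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        # Text may arrive in several chunks, so append instead of overwriting.
        if self.in_title:
            self.title += data


# Assumption: an inline copy of the page's opening lines replaces the download.
SAMPLE = (
    "<HMTL>\n<HEAD>\n<TITLE>webpage1</TITLE>\n</HEAD>\n"
    '<BODY BGCOLOR="FFFFFf" LINK="006666" ALINK="8B4513" VLINK="006666">\n'
)

parser = TitleParser(SAMPLE)
print(parser.title)
```

<p>Moving the two assignments below <code>self.feed(markup)</code> should reproduce the original <code>AttributeError</code>, because <code>handle_data()</code> would then run before <code>in_title</code> exists.</p>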
 
