PyQt4 seg fault on sequential app start/stop
<p>I'm trying to read webpages using PyQt. I need to call a method multiple times with different URLs. I am currently using code similar to: <a href="http://blog.sitescraper.net/2010/06/scraping-javascript-webpages-in-python.html#comment-form" rel="nofollow">http://blog.sitescraper.net/2010/06/scraping-javascript-webpages-in-python.html#comment-form</a></p>
<p>However, when I try, I get seg faults. Any suggestions welcome.</p>
<pre><code>import sys
from time import clock
from PyQt4.QtGui import *
from PyQt4.QtCore import *
from PyQt4.QtWebKit import *
from PyQt4.QtNetwork import *

class Render(QWebPage):
    def __init__(self):
        self.app = QApplication(sys.argv)
        QWebPage.__init__(self)
        self.networkAccessManager().finished.connect(self.handleEnd)
        self.loadFinished.connect(self._loadFinished)
        self.mainFrame().setScrollBarPolicy(Qt.Horizontal, Qt.ScrollBarAlwaysOff)
        self.mainFrame().setScrollBarPolicy(Qt.Vertical, Qt.ScrollBarAlwaysOff)

    def loadURL(self, url):
        self.mainFrame().load(QUrl(url))
        self.app.exec_()

    def savePageImage (self, width, height, Imagefile):
        pageSize = self.mainFrame().contentsSize();
        if width == 0:
            pageWidth = pageSize.width()
        else:
            pageWidth = width
        if height == 0:
            pageHeight = pageSize.height()
        else:
            pageHeight = height

        self.setViewportSize(QSize(pageWidth, pageHeight))
        Img = QImage(self.viewportSize(), QImage.Format_ARGB32)
        painter = QPainter(Img)
        self.mainFrame().render(painter)
        painter.end()
        Img.save(Imagefile)

    def _loadFinished(self, result):
        print "load finish"
        self.frame = self.mainFrame()
        self.returnVal = result
        self.app.quit()

    def handleEnd (self, reply):
        # get first http code and disconnect
        # could add filter to listen relevant responses
        self.httpcode = reply.attribute(QNetworkRequest.HttpStatusCodeAttribute)
        self.networkAccessManager().finished.disconnect(self.handleEnd)

jsrurl = 'http://www.w3resource.com/javascript/document-alert-confirm/four.html'
badurl='something.or.other'
badhttp = 'http://eclecticself.com/test2.html'
testurl = 'http://www.nydailynews.com/entertainment/index.html'
testurl2 = 'http://www.palmbeachpost.com/'
testurl3 = 'http://www.nydailynews.com/news/politics/2011/08/03/2011-08-03_pat_buchanan_downplays_controversy_after_calling_president_obama_your_boy_to_rev.html'
url = testurl

start = clock()
r = Render()
r.loadURL(url)
html = r.frame.toHtml()
elapsed = clock() - start
print elapsed

if (r.returnVal == True):
    if (r.httpcode.toInt()[0] != 404):
        #print html.toUtf8()
        start = clock()
        r.savePageImage(1024, 0, "pageSnapshot.png")
        elapsed = clock() - start
        print elapsed
    else:
        print 'page not found'
else:
    print 'badurl'

s = Render()
s.loadURL(jsrurl)
html = s.frame.toHtml()
elapsed = clock() - start
print elapsed

if (s.returnVal == True):
    if (s.httpcode.toInt()[0] != 404):
        print html.toUtf8()
        start = clock()
        s.savePageImage(1024, 0, "pageSnapshot.png")
        elapsed = clock() - start
        print elapsed
    else:
        print 'page not found'
else:
    print 'badurl'
</code></pre>