Performance difference between urllib2 and asyncore
<p>I have some questions about the performance of this simple Python script:</p>

<pre><code>import sys, urllib2, asyncore, socket, urlparse
from timeit import timeit

class HTTPClient(asyncore.dispatcher):
    def __init__(self, host, path):
        asyncore.dispatcher.__init__(self)
        self.create_socket(socket.AF_INET, socket.SOCK_STREAM)
        self.connect( (host, 80) )
        self.buffer = 'GET %s HTTP/1.0\r\n\r\n' % path
        self.data = ''
    def handle_connect(self):
        pass
    def handle_close(self):
        self.close()
    def handle_read(self):
        self.data += self.recv(8192)
    def writable(self):
        return (len(self.buffer) &gt; 0)
    def handle_write(self):
        sent = self.send(self.buffer)
        self.buffer = self.buffer[sent:]

url = 'http://pacnet.karbownicki.com/api/categories/'

components = urlparse.urlparse(url)
host = components.hostname or ''
path = components.path

def fn1():
    try:
        response = urllib2.urlopen(url)
        try:
            return response.read()
        finally:
            response.close()
    except:
        pass

def fn2():
    client = HTTPClient(host, path)
    asyncore.loop()
    return client.data

if sys.argv[1:]:
    print 'fn1:', len(fn1())
    print 'fn2:', len(fn2())

time = timeit('fn1()', 'from __main__ import fn1', number=1)
print 'fn1: %.8f sec/pass' % (time)

time = timeit('fn2()', 'from __main__ import fn2', number=1)
print 'fn2: %.8f sec/pass' % (time)
</code></pre>

<p>Here's the output I'm getting on Linux:</p>

<pre><code>$ python2 test_dl.py
fn1: 5.36162281 sec/pass
fn2: 0.27681994 sec/pass

$ python2 test_dl.py count
fn1: 11781
fn2: 11965
fn1: 0.30849886 sec/pass
fn2: 0.30597305 sec/pass
</code></pre>

<p>Why is urllib2 so much slower than asyncore in the first run?</p>

<p>And why does the discrepancy seem to disappear on the second run?</p>

<p><strong>EDIT</strong>: Found a hackish solution to this problem here: <a href="https://stackoverflow.com/questions/2014534/force-python-mechanize-urllib2-to-only-use-a-requests/6319043#6319043">Force python mechanize/urllib2 to only use A requests?</a></p>

<p>The five-second delay disappears if I monkey-patch the socket module as follows:</p>

<pre><code>_getaddrinfo = socket.getaddrinfo

def getaddrinfo(host, port, family=0, socktype=0, proto=0, flags=0):
    return _getaddrinfo(host, port, socket.AF_INET, socktype, proto, flags)

socket.getaddrinfo = getaddrinfo
</code></pre>
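
<p>Since forcing <code>AF_INET</code> removes the delay, the time is probably being spent in name resolution rather than in urllib2 itself, although nothing above confirms that directly. The following is a minimal sketch, not part of the original script, of a scoped version of the same patch plus a small timer for checking that guess. It is written for Python 3 rather than the Python 2 used above, and the names <code>ipv4_only</code> and <code>time_lookup</code> are hypothetical helpers introduced only for illustration.</p>

```python
import contextlib
import socket
import time


@contextlib.contextmanager
def ipv4_only():
    """Temporarily make socket.getaddrinfo request IPv4 addresses only.

    Hypothetical helper: same idea as the monkey-patch above, but the
    original function is restored when the block exits.
    """
    original = socket.getaddrinfo

    def getaddrinfo_v4(host, port, family=0, type=0, proto=0, flags=0):
        # Ignore the caller's address family and always ask for AF_INET.
        return original(host, port, socket.AF_INET, type, proto, flags)

    socket.getaddrinfo = getaddrinfo_v4
    try:
        yield
    finally:
        # Restore the real function even if the body raised.
        socket.getaddrinfo = original


def time_lookup(host, port=80):
    """Return the seconds spent in a single socket.getaddrinfo call."""
    start = time.perf_counter()
    socket.getaddrinfo(host, port)
    return time.perf_counter() - start
```

<p>Calling <code>time_lookup(host)</code> once normally and once inside <code>with ipv4_only():</code> should show whether the unrestricted lookup is where the five seconds go. Note that the patch is still process-wide while the block is active, so it affects every thread that resolves a name during that time.</p>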