
urllib2.urlopen through proxy fails after a few calls
<p><strong>Edit:</strong> <em>after much fiddling, it seems urlgrabber succeeds where urllib2 fails, even when telling it to close the connection after each file. There might be something wrong with the way urllib2 handles proxies, or with the way I use it! Anyway, here is the simplest possible code to retrieve files in a loop:</em></p>

<pre><code>import urlgrabber

for i in range(1, 100):
    url = "http://www.iana.org/domains/example/"
    urlgrabber.urlgrab(url,
                       proxies={'http': 'http://&lt;user&gt;:&lt;password&gt;@&lt;proxy url&gt;:&lt;proxy port&gt;'},
                       keepalive=1, close_connection=1, throttle=0)
</code></pre>

<hr>

<p>Hello all!</p>

<p>I am trying to write a very simple Python script to grab a bunch of files via urllib2.</p>

<p>The script needs to work through the proxy at work (the issue does not occur when grabbing files on the intranet, i.e. without the proxy).</p>

<p>The script fails after a couple of requests with "HTTPError: HTTP Error 401: basic auth failed". Any idea why that might be? It seems the proxy is rejecting my authentication, but why? The first couple of urlopen requests went through correctly!</p>

<p><em>Edit: adding a sleep of 10 seconds between requests, to rule out throttling by the proxy, did not change the results.</em></p>

<p>Here is a simplified version of my script (with identifying information stripped, obviously):</p>

<pre><code>import urllib2

passmgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
passmgr.add_password(None, '&lt;proxy url&gt;:&lt;proxy port&gt;', '&lt;my user name&gt;', '&lt;my password&gt;')
authinfo = urllib2.ProxyBasicAuthHandler(passmgr)
proxy_support = urllib2.ProxyHandler({"http": "&lt;proxy http address&gt;"})
opener = urllib2.build_opener(authinfo, proxy_support)
urllib2.install_opener(opener)

for i in range(100):
    with open("e:/tmp/images/tst{}.htm".format(i), "w") as outfile:
        f = urllib2.urlopen("http://www.iana.org/domains/example/")
        outfile.write(f.read())
</code></pre>

<p>Thanks in advance!</p>
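For reference, here is a minimal sketch of the same handler chain ported to Python 3's urllib.request (the successor to urllib2). The proxy URL and credentials are hypothetical placeholders; the network loop is left commented out so the setup can be studied on its own:

```python
import urllib.request

# Hypothetical placeholder proxy details -- substitute your own.
PROXY = "http://proxy.example.com:8080"
USER, PASSWORD = "user", "secret"

# Same handler chain as the urllib2 version above, ported to urllib.request.
passmgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
passmgr.add_password(None, PROXY, USER, PASSWORD)
auth_handler = urllib.request.ProxyBasicAuthHandler(passmgr)
proxy_handler = urllib.request.ProxyHandler({"http": PROXY})

# Build one explicit opener and reuse it for every request, instead of
# relying on the globally installed opener via install_opener() + urlopen().
opener = urllib.request.build_opener(proxy_handler, auth_handler)

# for i in range(100):
#     with opener.open("http://www.iana.org/domains/example/") as f:
#         data = f.read()
```

Reusing one explicit opener keeps the password manager and auth handler in play for every request in the loop, which removes the globally installed opener as a variable when debugging this kind of intermittent 401.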