httplib is not getting all the redirect codes
    <p>I am trying to get the final URL of a page that seems to redirect more than once. Try this sample URL in your browser and compare it to the final URL at the bottom of my code snippet:</p>
    <p><a href="http://www.usmc.mil/units/hqmc/" rel="nofollow">Link that redirects more than once</a></p>
    <p>Here is the test code I was running. Notice that the final URL, the one that returns a 200, isn't the same as the one your browser ends up at. What are my options?</p>
    <pre><code>Python 2.7.1+ (r271:86832, Apr 11 2011, 18:13:53)
[GCC 4.5.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
&gt;&gt;&gt; import httplib
&gt;&gt;&gt; from urlparse import urlparse
&gt;&gt;&gt; url = 'http://www.usmc.mil/units/hqmc/'
&gt;&gt;&gt; host = urlparse(url)[1]
&gt;&gt;&gt; req = ''.join(urlparse(url)[2:5])
&gt;&gt;&gt; conn = httplib.HTTPConnection(host)
&gt;&gt;&gt; conn.request('HEAD', req)
&gt;&gt;&gt; resp = conn.getresponse()
&gt;&gt;&gt; print resp.status
301
&gt;&gt;&gt; print resp.msg.dict['location']
http://www.marines.mil/units/hqmc/
&gt;&gt;&gt; url = 'http://www.marines.mil/units/hqmc/'
&gt;&gt;&gt; host = urlparse(url)[1]
&gt;&gt;&gt; req = ''.join(urlparse(url)[2:5])
&gt;&gt;&gt; conn = httplib.HTTPConnection(host)
&gt;&gt;&gt; conn.request('HEAD', req)
&gt;&gt;&gt; resp = conn.getresponse()
&gt;&gt;&gt; print resp.status
302
&gt;&gt;&gt; print resp.msg.dict['location']
http://www.marines.mil/units/hqmc/default.aspx
&gt;&gt;&gt; url = 'http://www.marines.mil/units/hqmc/default.aspx'
&gt;&gt;&gt; host = urlparse(url)[1]
&gt;&gt;&gt; req = ''.join(urlparse(url)[2:5])
&gt;&gt;&gt; conn = httplib.HTTPConnection(host)
&gt;&gt;&gt; conn.request('HEAD', req)
&gt;&gt;&gt; resp = conn.getresponse()
&gt;&gt;&gt; print resp.status
200
&gt;&gt;&gt; print resp.msg.dict['location']
Traceback (most recent call last):
  File "&lt;stdin&gt;", line 1, in &lt;module&gt;
KeyError: 'location'
&gt;&gt;&gt; print url
http://www.marines.mil/units/hqmc/default.aspx //THIS URL DOES NOT RETURN A 200 IN ANY BROWSER I HAVE TRIED
</code></pre>
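The three hand-run steps in the session above repeat the same pattern, so they can be folded into a loop. The following is a minimal sketch, not a definitive implementation. It assumes Python 3, where `httplib` is named `http.client` and `urlparse` lives in `urllib.parse`. The names `head`, `follow_redirects`, `fetch`, `max_hops`, and `REDIRECT_CODES` are illustrative and do not come from the question. The sketch only sees HTTP-level redirects (3xx status plus a `Location` header). A redirect performed by the page itself, or a server that answers `HEAD` differently from a browser's `GET`, would not show up here. Either could be a reason for the mismatch with the browser, though that is unverified for this site.

```python
import http.client
from urllib.parse import urljoin, urlsplit

# Status codes treated as redirects in this sketch (an assumption, adjust as needed).
REDIRECT_CODES = {301, 302, 303, 307, 308}


def head(url, timeout=10):
    """Send one HEAD request and return (status, Location header or None)."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.hostname, parts.port, timeout=timeout)
    try:
        # Rebuild the request target from path and query, as the session does by hand.
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def follow_redirects(url, fetch=head, max_hops=10):
    """Return the (url, status) pairs visited, ending at the first non-redirect.

    `fetch` is injectable so the loop can be exercised without network access.
    """
    chain = []
    for _ in range(max_hops):
        status, location = fetch(url)
        chain.append((url, status))
        if status not in REDIRECT_CODES or not location:
            return chain
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    raise RuntimeError("gave up after %d redirects" % max_hops)
```

Calling `follow_redirects('http://www.usmc.mil/units/hqmc/')` would issue live requests and return every hop with its status code, so the 301 and 302 from the session appear in one list instead of three manual passes. The `max_hops` cap is there so a redirect loop ends with an error rather than running forever.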