URL works fine from browser or wget, but comes up empty from Python or cURL
<p>When I try to open <a href="http://www.comicbookdb.com/browse.php">http://www.comicbookdb.com/browse.php</a> (which works fine in my browser) from Python, I get an empty response:</p>
<pre><code>&gt;&gt;&gt; import urllib.request
&gt;&gt;&gt; content = urllib.request.urlopen('http://www.comicbookdb.com/browse.php')
&gt;&gt;&gt; print(content.read())
b''
</code></pre>
<p>The same also happens when I set a User-agent:</p>
<pre><code>&gt;&gt;&gt; opener = urllib.request.build_opener()
&gt;&gt;&gt; opener.addheaders = [('User-agent', 'Mozilla/5.0')]
&gt;&gt;&gt; content = opener.open('http://www.comicbookdb.com/browse.php')
&gt;&gt;&gt; print(content.read())
b''
</code></pre>
<p>Or when I use httplib2 instead:</p>
<pre><code>&gt;&gt;&gt; import httplib2
&gt;&gt;&gt; h = httplib2.Http('.cache')
&gt;&gt;&gt; response, content = h.request('http://www.comicbookdb.com/browse.php')
&gt;&gt;&gt; print(content)
b''
&gt;&gt;&gt; print(response)
{'cache-control': 'no-store, no-cache, must-revalidate, post-check=0, pre-check=0', 'content-location': 'http://www.comicbookdb.com/browse.php', 'expires': 'Thu, 19 Nov 1981 08:52:00 GMT', 'content-length': '0', 'set-cookie': 'PHPSESSID=590f5997a91712b7134c2cb3291304a8; path=/', 'date': 'Wed, 25 Dec 2013 15:12:30 GMT', 'server': 'Apache', 'pragma': 'no-cache', 'content-type': 'text/html', 'status': '200'}
</code></pre>
<p>Or when I try to download it using cURL:</p>
<pre><code>C:\&gt;curl -v http://www.comicbookdb.com/browse.php
* About to connect() to www.comicbookdb.com port 80
* Trying 208.76.81.137...
* connected
* Connected to www.comicbookdb.com (208.76.81.137) port 80
&gt; GET /browse.php HTTP/1.1
User-Agent: curl/7.13.1 (i586-pc-mingw32msvc) libcurl/7.13.1 zlib/1.2.2
Host: www.comicbookdb.com
Pragma: no-cache
Accept: */*
&lt; HTTP/1.1 200 OK
&lt; Date: Wed, 25 Dec 2013 15:20:06 GMT
&lt; Server: Apache
&lt; Expires: Thu, 19 Nov 1981 08:52:00 GMT
&lt; Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
&lt; Pragma: no-cache
&lt; Set-Cookie: PHPSESSID=0a46f2d390639da7eb223ad47380b394; path=/
&lt; Content-Length: 0
&lt; Content-Type: text/html
* Connection #0 to host www.comicbookdb.com left intact
* Closing connection #0
</code></pre>
<p>Opening the URL in a browser or downloading it with Wget seems to work fine, though:</p>
<pre><code>C:\&gt;wget http://www.comicbookdb.com/browse.php
--16:16:26-- http://www.comicbookdb.com/browse.php
=&gt; `browse.php'
Resolving www.comicbookdb.com... 208.76.81.137
Connecting to www.comicbookdb.com[208.76.81.137]:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
[ &lt;=&gt; ] 40,687 48.75K/s
16:16:27 (48.75 KB/s) - `browse.php' saved [40687]
</code></pre>
<p>As does downloading a different file from the same server:</p>
<pre><code>&gt;&gt;&gt; content = urllib.request.urlopen('http://www.comicbookdb.com/index.php')
&gt;&gt;&gt; print(content.read(100))
b'&lt;!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"\n\t\t"http://www.w3.org/TR/1999/REC-html'
</code></pre>
<p>So why doesn't the other URL work?</p>
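<p>One way to narrow this down, assuming the difference between the working and failing clients lies in the request headers (the transcripts above do not establish this), is to repeat the request with one extra header at a time and compare body sizes. The sketch below is only a diagnostic aid: <code>BASELINE</code>, <code>CANDIDATES</code>, <code>header_variants</code>, <code>body_length</code> and <code>probe</code> are hypothetical names, and the candidate headers are guesses at what a browser or Wget might send. It uses <code>http.client</code> rather than <code>urllib.request</code> because the latter always sends <code>Connection: close</code>, which would make that header impossible to vary.</p>

```python
import http.client

# Baseline similar to the failing requests in the question.
BASELINE = {"User-Agent": "Mozilla/5.0", "Accept": "*/*"}

# Hypothetical candidates: headers a browser or Wget may send that the
# failing clients do not. Whether any of them matters is an assumption to test.
CANDIDATES = {
    "Connection": "keep-alive",
    "Accept-Language": "en-US,en;q=0.5",
    "Referer": "http://www.comicbookdb.com/",
}


def header_variants(baseline=BASELINE, candidates=CANDIDATES):
    """Return {label: headers}: the baseline plus one variant per candidate."""
    variants = {"baseline": dict(baseline)}
    for name, value in candidates.items():
        headers = dict(baseline)
        headers[name] = value
        variants["+" + name] = headers
    return variants


def body_length(host, path, headers, timeout=10):
    """GET http://host/path and return (status, body size in bytes).

    http.client adds Host and Accept-Encoding on its own; everything
    else comes from `headers`.
    """
    conn = http.client.HTTPConnection(host, timeout=timeout)
    try:
        conn.request("GET", path, headers=headers)
        response = conn.getresponse()
        return response.status, len(response.read())
    finally:
        conn.close()


def probe(host="www.comicbookdb.com", path="/browse.php"):
    """Print one line per header variant so the sizes can be compared."""
    for label, headers in header_variants().items():
        status, size = body_length(host, path, headers)
        print(f"{label:20} status={status} bytes={size}")
```

<p>Calling <code>probe()</code> prints one line per variant. A variant whose byte count is non-zero while the baseline stays at zero would point at the header that makes the difference; if every variant returns zero bytes, the cause lies elsewhere.</p>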