Emulating a browser to download a file?
<p>There's an FLV file on the web that can be downloaded directly in Chrome. The file is a television program published by CCTV (China Central Television). CCTV is a non-profit, state-owned broadcaster financed by Chinese taxpayers, which allows us to download their content without infringing copyright.</p>
<p>Using <code>wget</code>, I can download the file from a different address, but not from the address that works in Chrome.</p>
<p>This is what I've tried:</p>
<pre><code>url='http://114.80.235.200/f4v/94/163005294.h264_1.f4v?10000&amp;key=7b9b1155dc632cbab92027511adcb300401443020d&amp;amp;playtype=1&amp;amp;tk=163659644989925531390490125&amp;amp;brt=2&amp;amp;bc=0&amp;amp;nt=0&amp;amp;du=1496650&amp;amp;ispid=23&amp;amp;rc=200&amp;amp;inf=1&amp;amp;si=11000&amp;amp;npc=1606&amp;amp;pp=0&amp;amp;ul=2&amp;amp;mt=-1&amp;amp;sid=10000&amp;amp;au=0&amp;amp;pc=0&amp;amp;cip=222.73.44.31&amp;amp;hf=0&amp;amp;id=tudou&amp;amp;itemid=135558267&amp;amp;fi=163005294&amp;amp;sz=59138302'
wget -c $url --user-agent="" -O xfgs.f4v
</code></pre>
<p>This doesn't work either:</p>
<pre><code>wget -c $url -O xfgs.f4v
</code></pre>
<p>The output is:</p>
<pre><code>Connecting to 118.26.57.12:80... connected.
HTTP request sent, awaiting response... 403 Forbidden
2013-02-13 09:50:42 ERROR 403: Forbidden.
</code></pre>
<p>What am I doing wrong?</p>
<p>Ultimately, I want to download the file with the Python library <code>mechanize</code>. Here is the code I'm using for that:</p>
<pre><code>import mechanize

br = mechanize.Browser()
br.set_handle_robots(False)
br.set_handle_equiv(False)
br.addheaders = [('User-agent', 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1')]
url='http://114.80.235.200/f4v/94/163005294.h264_1.f4v?10000&amp;key=7b9b1155dc632cbab92027511adcb300401443020d&amp;amp;playtype=1&amp;amp;tk=163659644989925531390490125&amp;amp;brt=2&amp;amp;bc=0&amp;amp;nt=0&amp;amp;du=1496650&amp;amp;ispid=23&amp;amp;rc=200&amp;amp;inf=1&amp;amp;si=11000&amp;amp;npc=1606&amp;amp;pp=0&amp;amp;ul=2&amp;amp;mt=-1&amp;amp;sid=10000&amp;amp;au=0&amp;amp;pc=0&amp;amp;cip=222.73.44.31&amp;amp;hf=0&amp;amp;id=tudou&amp;amp;itemid=135558267&amp;amp;fi=163005294&amp;amp;sz=59138302'
r = br.open(url).read()
tofile = open("/tmp/xfgs.f4v", "wb")  # binary mode, since the payload is video data
tofile.write(r)
tofile.close()
</code></pre>
<p>This is the result:</p>
<pre><code>Traceback (most recent call last):
  File "&lt;stdin&gt;", line 1, in &lt;module&gt;
  File "/usr/lib/python2.7/dist-packages/mechanize/_mechanize.py", line 203, in open
    return self._mech_open(url, data, timeout=timeout)
  File "/usr/lib/python2.7/dist-packages/mechanize/_mechanize.py", line 255, in _mech_open
    raise response
mechanize._response.httperror_seek_wrapper: HTTP Error 403: Forbidden
</code></pre>
<p>Can anyone explain how to get the <code>mechanize</code> code to work?</p>
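<p>One thing that may be worth ruling out: the URL above contains <code>amp;</code> after most of its ampersands, which looks like HTML escaping that survived a copy from page source. If that is the case (an assumption, not something confirmed here), the server would receive parameter names such as <code>amp;playtype</code>. The sketch below uses the Python 3 standard library rather than <code>mechanize</code>; the helper names <code>build_request</code> and <code>download</code>, the header values, and the optional <code>referer</code> argument are all hypothetical, and it is not known which headers, if any, this server actually checks.</p>

```python
import shutil
import urllib.request

# Hypothetical browser-like headers; which ones the server inspects is unknown.
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0",
    "Accept": "*/*",
}


def build_request(raw_url, referer=None):
    """Build a Request with escaped ampersands decoded and browser-like headers."""
    # Assumption: "&amp;" in the pasted URL is leftover HTML escaping.
    url = raw_url.replace("&amp;", "&")
    headers = dict(BROWSER_HEADERS)
    if referer:
        # Some servers reject requests without the referring page; this is a guess.
        headers["Referer"] = referer
    return urllib.request.Request(url, headers=headers)


def download(raw_url, dest, referer=None):
    """Stream the response body to dest without loading it all into memory."""
    req = build_request(raw_url, referer)
    with urllib.request.urlopen(req, timeout=30) as resp, open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)
```

<p>Calling <code>download(url, "/tmp/xfgs.f4v")</code> with the URL from the question would exercise this. A 403 could still occur if the <code>key</code> or <code>tk</code> parameters are tied to a session or have expired.</p>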
 
