Python 3.2 lxml fill and submit form, select multiple, how to do it? value not working
<p>Great page, this one. I come from the Perl world and, after several years of doing nothing, I've started programming again (this web page didn't exist back then; how things change). Now, after two full days of searching, I'm playing my last card and asking here for help.</p>
<p>I'm working in a Mac environment with Python 3.2 and lxml 2.3 (installed following www.jtmoon.com/?p=21). What I am trying to do:</p>
<ul>
<li>web: <a href="http://biodbnet.abcc.ncifcrf.gov/db/db2db.php" rel="nofollow">http://biodbnet.abcc.ncifcrf.gov/db/db2db.php</a></li>
<li>fill in the form that you find there</li>
<li>submit it</li>
</ul>
<p>My code follows. I show several attempts and the output of each.</p>
<pre><code>from lxml.html import parse, submit_form, tostring
page = parse('http://biodbnet.abcc.ncifcrf.gov/db/db2db.php').getroot()
page.forms[0].fields['input'] = 'GI Number'
page.forms[0].inputs['outputs[]'].value = 'Gene ID'
page.forms[0].fields['hasComma'] = 'no'
page.forms[0].fields['removeDupValues'] = 'yes'
page.forms[0].fields['request'] = 'db2db'
page.forms[0].action = 'http://biodbnet.abcc.ncifcrf.gov/db/db2dbRes.php'
page.forms[0].fields['idList'] = '86439006'
submit_form(page.forms[0])
</code></pre>
<p>Output:</p>
<pre><code>  File "/Users/gerard/Desktop/barbacue/MGFtoXML.py", line 30, in &lt;module&gt;
    page.forms[0].inputs['outputs[]'].value = 'Gene ID'
  File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/lxml/html/__init__.py", line 1058, in _value__set
    "You must pass in a sequence")
TypeError: You must pass in a sequence
</code></pre>
<p>Since that element is a multi-select element, I understand that I have to give it a list:</p>
<pre><code>page.forms[0].inputs['outputs[]'].value = list('Gene ID')
</code></pre>
<p>Output:</p>
<pre><code>  File "/Users/gerard/Desktop/barbacue/MGFtoXML.py", line 30, in &lt;module&gt;
    page.forms[0].inputs['outputs[]'].value = list('Gene ID')
  File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/lxml/html/__init__.py", line 1059, in _value__set
    self.value.clear()
  File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/lxml/html/_setmixin.py", line 115, in clear
    self.remove(item)
  File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/lxml/html/__init__.py", line 1159, in remove
    "The option %r is not currently selected" % item)
ValueError: The option 'Affy ID' is not currently selected
</code></pre>
<p>'Affy ID' is the first option value in the list, and it is not selected. So what is the problem with it?</p>
<p>Surprisingly, if I instead write</p>
<pre><code>page.forms[0].inputs['outputs[]'].multiple = list('Gene ID')
#page.forms[0].inputs['outputs[]'].value = list('Gene ID')
</code></pre>
<p>then lxml somehow accepts it and moves on. However, the <code>multiple</code> attribute should be a boolean (it actually is one if I print it), so I shouldn't be touching it, and according to the lxml docs it is the item's "value" that should point to the selected items.</p>
<p>The new output:</p>
<pre><code>  File "/Users/gerard/Desktop/barbacue/MGFtoXML.py", line 87, in &lt;module&gt;
    submit_form(page.forms[0])
  File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/lxml/html/__init__.py", line 856, in submit_form
    return open_http(form.method, url, values)
  File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/lxml/html/__init__.py", line 876, in open_http_urllib
    return urlopen(url, data)
  File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/urllib/request.py", line 138, in urlopen
    return opener.open(url, data, timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/urllib/request.py", line 364, in open
    req = meth(req)
  File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/urllib/request.py", line 1052, in do_request_
    raise TypeError("POST data should be bytes"
TypeError: POST data should be bytes or an iterable of bytes. It cannot be str.
</code></pre>
<p>So, what can be done? I am sure that with Python 2.6 I could use mechanize, or perhaps lxml would work there. But I really don't want to code in a sort-of deprecated version. I am enjoying Python a lot, but I am starting to consider going back to Perl. Would that be a smart move?</p>
<p>Any help will be hugely appreciated.</p>
<p>Gerard</p>
<ul>
<li>Reading this forum, I found pythonpaste.org. Could it be a replacement for lxml?</li>
</ul>
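Two details in the attempts above can be checked without lxml or a network connection, using only the standard library. The sketch below is a hedged illustration, not a confirmed fix for the form itself. It shows that `list('Gene ID')` yields single characters rather than a one-element list, and that a form body can be encoded to bytes before it is handed to `urllib`. The helper name `build_post_request` and the reduced set of fields are assumptions made for the example. The last traceback shows `submit_form` calling `open_http(form.method, url, values)`, so a callable with that shape could presumably be supplied in its place, but that is untested here.

```python
from urllib.parse import urlencode
from urllib.request import Request

# list() applied to a string splits it into individual characters,
# so it does not produce a list containing the option value.
chars = list('Gene ID')   # seven one-character strings
options = ['Gene ID']     # a one-element sequence holding the option value


def build_post_request(url, values):
    """Hypothetical helper: build a POST request whose body is bytes.

    Python 3's urlopen rejects a str body, so the urlencoded string
    is encoded to bytes before being attached to the request.
    """
    body = urlencode(values).encode('ascii')
    return Request(url, data=body)


# A reduced, illustrative subset of the fields set in the question.
values = [
    ('input', 'GI Number'),
    ('outputs[]', 'Gene ID'),
    ('idList', '86439006'),
]
request = build_post_request(
    'http://biodbnet.abcc.ncifcrf.gov/db/db2dbRes.php', values)

print(chars)
print(request.get_method(), request.data)
```

Nothing is sent over the network here; passing `request` to `urllib.request.urlopen` would perform the actual submission.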