Find on Beautiful Soup in loop returns TypeError
<p>I'm trying to scrape a table on an ajax page with Beautiful Soup and print it out in table form with the TextTable library.</p>
<pre><code>import BeautifulSoup
import urllib
import urllib2
import getpass
import cookielib
import texttable

cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
urllib2.install_opener(opener)
...
def show_queue():
    url = 'https://www.animenfo.com/radio/nowplaying.php'
    values = {'ajax' : 'true', 'mod' : 'queue'}
    data = urllib.urlencode(values)
    f = opener.open(url, data)
    soup = BeautifulSoup.BeautifulSoup(f)
    stable = soup.find('table')
    table = texttable.Texttable()
    header = stable.findAll('th')
    header_text = []
    for th in header:
        header_append = th.find(text=True)
        header.append(header_append)
    table.header(header_text)
    rows = stable.find('tr')
    for tr in rows:
        cells = []
        cols = tr.find('td')
        for td in cols:
            cells_append = td.find(text=True)
            cells.append(cells_append)
        table.add_row(cells)
    s = table.draw
    print s
...
</code></pre>
<p>Although the URL for the HTML in question I'm trying to scrape is shown in the code, here is an example of it:</p>
<pre><code>&lt;table cellspacing="0" cellpadding="0"&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;th&gt;Artist - Title&lt;/th&gt;
      &lt;th&gt;Album&lt;/th&gt;
      &lt;th&gt;Album Type&lt;/th&gt;
      &lt;th&gt;Series&lt;/th&gt;
      &lt;th&gt;Duration&lt;/th&gt;
      &lt;th&gt;Type of Play&lt;/th&gt;
      &lt;th&gt; &lt;span title="..."&gt;Time to play&lt;/span&gt; &lt;/th&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td class="row1"&gt; &lt;a href="..." class="songinfo"&gt;Song 1&lt;/a&gt; &lt;/td&gt;
      &lt;td class="row1"&gt; &lt;a href="..." class="album_link"&gt;Album 1&lt;/a&gt; &lt;/td&gt;
      &lt;td class="row1"&gt;...&lt;/td&gt;
      &lt;td class="row1"&gt; &lt;/td&gt;
      &lt;td class="row1" style="text-align: center"&gt; 5:43 &lt;/td&gt;
      &lt;td class="row1" style="padding-left: 5px; text-align: center"&gt; S.A.M. &lt;/td&gt;
      &lt;td class="row1" style="text-align: center"&gt; ~0:00:00 &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td class="row2"&gt; &lt;a href="..." class="songinfo"&gt;Song2&lt;/a&gt; &lt;/td&gt;
      &lt;td class="row2"&gt; &lt;a href="..." class="album_link"&gt;Album 2&lt;/a&gt; &lt;/td&gt;
      &lt;td class="row2"&gt;...&lt;/td&gt;
      &lt;td class="row2"&gt; &lt;/td&gt;
      &lt;td class="row2" style="text-align: center"&gt; 6:16 &lt;/td&gt;
      &lt;td class="row2" style="padding-left: 5px; text-align: center"&gt; S.A.M. &lt;/td&gt;
      &lt;td class="row2" style="text-align: center"&gt; ~0:05:43 &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td class="row1"&gt; &lt;a href="..." class="songinfo"&gt;Song 3&lt;/a&gt; &lt;/td&gt;
      &lt;td class="row1"&gt; &lt;a href="..." class="album_link"&gt;Album 3&lt;/a&gt; &lt;/td&gt;
      &lt;td class="row1"&gt;...&lt;/td&gt;
      &lt;td class="row1"&gt; &lt;/td&gt;
      &lt;td class="row1" style="text-align: center"&gt; 4:13 &lt;/td&gt;
      &lt;td class="row1" style="padding-left: 5px; text-align: center"&gt; S.A.M. &lt;/td&gt;
      &lt;td class="row1" style="text-align: center"&gt; ~0:11:59 &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td class="row2"&gt; &lt;a href="..." class="songinfo"&gt;Song 4&lt;/a&gt; &lt;/td&gt;
      &lt;td class="row2"&gt; &lt;a href="..." class="album_link"&gt;Album 4&lt;/a&gt; &lt;/td&gt;
      &lt;td class="row2"&gt;...&lt;/td&gt;
      &lt;td class="row2"&gt; &lt;/td&gt;
      &lt;td class="row2" style="text-align: center"&gt; 5:34 &lt;/td&gt;
      &lt;td class="row2" style="padding-left: 5px; text-align: center"&gt; S.A.M. &lt;/td&gt;
      &lt;td class="row2" style="text-align: center"&gt; ~0:16:12 &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td class="row1"&gt;&lt;a href="..." class="songinfo"&gt;Song 5&lt;/a&gt; &lt;/td&gt;
      &lt;td class="row1"&gt; &lt;a href="..." class="album_link"&gt;Album 5&lt;/a&gt; &lt;/td&gt;
      &lt;td class="row1"&gt;...&lt;/td&gt;
      &lt;td class="row1"&gt;&lt;/td&gt;
      &lt;td class="row1" style="text-align: center"&gt; 4:23 &lt;/td&gt;
      &lt;td class="row1" style="padding-left: 5px; text-align: center"&gt; S.A.M. &lt;/td&gt;
      &lt;td class="row1" style="text-align: center"&gt; ~0:21:46 &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style="height: 5px;"&gt; &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td class="row2" style="font-style: italic; text-align: center;" colspan="5"&gt;There are x songs in the queue with a total length of x:y:z.&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
</code></pre>
<p>Whenever I try to run this script function, it aborts with <code>TypeError: find() takes no keyword arguments</code> on the line <code>header_append = th.find(text=True)</code>. I'm sort of stumped, as it seems that I'm doing what is shown in code examples and it seems it should work, yet it doesn't.</p>
<p>In short, how do I fix the code so that there is no TypeError and what am I doing wrong?</p>
<p>Edit: Articles and documentation that I referred to when writing the script:</p>
<ul>
<li><a href="http://segfault.in/2010/07/parsing-html-table-in-python-with-beautifulsoup/" rel="nofollow">http://segfault.in/2010/07/parsing-html-table-in-python-with-beautifulsoup/</a></li>
<li><a href="http://oneau.wordpress.com/2010/05/30/simple-formatted-tables-in-python-with-texttable/" rel="nofollow">http://oneau.wordpress.com/2010/05/30/simple-formatted-tables-in-python-with-texttable/</a></li>
</ul>
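A possible explanation, offered as a sketch rather than a confirmed diagnosis: the header loop appends each result to `header`, the list being iterated, instead of to `header_text`. Once the loop reaches one of those appended strings, `th.find(text=True)` resolves to the string method `find`, which accepts no keyword arguments. The example below reproduces that pattern with the standard library only. `FakeTag`, `collect_buggy`, and `collect_fixed` are hypothetical names introduced for illustration and are not part of Beautiful Soup or of the script above.

```python
class FakeTag:
    """Hypothetical stand-in for a parsed <th> tag; not part of Beautiful Soup."""

    def __init__(self, text):
        self._text = text

    def find(self, text=False):
        # Mimics tag.find(text=True) by returning the tag's text as a plain string.
        return self._text


def collect_buggy(tags):
    header = list(tags)
    header_text = []
    for th in header:
        # Same pattern as in the question: the result is appended to the list
        # being iterated, so the loop eventually reaches a plain string and
        # calls str.find(text=True), which raises TypeError.
        header.append(th.find(text=True))
    return header_text


def collect_fixed(tags):
    header_text = []
    for th in tags:
        # Append to the separate result list instead.
        header_text.append(th.find(text=True))
    return header_text


if __name__ == "__main__":
    tags = [FakeTag("Artist - Title"), FakeTag("Album")]
    print(collect_fixed(tags))
    try:
        collect_buggy(tags)
    except TypeError as exc:
        print("TypeError:", exc)
```

If this is indeed the cause, changing `header.append(header_append)` to `header_text.append(header_append)` should remove the TypeError. Two other lines may also need attention: `stable.find('tr')` and `tr.find('td')` each return only the first match, so `findAll` is probably what was intended, and `table.draw` without parentheses refers to the method rather than calling it.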
 
