How to print only BeautifulSoup values?
<p>I have built a web scraper with a for-loop. I don't know why, but it returns a URL (which is what I want it to return), and then, before fetching the next URL in the list, it returns a NoneType object. Apart from making the script slower, that wouldn't be a big deal, except that I can't get it to print more than the first URL.</p>
<pre><code>from BeautifulSoup import BeautifulSoup
from mechanize import Browser
br = Browser()
page = br.open("https://bdkv2.borger.dk/foa/Sider/default.aspx?fk=22&amp;foaid=11541520")
html = page.read()
soup = BeautifulSoup(html)
link = soup.findAll('a')
kommunelink = link[21:116]
for kommune in kommunelink:
    kommuneside = br.open(kommune['href'])
    html2 = kommuneside.read()
    soup2 = BeautifulSoup(html2)
    hjemmesidelink = soup2.find('a', id='_uscAncHomesite')
    print hjemmesidelink['href']
</code></pre>
<p>With this code, my output looks like this:</p>
<pre><code>http://www.albertslund.dk
Traceback (most recent call last):
  File "C:\Users\kba\Desktop\kommuneskraber.py", line 14, in &lt;module&gt;
    print hjemmesidelink['href']
TypeError: 'NoneType' object has no attribute '__getitem__'
</code></pre>
<p>I've tried things along the lines of "if the variable is of a specific class, then print it", but that doesn't work. Example:</p>
<pre><code>If hjemmesidelink['href'] == &lt;class 'BeautifulSoup.Tag'&gt;:
    print hjemmesidelink['href']

if hjemmesidelink.class == BeautifulSoup.Tag:
    print hjemmesidelink['href']
</code></pre>
<p>Any idea how this should be written? Or, even better, any idea where and why my script fetches a 'NoneType' object every second time it iterates through the loop? Thanks a bunch.</p>
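The traceback points at a likely cause: BeautifulSoup's `find` returns `None` when no matching tag exists, so some fetched pages probably contain no `<a>` with the id `_uscAncHomesite`. In the loop above, the usual guard would be `if hjemmesidelink is not None:` before the `print`. The sketch below shows the same guard pattern in a form that runs without any third-party library. It swaps in the standard library's `html.parser` for BeautifulSoup, and the class `HomesiteLinkFinder`, the function `find_homesite_href`, and the two sample pages are hypothetical names and data invented for illustration.

```python
from html.parser import HTMLParser


class HomesiteLinkFinder(HTMLParser):
    """Record the href of the first <a> tag whose id equals target_id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.href = None  # stays None when no matching anchor is found

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and self.href is None and attrs.get("id") == self.target_id:
            self.href = attrs.get("href")


def find_homesite_href(html, target_id="_uscAncHomesite"):
    """Return the matching link's href, or None if the page has no such link."""
    finder = HomesiteLinkFinder(target_id)
    finder.feed(html)
    return finder.href


# Hypothetical sample pages: the second one has no matching anchor.
pages = [
    '<a id="_uscAncHomesite" href="http://www.albertslund.dk">Home</a>',
    '<a href="/other">No homesite link here</a>',
]

hrefs = []
for html in pages:
    href = find_homesite_href(html)
    if href is None:
        continue  # skip pages without the link instead of raising TypeError
    hrefs.append(href)
    print(href)
```

Testing against `None` with `is` avoids the need to compare classes, which is what the attempts in the question were reaching for.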