BeautifulSoup: Finding a specific URL in HTML and printing it
<p>Ok, so I have this HTML page (full of different URLs), where I want to grab a single URL and print it.</p>
<p>The webpage is: <a href="https://bdkv2.borger.dk/foa/Sider/default.aspx?fk=22&amp;foaid=11523251" rel="nofollow">https://bdkv2.borger.dk/foa/Sider/default.aspx?fk=22&amp;foaid=11523251</a></p>
<p>I want to print the URL www.albertslund.dk</p>
<p>It looks like this in the source code:</p>
<pre><code>&lt;a href="http://www.albertslund.dk" id="_uscAncHomesite" target="_blank"&gt;&lt;strong&gt;&lt;span id="ctl00_PlaceHolderMain_FormControlHandler1__uscShowDataAuthorityDetails__uscLblHomesite"&gt;http://www.albertslund.dk&lt;/span&gt;&lt;/strong&gt;&lt;/a&gt;
</code></pre>
<p>When I try to grab it and print it by using its ID (using BeautifulSoup and Mechanize), it just returns an empty list. I would like to grab the URL using the ID, because I'm scraping a bunch of similar sites where the things that I want have the same ID.</p>
<pre><code>kommuneside = br.open("https://bdkv2.borger.dk/foa/Sider/default.aspx?fk=22&amp;foaid=11523251")
html2 = kommuneside.read()
soup2 = BeautifulSoup(html2)
hjemmesidelink = soup2.findAll('a', attras={'ID':'_uscAncHomesite'})
print hjemmesidelink
</code></pre>
<p>This returns just an empty list: []</p>
<p>If I try like this:</p>
<pre><code>print hjemmesidelink['href']
</code></pre>
<p>I get: TypeError: list indices must be integers, not str</p>
<p>I would've thought it was pretty straightforward, but I'm a rookie, and it has bugged me for days now.</p>
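One possible way to approach this, offered as a minimal sketch rather than a confirmed fix. The empty list probably comes from the filter itself: the keyword is spelled `attras` rather than `attrs`, and the attribute name is given as `'ID'` while the markup uses lowercase `id`. The sketch assumes Python 3 and the `bs4` package (BeautifulSoup 4), which is newer than the setup shown in the question. The names `find_homesite_url` and `SAMPLE_HTML` are hypothetical, and the HTML is the snippet quoted in the question, so no request is made to the live page.

```python
from bs4 import BeautifulSoup  # assumption: BeautifulSoup 4 (bs4) is installed

# The anchor markup quoted in the question, used here instead of a live request.
SAMPLE_HTML = (
    '<a href="http://www.albertslund.dk" id="_uscAncHomesite" target="_blank">'
    '<strong><span id="ctl00_PlaceHolderMain_FormControlHandler1_'
    '_uscShowDataAuthorityDetails__uscLblHomesite">'
    'http://www.albertslund.dk</span></strong></a>'
)


def find_homesite_url(html, element_id="_uscAncHomesite"):
    """Return the href of the <a> tag with the given id, or None if absent."""
    soup = BeautifulSoup(html, "html.parser")
    # find() returns a single Tag (or None), not a list, so it can be
    # indexed by attribute name directly.
    link = soup.find("a", id=element_id)
    if link is None:
        return None
    return link.get("href")


if __name__ == "__main__":
    print(find_homesite_url(SAMPLE_HTML))
```

The `TypeError` in the question follows from `findAll` returning a list of tags: a list has to be indexed by position first, for example `hjemmesidelink[0]['href']`, whereas `find` hands back one tag whose attributes can be read directly.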