Python Web Scraping; Beautiful Soup
<p>This was covered in this post: <a href="https://stackoverflow.com/questions/1391657/python-web-scraping-involving-html-tags-with-attributes">Python web scraping involving HTML tags with attributes</a></p>
<p>But I haven't been able to do something similar for this web page: <a href="http://www.expatistan.com/cost-of-living/comparison/melbourne/auckland" rel="nofollow noreferrer">http://www.expatistan.com/cost-of-living/comparison/melbourne/auckland</a>?</p>
<p>I'm trying to scrape the values of:</p>
<pre><code>&lt;td class="price city-2"&gt;
    NZ$15.62
    &lt;span style="white-space:nowrap;"&gt;(AU$12.10)&lt;/span&gt;
&lt;/td&gt;
&lt;td class="price city-1"&gt;
    AU$15.82
&lt;/td&gt;
</code></pre>
<p>Basically price city-2 and price city-1 (NZ$15.62 and AU$15.82)</p>
<p>I currently have:</p>
<pre><code>import urllib2

from BeautifulSoup import BeautifulSoup

url = "http://www.expatistan.com/cost-of-living/comparison/melbourne/auckland?"
page = urllib2.urlopen(url)
soup = BeautifulSoup(page)

price2 = soup.findAll('td', attrs = {'class':'price city-2'})
price1 = soup.findAll('td', attrs = {'class':'price city-1'})

for price in price2:
    print price

for price in price1:
    print price
</code></pre>
<p>Ideally, I'd also like to have comma-separated values for:</p>
<pre><code>&lt;th colspan="3" class="clickable"&gt;Food&lt;/th&gt;,
</code></pre>
<p>Extracting 'Food',</p>
<pre><code>&lt;td class="item-name"&gt;Daily menu in the business district&lt;/td&gt;
</code></pre>
<p>Extracting 'Daily menu in the business district'</p>
<p>and then the values for price city-2 and price city-1.</p>
<p>So the printout would be:</p>
<p>Food, Daily menu in the business district, NZ$15.62, AU$15.82</p>
<p>Thanks!</p>
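One possible way to get the printout described above is sketched below. It is not a definitive solution, and it deliberately swaps in Python 3's standard-library `html.parser` in place of Beautiful Soup so that it runs without third-party packages. The sample markup is assembled from the fragments quoted in the question; the surrounding `<table>`/`<tr>` layout is an assumption about the real page, and the names `SAMPLE_HTML`, `PriceTableParser` and `scrape_rows` are hypothetical. Text inside the nested `<span>` (the converted price) is skipped so that only `NZ$15.62` is kept for city-2.

```python
from html.parser import HTMLParser

# Sample markup modeled on the fragments quoted in the question. The
# enclosing <table>/<tr> structure is an assumption, not the real page.
SAMPLE_HTML = """
<table>
  <tr><th colspan="3" class="clickable">Food</th></tr>
  <tr>
    <td class="item-name">Daily menu in the business district</td>
    <td class="price city-2"> NZ$15.62 <span style="white-space:nowrap;">(AU$12.10)</span> </td>
    <td class="price city-1"> AU$15.82 </td>
  </tr>
</table>
"""


class PriceTableParser(HTMLParser):
    """Collect (category, item, city-2 price, city-1 price) tuples."""

    def __init__(self):
        super().__init__()
        self.rows = []       # completed rows
        self.category = ""   # text of the most recent <th class="clickable">
        self.current = {}    # cells collected for the row being read
        self.field = None    # name of the cell being captured, if any
        self.span_depth = 0  # > 0 while inside a <span> within a captured cell

    def handle_starttag(self, tag, attrs):
        # Split the class attribute so "price city-2" matches on both classes.
        classes = (dict(attrs).get("class") or "").split()
        if tag == "th" and "clickable" in classes:
            self.field = "category"
            self.category = ""
        elif tag == "td" and "item-name" in classes:
            self.field = "item"
        elif tag == "td" and "price" in classes and "city-2" in classes:
            self.field = "city2"
        elif tag == "td" and "price" in classes and "city-1" in classes:
            self.field = "city1"
        elif tag == "span" and self.field:
            self.span_depth += 1

    def handle_endtag(self, tag):
        if tag == "span" and self.span_depth:
            self.span_depth -= 1
        elif tag in ("td", "th"):
            self.field = None
        elif tag == "tr":
            # Emit a row only when all three data cells were seen.
            if all(k in self.current for k in ("item", "city2", "city1")):
                self.rows.append((self.category, self.current["item"],
                                  self.current["city2"], self.current["city1"]))
            self.current = {}

    def handle_data(self, data):
        text = data.strip()
        # Ignore whitespace, text outside captured cells, and <span> text.
        if not text or self.field is None or self.span_depth:
            return
        if self.field == "category":
            self.category += text
        else:
            self.current[self.field] = self.current.get(self.field, "") + text


def scrape_rows(html):
    """Parse an HTML string and return the collected row tuples."""
    parser = PriceTableParser()
    parser.feed(html)
    parser.close()
    return parser.rows


for row in scrape_rows(SAMPLE_HTML):
    print(", ".join(row))
# prints: Food, Daily menu in the business district, NZ$15.62, AU$15.82
```

If item names can themselves contain commas, writing the rows with the standard `csv` module would be safer than joining with `", "`. With Beautiful Soup 4 installed, the same cells could presumably be selected with CSS selectors such as `soup.select('td.price.city-2')`, which match on each class separately.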