
How to scrape the 'More' portion of the Quora profile page?
<p>To build a list of all topics on Quora, I decided to start by scraping the profile page of a user who follows many topics, e.g. <a href="http://www.quora.com/Charlie-Cheever/topics" rel="nofollow">http://www.quora.com/Charlie-Cheever/topics</a>. I can scrape the topics from this page, but now I need to scrape the topics that are loaded via Ajax when you click the 'More' button at the bottom of the page. I'm trying to find the JavaScript function that is executed when the 'More' button is clicked, but no luck yet. Here are three snippets from the HTML page which may be relevant:</p>
<pre><code>&lt;div class=\"pager_next action_button\" id=\"__w2_mEaYKRZ_more\"&gt;More&lt;/div&gt;
{\"more_button\": \"mEaYKRZ\"} \"dPs6zd5\": {\"more_button\": \"more_button\"}
new(PagedListMoreButton)(\"mEaYKRZ\",\"more_button\",{},\"live:ld_c5OMje_9424:cls:a.view.paged_list:PagedListMoreButton:/TW7WZFZNft72w\",{})
</code></pre>
<p>Does anyone know the name of the JavaScript function that runs when the 'More' button is clicked? Any help would be appreciated :)</p>
<p>The Python script (written following <a href="http://dev.lethain.com/an-introduction-to-compassionate-screenscraping/" rel="nofollow">this</a> tutorial) currently looks like this:</p>
<pre><code>#!/usr/bin/python
# Just prints the topics followed by Charlie Cheever from the first page.
import httplib2,time,re
from BeautifulSoup import BeautifulSoup

SCRAPING_CONN = httplib2.Http(".cache")

def fetch(url,method="GET"):
    # Returns a (response, content) tuple.
    return SCRAPING_CONN.request(url,method)

def extractTopic(s):
    d = {}
    d['url'] = "http://www.quora.com" + s['href']
    d['topicName'] = s.findChildren()[0].string
    return d

def fetch_stories():
    page = fetch(u"http://www.quora.com/Charlie-Cheever/topics")
    soup = BeautifulSoup(page[1])
    stories = soup.findAll('a', 'topic_name')
    topics = [extractTopic(s) for s in stories]
    for t in topics:
        print u"%s, %s\n" % (t['topicName'],t['url'])

fetch_stories()
</code></pre>
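<p>As a possible starting point for narrowing this down, here is a minimal sketch that pulls the arguments of the <code>PagedListMoreButton</code> constructor call out of the page source using only the standard <code>re</code> module. The helper name <code>find_more_buttons</code> and the dictionary keys are hypothetical, and the <code>__w2_&lt;id&gt;_more</code> element id pattern is an assumption inferred from the single snippet above; this only locates the widget data and does not reveal which request the button sends.</p>

```python
import re

# Matches: new(PagedListMoreButton)("ID","NAME",{},"LIVE_KEY",{})
# The quotes may be backslash-escaped when the call sits inside a JSON string,
# as in the snippets quoted in the question.
Q = r'\\?"'
MORE_BUTTON_RE = re.compile(
    r'new\(PagedListMoreButton\)\('
    + Q + r'(\w+)' + Q + r','
    + Q + r'(\w+)' + Q + r',\{\},'
    + Q + r'([^"\\]+)' + Q
)

def find_more_buttons(html):
    """Return one dict per PagedListMoreButton call found in the page source."""
    buttons = []
    for widget_id, name, live_key in MORE_BUTTON_RE.findall(html):
        buttons.append({
            'widget_id': widget_id,
            # Assumed pattern, inferred from id="__w2_mEaYKRZ_more" in the question.
            'element_id': '__w2_%s_more' % widget_id,
            'name': name,
            'live_key': live_key,
        })
    return buttons

# The constructor call exactly as quoted in the question (escaped quotes kept).
sample = (r'new(PagedListMoreButton)(\"mEaYKRZ\",\"more_button\",{},'
          r'\"live:ld_c5OMje_9424:cls:a.view.paged_list:PagedListMoreButton:/TW7WZFZNft72w\",{})')

buttons = find_more_buttons(sample)
```

<p>In the script above, this could be applied to <code>page[1]</code> before it is handed to BeautifulSoup, so that the widget id and the "live" key are available when inspecting the request the browser makes after a click.</p>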
 
