StackOverflow2013


plurals

POHow to scrape the 'More' portion of the Quora profile page?
primarykey
Id
7614478
data
AcceptedAnswerId
0
AnswerCount
1
ClosedDate
CommentCount
1
CommunityOwnedDate
CreationDate
2011-09-30T18:23:38.963
FavoriteCount
1
LastActivityDate
2011-09-30T23:43:59.690
LastEditDate
2011-09-30T18:47:16.903
LastEditorUserId
730403
OwnerUserId
730403
ParentId
0
PostTypeId
1
Score
1
ViewCount
546
LastEditorDisplayName
text
Body
<p>To build a list of all topics on Quora, I decided to start by scraping a profile page that follows many topics, e.g. <a href="http://www.quora.com/Charlie-Cheever/topics" rel="nofollow">http://www.quora.com/Charlie-Cheever/topics</a>. I have scraped the topics from this page, but now I need the topics that are loaded via Ajax when you click the 'More' button at the bottom of the page. I have been trying to find the JavaScript function that runs when 'More' is clicked, with no luck so far. Here are some snippets from the HTML page that may be relevant:</p>
<pre><code><div class=\"pager_next action_button\" id=\"__w2_mEaYKRZ_more\">More</div>
{\"more_button\": \"mEaYKRZ\"}
\"dPs6zd5\": {\"more_button\": \"more_button\"}
new(PagedListMoreButton)(\"mEaYKRZ\",\"more_button\",{},\"live:ld_c5OMje_9424:cls:a.view.paged_list:PagedListMoreButton:/TW7WZFZNft72w\",{})
</code></pre>
<p>Does anyone know the name of the JavaScript function that is executed when the 'More' button is clicked? Any help would be appreciated :)</p>
<p>The Python script (written following <a href="http://dev.lethain.com/an-introduction-to-compassionate-screenscraping/" rel="nofollow">this</a> tutorial) currently looks like this:</p>
<pre><code>#!/usr/bin/python
# Prints the topics followed by Charlie Cheever, from the first page only.
import httplib2
from BeautifulSoup import BeautifulSoup

# Reuse one connection object; responses are cached in the ".cache" directory.
SCRAPING_CONN = httplib2.Http(".cache")

def fetch(url, method="GET"):
    # Returns a (response, content) tuple.
    return SCRAPING_CONN.request(url, method)

def extractTopic(s):
    # Build a dict from one topic link tag.
    d = {}
    d['url'] = "http://www.quora.com" + s['href']
    d['topicName'] = s.findChildren()[0].string
    return d

def fetch_stories():
    page = fetch(u"http://www.quora.com/Charlie-Cheever/topics")
    soup = BeautifulSoup(page[1])
    stories = soup.findAll('a', 'topic_name')
    topics = [extractTopic(s) for s in stories]
    for t in topics:
        print u"%s, %s\n" % (t['topicName'], t['url'])
    return topics

stories = fetch_stories()
</code></pre>
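For readers who want to try the extraction step without the Python 2 dependencies, here is a minimal sketch of the same idea using only the Python 3 standard library. It does not address the Ajax 'More' request; it only mirrors what `extractTopic` does on HTML that has already been fetched. The names `TopicLinkParser`, `extract_topics`, and `SAMPLE_HTML` are hypothetical, and the sample markup is an assumption inferred from the question's code (a link with class `topic_name` whose first child element holds the topic name), not a copy of Quora's actual page.

```python
from html.parser import HTMLParser

BASE_URL = "http://www.quora.com"


class TopicLinkParser(HTMLParser):
    """Collects {'url', 'topicName'} dicts for <a> tags whose class includes 'topic_name'."""

    def __init__(self):
        super().__init__()
        self.topics = []
        self._current = None  # topic being built while inside a matching <a>

    def handle_starttag(self, tag, attrs):
        if self._current is not None:
            return  # already inside a matching link; nested tags only contribute text
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "a" and "topic_name" in classes:
            self._current = {
                "url": BASE_URL + (attrs.get("href") or ""),
                "topicName": "",
            }

    def handle_data(self, data):
        if self._current is not None:
            self._current["topicName"] += data

    def handle_endtag(self, tag):
        if self._current is not None and tag == "a":
            self._current["topicName"] = self._current["topicName"].strip()
            self.topics.append(self._current)
            self._current = None


def extract_topics(html):
    """Return the topic dicts found in an HTML string."""
    parser = TopicLinkParser()
    parser.feed(html)
    parser.close()
    return parser.topics


# Hypothetical markup, shaped like what the question's script expects.
SAMPLE_HTML = (
    '<div>'
    '<a class="topic_name" href="/Python"><span>Python</span></a>'
    '<a class="other" href="/x">skip</a>'
    '<a class="topic_name light" href="/Web-Scraping"><span>Web Scraping</span></a>'
    '</div>'
)

if __name__ == "__main__":
    for topic in extract_topics(SAMPLE_HTML):
        print("%s, %s" % (topic["topicName"], topic["url"]))
```

Because the parser only needs a string, the same function could be applied to whatever HTML fragment the 'More' request turns out to return, assuming that fragment uses the same link markup.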
Tags
<python><ajax><screen-scraping><web-scraping>
Title
How to scrape the 'More' portion of the Quora profile page?
singulars
PostAcceptedAnswerId
1. This table or related slice is empty.
PostParentId
1. This table or related slice is empty.
PostTypePostTypeId
1. PTQuestion
UserLastEditorUserId
1. USArman
UserOwnerUserId
1. USArman
plurals
PostLinksPostIdRelatedPostId
1. This table or related slice is empty.
PostLinksRelatedPostIdPostId
1. This table or related slice is empty.
PostsAcceptedAnswerId
1. This table or related slice is empty.
PostsParentIdCreationDate
1. PO
  singulars
  PostTypePostTypeId
  PTAnswer
VotesPostIdCreationDate
1. VO
  singulars
  PostPostId
  POHow to scrape the 'More' portion of the Quora profile page?
  UserUserId
  USom-nom-nom
  VoteTypeVoteTypeId
  VTFavorite
2. VO
  singulars
  PostPostId
  POHow to scrape the 'More' portion of the Quora profile page?
  UserUserId
  This table or related slice is empty.
  VoteTypeVoteTypeId
  VTUpMod
CommentsPostId
1. COHi Arman, I'm working on something similar. Did you find a solution?
  singulars
  PostPostId
  POHow to scrape the 'More' portion of the Quora profile page?
  UserUserId
  USProgramming Noob
