
<p>CrawlSpider inherits from BaseSpider; it just adds rules for extracting and following links. If these rules are not flexible enough for you, use BaseSpider:</p> <pre><code>import urlparse

from scrapy import log
from scrapy.http import Request
from scrapy.selector import HtmlXPathSelector
from scrapy.spider import BaseSpider


class USpider(BaseSpider):
    """My spider."""
    start_urls = ['http://www.amazon.com/s/?url=search-alias%3Dapparel&amp;sort=relevance-fs-browse-rank']
    allowed_domains = ['amazon.com']

    def parse(self, response):
        '''Parse the main category search page and extract subcategory search links.'''
        self.log('Downloaded category search page.', log.DEBUG)
        if response.meta['depth'] &gt; 5:
            self.log('Categories depth limit reached (recursive links?). '
                     'Stopping further following.', log.WARNING)
            return
        hxs = HtmlXPathSelector(response)
        subcategories = hxs.select("//div[@id='refinements']"
                                   "/*[starts-with(.,'Department')]"
                                   "/following-sibling::ul[1]"
                                   "/li/a[span[@class='refinementLink']]/@href").extract()
        for subcategory in subcategories:
            # Resolve the relative href against the current page URL.
            subcategorySearchLink = urlparse.urljoin(response.url, subcategory)
            yield Request(subcategorySearchLink, callback=self.parseSubcategory)

    def parseSubcategory(self, response):
        '''Parse a subcategory search page and extract item links.'''
        hxs = HtmlXPathSelector(response)
        for itemLink in hxs.select('//a[@class="title"]/@href').extract():
            itemLink = urlparse.urljoin(response.url, itemLink)
            self.log('Requesting item page: ' + itemLink, log.DEBUG)
            yield Request(itemLink, callback=self.parseItem)
        try:
            nextPageLink = hxs.select("//a[@id='pagnNextLink']/@href").extract()[0]
        except IndexError:
            # No "next" link on the page: the whole category has been parsed.
            self.log('Whole category parsed: ' + response.url, log.DEBUG)
        else:
            nextPageLink = urlparse.urljoin(response.url, nextPageLink)
            self.log('\nGoing to next search page: ' + nextPageLink + '\n', log.DEBUG)
            yield Request(nextPageLink, callback=self.parseSubcategory)

    def parseItem(self, response):
        '''Parse an item page and extract product info.'''
        hxs = HtmlXPathSelector(response)
        # UItem and extractText are defined elsewhere in the project.
        item = UItem()
        item['brand'] = self.extractText("//div[@class='buying']/span[1]/a[1]", hxs)
        item['title'] = self.extractText("//span[@id='btAsinTitle']", hxs)
        ...
</code></pre> <p>And if even BaseSpider's start_urls is not flexible enough for you, override the <a href="http://doc.scrapy.org/topics/spiders.html#scrapy.spider.BaseSpider.start_requests" rel="noreferrer">start_requests</a> method.</p>
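The start_requests idea can be sketched as follows: generate the initial URLs programmatically instead of hard-coding start_urls. This is a minimal illustration, not code from the answer: the keyword list and the build_search_urls helper are made up, the spider snippet in the comment reuses Scrapy's BaseSpider and Request as above, and it uses Python 3's urllib.parse while the answer targets Python 2 Scrapy.

```python
from urllib.parse import urlencode  # Python 3; the answer's code is Python 2

# Hypothetical helper: build one Amazon search URL per keyword.
def build_search_urls(keywords, alias='apparel'):
    base = 'http://www.amazon.com/s/?'
    return [base + urlencode({'url': 'search-alias=' + alias,
                              'field-keywords': kw})
            for kw in keywords]

# In the spider, start_requests would then yield one Request per generated
# URL (BaseSpider and Request come from Scrapy, as in the answer above):
#
#     class USpider(BaseSpider):
#         name = 'uspider'
#
#         def start_requests(self):
#             for url in build_search_urls(['jeans', 'shirts']):
#                 yield Request(url, callback=self.parse)

print(build_search_urls(['jeans'])[0])
```

Because start_requests is a generator, the URL list can come from anywhere: a database, a file, or a previous crawl.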