StackOverflow2013

Note that there are some explanatory texts on larger screens.

plurals

PO
primarykey
Id
103554
data
AcceptedAnswerId
0
AnswerCount
0
ClosedDate
CommentCount
6
CommunityOwnedDate
CreationDate
2008-09-19T16:40:07.803
FavoriteCount
0
LastActivityDate
2017-09-10T10:58:48.257
LastEditDate
2017-09-10T10:58:48.257
LastEditorUserId
1624960
OwnerUserId
16448
ParentId
26947
PostTypeId
2
Score
48
ViewCount
0
LastEditorDisplayName
tyshock
text
Body
Scraping generally encompasses 3 steps: <ul> <li>first you GET or POST your request to a specified URL </li> <li>next you receive the html that is returned as the response</li> <li>finally you parse out of that html the text you'd like to scrape.</li> </ul> To accomplish steps 1 and 2, below is a simple php class which uses Curl to fetch webpages using either GET or POST. After you get the HTML back, you just use Regular Expressions to accomplish step 3 by parsing out the text you'd like to scrape. For regular expressions, my favorite tutorial site is the following: <a href="http://www.regular-expressions.info/" rel="nofollow noreferrer">Regular Expressions Tutorial</a> My Favorite program for working with RegExs is <a href="http://www.regexbuddy.com/" rel="nofollow noreferrer">Regex Buddy</a>. I would advise you to try the demo of that product even if you have no intention of buying it. It is an invaluable tool and will even generate code for your regexs you make in your language of choice (including php). Usage: <pre><code> $curl = new Curl(); $html = $curl->get("<a href="http://www.google.com" rel="nofollow noreferrer">http://www.google.com</a>"); // now, do your regex work against $html </pre></code> PHP Class: <pre><code> <?php class Curl { public $cookieJar = ""; public function __construct($cookieJarFile = 'cookies.txt') { $this->cookieJar = $cookieJarFile; } function setup() { $header = array(); $header[0] = "Accept: text/xml,application/xml,application/xhtml+xml,"; $header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5"; $header[] = "Cache-Control: max-age=0"; $header[] = "Connection: keep-alive"; $header[] = "Keep-Alive: 300"; $header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7"; $header[] = "Accept-Language: en-us,en;q=0.5"; $header[] = "Pragma: "; // browsers keep this blank. curl_setopt($this->curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.8.1.7) Gecko/20070914 Firefox/2.0.0.7'); curl_setopt($this->curl, CURLOPT_HTTPHEADER, $header); curl_setopt($this->curl,CURLOPT_COOKIEJAR, $this->cookieJar); curl_setopt($this->curl,CURLOPT_COOKIEFILE, $this->cookieJar); curl_setopt($this->curl,CURLOPT_AUTOREFERER, true); curl_setopt($this->curl,CURLOPT_FOLLOWLOCATION, true); curl_setopt($this->curl,CURLOPT_RETURNTRANSFER, true); } function get($url) { $this->curl = curl_init($url); $this->setup(); return $this->request(); } function getAll($reg,$str) { preg_match_all($reg,$str,$matches); return $matches[1]; } function postForm($url, $fields, $referer='') { $this->curl = curl_init($url); $this->setup(); curl_setopt($this->curl, CURLOPT_URL, $url); curl_setopt($this->curl, CURLOPT_POST, 1); curl_setopt($this->curl, CURLOPT_REFERER, $referer); curl_setopt($this->curl, CURLOPT_POSTFIELDS, $fields); return $this->request(); } function getInfo($info) { $info = ($info == 'lasturl') ? curl_getinfo($this->curl, CURLINFO_EFFECTIVE_URL) : curl_getinfo($this->curl, $info); return $info; } function request() { return curl_exec($this->curl); } } ?> </code> </pre>
Tags
Title
singulars
PostAcceptedAnswerId
1. This table or related slice is empty.
PostParentId
1. POHow to implement a web scraper in PHP?
 singulars
 PostTypePostTypeId
 PTQuestion
PostTypePostTypeId
1. PTAnswer
UserLastEditorUserId
1. USZevi Sternlicht
UserOwnerUserId
1. UStyshock
plurals
PostLinksPostIdRelatedPostId
1. This table or related slice is empty.
PostLinksRelatedPostIdPostId
1. This table or related slice is empty.
PostsAcceptedAnswerId
1. This table or related slice is empty.
PostsParentIdCreationDate
1. This table or related slice is empty.
VotesPostIdCreationDate
1. VO
 singulars
 PostPostId
 PO
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
2. VO
 singulars
 PostPostId
 PO
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
3. VO
 singulars
 PostPostId
 PO
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
CommentsPostId

Querying!

Guidance

A row detail

Detail views are divided into sections. All the information in the data section comes from columns in the selected row. The other sections display data from other, related rows.

Related data can be related in a to-one or a to-many fashion. Captions of data related in a to-many fashion link to a list view showing a filtered view of the table.

Try moving around until you find a non-empty to-many entry and click on the label to get to one. You can move back to the root by clicking on the database name in the header.