First, I'll just share some verbiage I noticed on the Netflix site under **Limitations on Use**:

*Any unauthorized use of the Netflix service or its contents will terminate the limited license granted by us and will result in the cancellation of your membership.*

In short, I'm not sure what your script does after this, but some activities could jeopardize your relationship with Netflix. I did not read the whole ToS, but you should.

That said, there are plenty of legitimate reasons to scrape HTML information, and I do it all the time. So my first bet with this specific problem is that you're using the wrong detection string... Just send a bogus email/password and print the response... Perhaps you made an assumption about what it looks like when you log in with a browser, but the browser is sending info that gets further into the process.

I wish I could offer specifics on what to do next, but I would rather not risk my relationship with 'flix to give a better answer to the question... so I'll just share a few observations I gleaned from scraping oodles of other websites that made it kind of hard to use web robots...

First, log in to your account with Firefox, and be sure to have the [Live HTTP Headers](https://addons.mozilla.org/en-us/firefox/addon/live-http-headers/) add-on enabled and in capture mode... what you will see when you log in live is *invaluable* to your scripting efforts... for instance, this was from a session while I logged in...

```
POST /Login HTTP/1.1
Host: signup.netflix.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.16) Gecko/20110319 Firefox/3.6.16
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Connection: keep-alive
Referer: https://signup.netflix.com/Login?country=1&rdirfdc=true
--->Insert lots of private stuff here
Content-Type: application/x-www-form-urlencoded
Content-Length: 168

authURL=sOmELoNgTeXtStRiNg&nextpage=&SubmitButton=true&country=1&email=EmAiLAdDrEsS%40sOmEMaIlProvider.com&password=UnEnCoDeDpAsSwOrD
```

Pay particular attention to the stuff below the "Content-Length" field and *all* the parameters that come after it.

Now log back out, and pull up the login page again... chances are, you will see some of those fields hidden as state information in `<input type="hidden">` tags... some web apps keep state by feeding you fields and then using javascript to resubmit that same information in your login POST. I usually use lxml to parse the pages I receive... if you try it, keep in mind that lxml prefers utf-8, so I include code that automagically converts when it sees other encodings...

```python
import chardet
from urllib2 import urlopen

# req and data are the Request and urlencoded form body built for the login POST
response = urlopen(req, data)
# info is from the HTTP headers... like server version
info = response.info().dict
# page is the HTML response
page = response.read()
encoding = chardet.detect(page)['encoding']
if encoding != 'utf-8':
    page = page.decode(encoding, 'replace').encode('utf-8')
```

BTW, [Michael Foord](http://www.voidspace.org.uk/python/articles/urllib2.shtml) has a very good reference on urllib2 and many of the assorted issues.
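To make that hidden-field point concrete, here is a minimal sketch of how you might collect the `<input type="hidden">` values (such as `authURL`) from the login page with lxml and fold them into your form data before POSTing. The URL, field names, and credentials are placeholders taken from the capture above, not verified against what Netflix actually serves, so double-check them against your own Live HTTP Headers session.

```python
# Rough sketch (Python 2 / urllib2, to match the rest of this answer)
import urllib
import urllib2
from lxml import html

login_url = 'https://signup.netflix.com/Login'   # placeholder; use the page your browser loads

# Fetch and parse the login page
page = urllib2.urlopen(login_url).read()
tree = html.fromstring(page)

# Seed the form data with every hidden input, so state fields like authURL ride along
form_data = {}
for field in tree.xpath('//form//input[@type="hidden"]'):
    name = field.get('name')
    if name:
        form_data[name] = field.get('value', '')

# ...then add the fields you control
form_data['email'] = 'you@example.com'
form_data['password'] = 'not-my-real-password'

data = urllib.urlencode(form_data)   # this is the body you hand to urlopen()
```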
So, **in summary**:

1. Using your existing script, dump the results from a known bogus login to be sure you're parsing for the right info... I'm *pretty sure* you made a bad assumption above.
2. It also looks like you aren't submitting enough parameters in the POST. Experience tells me you need to set `authURL` in addition to `email` and `password`... if possible, I try to mimic what the browser sends (there is a sketch of this after the list).
3. Occasionally, it matters whether you have set your user-agent string and referring webpage. I always set these when I scrape so I don't waste cycles debugging.
4. When all else fails, look at the info stored in the cookies they send.
5. Sometimes websites base64-encode form submission data. I don't know whether Netflix does.
6. Some websites are very protective of their intellectual property, and programmatically reading/archiving the information is considered a theft of their IP. Again, read the ToS... I don't know how Netflix views what you want to do.
7. I am providing this for informational purposes and under no circumstances endorse or condone the violation of Netflix's terms of service... nor can I confirm whether your proposed activity would... I'm just saying it might :-). Talk to a lawyer who specializes in e-discovery if you need an official ruling. Feet first. Don't eat yellow snow... etc...
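Points 2–4 tend to interact, so here is one way they could be wired together with urllib2: a cookie-aware opener, a browser-like `User-Agent` and `Referer`, and a POST body that includes `authURL`. Treat it as a sketch under the assumption that the capture above is representative; the header values, URL, and form fields are copied from that capture and from the hidden-field sketch, not confirmed against Netflix's current login flow.

```python
# Sketch only: cookie jar + browser-like headers + the POST shown in the capture
import cookielib
import urllib
import urllib2

# Keep cookies between requests, since session state often lives there (point 4)
cookie_jar = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookie_jar))

headers = {
    # Mimic the browser capture (point 3): user-agent string and referring page
    'User-Agent': ('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.16) '
                   'Gecko/20110319 Firefox/3.6.16'),
    'Referer': 'https://signup.netflix.com/Login?country=1&rdirfdc=true',
    'Content-Type': 'application/x-www-form-urlencoded',
}

# form_data would normally come from the hidden-field sketch above (point 2);
# the values here are placeholders
form_data = {
    'authURL': 'value-scraped-from-the-hidden-input',
    'email': 'you@example.com',
    'password': 'not-my-real-password',
}

req = urllib2.Request('https://signup.netflix.com/Login',
                      urllib.urlencode(form_data), headers)
response = opener.open(req)
# Dump the response so you can compare a bogus login against a real one (point 1)
print response.read()
```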