<p>The "show all" link is a call to a JavaScript function on the web page, and it is called like this:</p>
<pre><code>__doPostBack('ctl00$ContentPlaceHolder1$PGN01','')</code></pre>
<p>My primary browser is Firefox, and it has excellent add-ons you can use. I used the "Web Developer" add-on.</p>
<p>In the tab that has the page, do this: hover the mouse cursor over the "show all" link. Then right-click and choose "Web Developer" &gt; "Information" &gt; "View Javascript Alt+Shift+J". Firefox will then open a new tab with all the JavaScript that the page is using.</p>
<p>In that tab, a quick search finds the __doPostBack function, which is coded like this:</p>
<pre><code>function __doPostBack(eventTarget, eventArgument) {
    if (!theForm.onsubmit || (theForm.onsubmit() != false)) {
        theForm.__EVENTTARGET.value = eventTarget;
        theForm.__EVENTARGUMENT.value = eventArgument;
        theForm.submit();
    }
}</code></pre>
<p>If the test in the if statement evaluates to true, and we substitute the arguments from the call to the function, that JavaScript function can be reduced to the code below:</p>
<pre><code>theForm.__EVENTTARGET.value = 'ctl00$ContentPlaceHolder1$PGN01';
theForm.__EVENTARGUMENT.value = '';
theForm.submit();</code></pre>
<p>Now we need to know what 'theForm' is.</p>
<p>The code below was found just above the __doPostBack function:</p>
<pre><code>var theForm = document.forms['aspnetForm'];
if (!theForm) {
    theForm = document.aspnetForm;
}</code></pre>
<p>From that we now know that 'theForm' is a reference to a 'form' HTML tag that has the id attribute 'aspnetForm' (id='aspnetForm'), which means that in the HTML document we should look for something that begins like this:</p>
<pre><code>&lt;form id="aspnetForm" </code></pre>
<p>To know exactly how that tag is written, I will use the Firefox browser to look at the modified HTML of the page you are interested in. <strong>Here I am using the Firebug add-on.</strong></p>
<p>The start tag of the HTML form looks like this:</p>
<pre><code>&lt;form id="aspnetForm" onsubmit="javascript:return WebForm_OnSubmit();" action="/Shoes-All.aspx" method="post" name="aspnetForm"&gt;</code></pre>
<p><strong>So the action is using the same HTML document!</strong></p>
<p>Let us see what the WebForm_OnSubmit() function is doing:</p>
<pre><code>function WebForm_OnSubmit() {
    if (typeof(ValidatorOnSubmit) == "function" &amp;&amp; ValidatorOnSubmit() == false) return false;
    return true;
}</code></pre>
<p>Let us have a look at the ValidatorOnSubmit() function and an interesting variable it is using:</p>
<pre><code>var Page_ValidationActive = false;
// ...
function ValidatorOnSubmit() {
    if (Page_ValidationActive) {
        return ValidatorCommonOnSubmit();
    } else {
        return true;
    }
}</code></pre>
<p>Because Page_ValidationActive is false, the function returns 'true' here, so we can now rewrite the form start tag to:</p>
<pre><code>&lt;form id="aspnetForm" onsubmit="javascript:return true;" action="/Shoes-All.aspx" method="post" name="aspnetForm"&gt;</code></pre>
<p><strong>From that we can conclude that the web page needs to be reloaded from the web server.</strong></p>
<p>Now we need to know which variables and values are used in the HTTP POST request.</p>
<p>To do that I used Firefox again with JavaScript switched on. <strong>Also, I am using the awesome Firebug add-on.</strong> In the form element, in the first div tag, we find this:</p>
<pre><code>&lt;div&gt;
&lt;input id="__EVENTTARGET" type="hidden" value="" name="__EVENTTARGET"&gt;
&lt;input id="__EVENTARGUMENT" type="hidden" value="" name="__EVENTARGUMENT"&gt;
&lt;input id="__LASTFOCUS" type="hidden" value="" name="__LASTFOCUS"&gt;
&lt;input id="__VIEWSTATE" type="hidden" value="(lots of data here)" name="__VIEWSTATE"&gt;
&lt;/div&gt;</code></pre>
<p>If you remember our previous reduced JavaScript code, it modifies the value attribute of two input tags in the form. Once that function has run, the HTML really is:</p>
<pre><code>&lt;div&gt;
&lt;input id="__EVENTTARGET" type="hidden" value="ctl00$ContentPlaceHolder1$PGN01" name="__EVENTTARGET"&gt;
&lt;input id="__EVENTARGUMENT" type="hidden" value="" name="__EVENTARGUMENT"&gt;
&lt;input id="__LASTFOCUS" type="hidden" value="" name="__LASTFOCUS"&gt;
&lt;input id="__VIEWSTATE" type="hidden" value="(lots of binary data here)" name="__VIEWSTATE"&gt;
&lt;/div&gt;</code></pre>
<p>You will of course need to make a verbatim copy of the content of the value attribute of the input tag with the id "__VIEWSTATE".</p>
<p>FYI, Firefox says that the XPath of the __VIEWSTATE input tag is:</p>
<pre><code>//*[@id="__VIEWSTATE"]</code></pre>
<p>... and that its CSS selector is:</p>
<pre><code>form#aspnetForm div input#__VIEWSTATE</code></pre>
<p>After having downloaded all items from the web server, you will then need to parse the HTML page.</p>
<p>The interesting content is embedded deep inside the form. It is an HTML table.</p>
<p>The relevant XPath for the table is:</p>
<pre><code>//*[@id="ctl00_ContentPlaceHolder1_dlList"]</code></pre>
<p>... and the CSS selector is:</p>
<pre><code>table#ctl00_ContentPlaceHolder1_dlList</code></pre>
<p>It contains a tbody, tr, td, div, and another table (more bad design: someone needs to learn about table-less design. An HTML table will look horrible on smartphones and tablets.)</p>
<p>I think that at this point you should put the Beautiful Soup 4 parser to work.</p>
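<p>The postback described above can be sketched in Python. This is only a minimal sketch using the standard library, not a tested scraper for the real site: the class HiddenInputCollector, the function build_postback_body, and the sample __VIEWSTATE value are hypothetical names and placeholder data made up for illustration. It collects the hidden input fields from the downloaded page, fills in __EVENTTARGET and __EVENTARGUMENT the way __doPostBack does, and URL-encodes the result as a POST body.</p>

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class HiddenInputCollector(HTMLParser):
    """Collect name/value pairs of every <input type="hidden"> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        is_hidden = (attributes.get("type") or "").lower() == "hidden"
        if is_hidden and attributes.get("name"):
            # A missing or valueless "value" attribute becomes an empty string.
            self.fields[attributes["name"]] = attributes.get("value") or ""


def build_postback_body(page_html, event_target, event_argument=""):
    """Mimic __doPostBack: copy the hidden fields, set the two event fields, encode."""
    collector = HiddenInputCollector()
    collector.feed(page_html)
    fields = dict(collector.fields)
    fields["__EVENTTARGET"] = event_target
    fields["__EVENTARGUMENT"] = event_argument
    return urlencode(fields)


# Placeholder page: the __VIEWSTATE value is invented. The real one must be
# copied verbatim from the page downloaded from the server.
SAMPLE_HTML = """
<form id="aspnetForm" action="/Shoes-All.aspx" method="post" name="aspnetForm">
<div>
<input id="__EVENTTARGET" type="hidden" value="" name="__EVENTTARGET">
<input id="__EVENTARGUMENT" type="hidden" value="" name="__EVENTARGUMENT">
<input id="__LASTFOCUS" type="hidden" value="" name="__LASTFOCUS">
<input id="__VIEWSTATE" type="hidden" value="dDwtMTIzNDU2Nzg5Ozs+" name="__VIEWSTATE">
</div>
</form>
"""

body = build_postback_body(SAMPLE_HTML, "ctl00$ContentPlaceHolder1$PGN01")
print(body)
```

<p>The resulting string would be sent as the body of an HTTP POST to the form's action (/Shoes-All.aspx) with the content type application/x-www-form-urlencoded. Whether the server also expects cookies or further hidden fields is something I have not verified, so check the real request in the browser first.</p>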
 
