<p><strong>Overview:</strong></p>
<p>All screen scraping starts with a manual review of the page you want to extract resources from. With AJAX you usually need to analyze a bit more than just the HTML.</p>
<p>AJAX means that the value you want is not in the initial HTML document you requested. Instead, JavaScript is executed that asks the server for the extra information you want.</p>
<p>You can therefore usually analyze the JavaScript, see which request it makes, and call that URL directly from the start.</p>
<hr>
<p><strong>Example:</strong></p>
<p>Assume the page you want to scrape contains the following script:</p>
<pre><code>&lt;script type="text/javascript"&gt;
function ajaxFunction()
{
  var xmlHttp;
  try
  {
    // Firefox, Opera 8.0+, Safari
    xmlHttp=new XMLHttpRequest();
  }
  catch (e)
  {
    // Internet Explorer
    try
    {
      xmlHttp=new ActiveXObject("Msxml2.XMLHTTP");
    }
    catch (e)
    {
      try
      {
        xmlHttp=new ActiveXObject("Microsoft.XMLHTTP");
      }
      catch (e)
      {
        alert("Your browser does not support AJAX!");
        return false;
      }
    }
  }
  xmlHttp.onreadystatechange=function()
  {
    if(xmlHttp.readyState==4)
    {
      document.myForm.time.value=xmlHttp.responseText;
    }
  }
  xmlHttp.open("GET","time.asp",true);
  xmlHttp.send(null);
}
&lt;/script&gt;
</code></pre>
<p>Then all you need to do is make an HTTP request to time.asp on the same server. <a href="http://www.w3schools.com/Ajax/ajax_server.asp" rel="noreferrer">Example from w3schools</a>.</p>
<hr>
<p><strong>Advanced scraping with C++:</strong></p>
<p>For complex usage, if you're using C++, you could also consider using the Firefox JavaScript engine <a href="http://www.mozilla.org/js/spidermonkey/" rel="noreferrer">SpiderMonkey</a> to execute the JavaScript on a page.</p>
<p><strong>Advanced scraping with Java:</strong></p>
<p>For complex usage, if you're using Java, you could also consider using <a href="http://www.mozilla.org/rhino/" rel="noreferrer">Rhino</a>, Mozilla's JavaScript engine written in Java.</p>
<p><strong>Advanced scraping with .NET:</strong></p>
<p>For complex usage, if you're using .NET, you could also consider using the Microsoft.Vsa assembly, which has since been replaced by ICodeCompiler/CodeDOM.</p>
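<p>The simple case from the example (find the URL the script requests, then call it yourself) might be sketched in Python roughly as follows. This is only an illustration under assumptions: the names <code>XHR_OPEN</code>, <code>find_ajax_urls</code> and <code>fetch</code>, and the <code>www.example.com</code> page address, are made up for this sketch, and the regular expression only catches <code>open()</code> calls whose URL is a plain string literal. Scripts that build the URL dynamically still need manual analysis.</p>

```python
import re
from urllib.parse import urljoin
from urllib.request import urlopen

# Matches calls such as xmlHttp.open("GET","time.asp",true) and captures
# the HTTP method and the URL. Only literal string arguments are handled.
XHR_OPEN = re.compile(
    r"""\.open\(\s*["'](GET|POST)["']\s*,\s*["']([^"']+)["']""",
    re.IGNORECASE,
)

# Trimmed-down copy of the script from the example above.
SAMPLE_SCRIPT = """
<script type="text/javascript">
function ajaxFunction() {
  var xmlHttp = new XMLHttpRequest();
  xmlHttp.open("GET","time.asp",true);
  xmlHttp.send(null);
}
</script>
"""


def find_ajax_urls(page_url, html):
    """Return (method, absolute_url) pairs for each literal XHR open() call."""
    return [
        (match.group(1).upper(), urljoin(page_url, match.group(2)))
        for match in XHR_OPEN.finditer(html)
    ]


def fetch(url, timeout=10):
    """Request the AJAX endpoint directly and return the response body as text."""
    with urlopen(url, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


if __name__ == "__main__":
    # Hypothetical page address; relative URLs are resolved against it.
    for method, url in find_ajax_urls(
        "http://www.example.com/ajax/page.html", SAMPLE_SCRIPT
    ):
        print(method, url)
```

<p>Resolving the relative URL against the page address mirrors what the browser does, so <code>time.asp</code> ends up in the same directory as the page. Calling <code>fetch</code> on the result would then return what the script would have received in <code>responseText</code>, assuming the server does not require cookies or special headers.</p>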