CasperJS scrape giving empty webpage response
<p>I have written a script for CasperJS which is supposed to read a JSON file containing a list of sites, with details on how to log in to each, and then run the search pages for each keyword.</p>
<p>I have tested my code with one site, which worked. For the second site, however, it logs in, but when it attempts to go to a search page it returns an empty page.</p>
<p>This is my Casper setup:</p>
<pre><code>casper = require('casper').create(
  clientScripts: [
    'jquery.min.js'
  ]
  verbose: true
  logLevel: "debug"
)
</code></pre>
<p>This is the loop code:</p>
<pre><code>casper.start().each sites, (self, site)-&gt;
  search_results[site.title] = {}
  self.thenOpen site.login.path,
    method: "get"
    data: site.login.getdata
  , -&gt;
    @echo "Visiting Login Page: #{site.login.path}"
  self.then -&gt;
  self.then -&gt;
    @evaluate (ufield,user,pfield,pass)-&gt;
      jQuery(ufield).val(user)
      jQuery(pfield).val(pass)
    ,
      ufield: site.login.form.username.element
      user: site.login.form.username.value
      pfield: site.login.form.password.element
      pass: site.login.form.password.value
  self.then -&gt;
    @echo "Added formdata"
  self.then -&gt;
    @click "#{site.login.form.element} #{site.login.form.submit}"
  self.then -&gt;
    @echo "Submitted form"
  self.then -&gt;
    @waitUntilVisible site.login.form.wait.element, -&gt;
      @echo 'Page Loaded'
    , -&gt;
      @echo "Login Failed"
      @exit 1
    , max_timeout
  self.each keys, (self, skey)-&gt;
    self.thenOpen site.search.path,
      method: "get"
      data: site.login.getdata
    , -&gt;
      @echo "Visiting Search Page: #{site.search.path}, for #{skey}"
    self.then -&gt;
      @echo @page.content
      @waitUntilVisible "#{site.search.form.element} #{site.search.form.query.element}"
      , -&gt;
        @echo "Search Form Visible"
      , -&gt;
        @echo "Search Visit Failed"
        @exit 1
      , max_timeout
    self.then -&gt;
      @evaluate (key, feild)-&gt;
        jQuery(feild).val(key)
      ,
        key: skey
        feild: site.search.form.query.element
    self.then -&gt;
      @echo "Added searchdata"
    self.then -&gt;
      @click "#{site.search.form.element} #{site.search.form.submit}"
    self.then -&gt;
      @echo "Submited Search"
    self.then -&gt;
      @waitUntilVisible site.search.form.wait.element
      , -&gt;
        @echo "Search Results Visible"
      , -&gt;
        @echo "Search Failed"
        @exit 1
      , max_timeout
    self.then -&gt;
      result_links = @evaluate (linktag)-&gt;
        res = []
        res_els = __utils__.findAll(linktag)
        Array.prototype.forEach.call res_els, (e)-&gt;
          href = e.getAttribute("href")
          label = jQuery(e).text()
          res.push {href: href, label: label} if href?
        res
      ,
        linktag: "#{site.search.threads.element} #{site.search.threads.link.element}"
    self.then -&gt;
      utils.dump result_links
      search_results[site.title][skey] = result_links
</code></pre>
<p>The console output related to the failure is as follows:</p>
<pre class="lang-none prettyprint-override"><code>[debug] [phantom] opening url: http://www.cyclingforums.com, HTTP GET
[debug] [phantom] Navigation requested: url=http://www.cyclingforums.com/, type=Other, lock=true, isMainFrame=true
[debug] [phantom] url changed to "http://www.cyclingforums.com/"
[debug] [phantom] Automatically injected jquery.min.js client side
[debug] [phantom] Successfully injected Casper client-side utilities
[info] [phantom] Step 50/68 http://www.cyclingforums.com/ (HTTP 200)
Visiting Search Page: http://www.cyclingforums.com, for track
[info] [phantom] Step 50/68: done in 63268ms.
[info] [phantom] Step 51/68 http://www.cyclingforums.com/ (HTTP 200)
&lt;html&gt;&lt;head&gt;&lt;/head&gt;&lt;body&gt;&lt;/body&gt;&lt;/html&gt;
[info] [phantom] Step 51/68: done in 63352ms.
[info] [phantom] Step 52/69 http://www.cyclingforums.com/ (HTTP 200)
[info] [phantom] Step 52/69: done in 63452ms.
[warning] [phantom] Casper.waitFor() timeout
Search Visit Failed
</code></pre>
<p>If anyone has any idea why the site is responding with an empty page, please help.</p>
<p>Thank you.</p>
<p>EDIT: Through testing with other sites, it seems that the base code works, but this one site seems to be causing issues.</p>
 
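<p>For readers following the loop code, one entry in the JSON site list might look roughly like the sketch below. The field names are the ones the loop code reads (<code>site.login.form.username.element</code> and so on). Every value, including the URLs and CSS selectors, is a hypothetical placeholder and not the real configuration.</p>

```json
{
  "title": "Example Forum",
  "login": {
    "path": "http://example.com/login",
    "getdata": {},
    "form": {
      "element": "#login-form",
      "submit": "input[type=submit]",
      "username": { "element": "#username", "value": "example-user" },
      "password": { "element": "#password", "value": "example-pass" },
      "wait": { "element": "#logout-link" }
    }
  },
  "search": {
    "path": "http://example.com/search",
    "form": {
      "element": "#search-form",
      "submit": "input[type=submit]",
      "query": { "element": "input[name=q]" },
      "wait": { "element": "#results" }
    },
    "threads": {
      "element": ".thread",
      "link": { "element": "a.title" }
    }
  }
}
```

<p>Because the loop joins <code>form.element</code> and <code>form.submit</code> with a space, the second selector is matched as a descendant of the first.</p>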
