
  1. PhantomJS using too many threads
<p>I wrote a PhantomJS app to crawl a site I built and check whether a JavaScript file is included on each page. The JavaScript is similar to Google's, where some inline code loads another JS file. The app looks for that other JS file, which is why I used Phantom.</p>
<p><strong>What's the expected result?</strong></p>
<p>The console output should read through a ton of URLs and then tell whether the script is loaded on each one.</p>
<p><strong>What's really happening?</strong></p>
<p>The console output reads as expected for about 50 requests and then just starts spitting out this error:</p>
<pre><code>2013-02-21T10:01:23 [FATAL] QEventDispatcherUNIXPrivate(): Can not continue without a thread pipe
QEventDispatcherUNIXPrivate(): Unable to create thread pipe: Too many open files
</code></pre>
<p>This is the block of code that opens a page and searches for the script include:</p>
<pre><code>page.open(url, function (status) {
    console.log(YELLOW, url, status, CLEAR);
    var found = page.evaluate(function () {
        if (document.querySelectorAll("script[src='***']").length) {
            return true;
        } else {
            return false;
        }
    });
    if (found) {
        console.log(GREEN, 'JavaScript found on', url, CLEAR);
    } else {
        console.log(RED, 'JavaScript not found on', url, CLEAR);
    }
    self.crawledURLs[url] = true;
    self.crawlURLs(self.getAllLinks(page), depth-1);
});
</code></pre>
<p>The crawledURLs object is just a map of the URLs that I've already crawled. The crawlURLs function goes through the links returned by getAllLinks and calls the open function on every link that has the same base domain as the one the crawler started on.</p>
<p><strong>Edit</strong></p>
<p>I modified the last block of the code as follows, adding a call to page.close(), but I still have the same issue.</p>
<pre><code>if (!found) {
    console.log(RED, 'JavaScript not found on', url, CLEAR);
}
self.crawledURLs[url] = true;
var links = self.getAllLinks(page);
page.close();
self.crawlURLs(links, depth-1);
</code></pre>
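<p>One way to picture the crawl described above is as a queue that is drained one URL at a time, so that at most one page is open at any moment. The following is only a minimal sketch under that assumption, not a confirmed fix for the error. The names crawlSequentially, visit and done are hypothetical and do not come from the code above; visit stands in for whatever opens a page and reports the links it found.</p>

```javascript
// Sketch (hypothetical names): crawl breadth-first from a queue, visiting
// one URL at a time so that only one page is ever open.
//
// visit(url, callback) is assumed to open the page, inspect it, release it,
// and then call callback(links) with an array of same-domain link URLs.
// done(crawled) is called once the queue is empty.
function crawlSequentially(startUrl, maxDepth, visit, done) {
    var queue = [{ url: startUrl, depth: maxDepth }];
    var crawled = {}; // plays the role of crawledURLs in the question

    function next() {
        // Skip entries that were crawled after they had been queued.
        var item = queue.shift();
        while (item && crawled[item.url]) {
            item = queue.shift();
        }
        if (!item) {
            done(crawled);
            return;
        }

        crawled[item.url] = true;
        visit(item.url, function (links) {
            // Only follow links while there is depth left.
            if (item.depth > 0) {
                links.forEach(function (link) {
                    if (!crawled[link]) {
                        queue.push({ url: link, depth: item.depth - 1 });
                    }
                });
            }
            // Move on only after the current page has been handled.
            next();
        });
    }

    next();
}
```

<p>In a PhantomJS script, visit would typically create a fresh page with require('webpage').create(), call page.open, collect the links, call page.close(), and only then invoke the callback. That way the next page is not opened until the previous one has been released.</p>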
 
