Optimising PHP cURL-based link checker script - currently very slow
<p>I'm using a PHP script (using cURL) to check whether:</p>
<ul>
<li>The links in my database are correct (i.e. return HTTP status 200)</li>
<li>Links that are redirected end up on an appropriate/similar page (judged from the contents of the page)</li>
</ul>
<p>The results of this are saved to a log file and emailed to me as an attachment.</p>
<p>This is all fine and working; however, it is slow as all hell, and half the time it times out and aborts itself early. Of note, I have about 16,000 links to check.</p>
<p>How can I best make this run quicker, and what am I doing wrong?</p>
<p>Code below:</p>
<pre><code>// Write a string to the log file and echo it to the browser
function echoappend($file, $tobewritten) {
    fwrite($file, $tobewritten);
    echo $tobewritten;
}

error_reporting(E_ALL);
ini_set('display_errors', '1');

$filename = date('YmdHis') . "linkcheck.htm";
echo $filename;
$file = fopen($filename, "w+");

try {
    $conn = new PDO('mysql:host=localhost;dbname=databasename', $un, $pw);
    $conn-&gt;setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    echo '&lt;b&gt;connected to db&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;';

    $sitearray = array("medical.posterous", "ebm.posterous", "behavenet", "guidance.nice", "www.rch", "emedicine", "www.chw", "www.rxlist", "www.cks.nhs.uk");

    foreach ($sitearray as $key =&gt; $value) {
        $site = $value;
        echoappend($file, "&lt;h1&gt;" . $site . "&lt;/h1&gt;");

        // Fetch every link whose URL starts with this site
        $q = "SELECT * FROM link WHERE url LIKE :site";
        $stmt = $conn-&gt;prepare($q);
        $stmt-&gt;execute(array(':site' =&gt; 'http://' . $site . '%'));
        $result = $stmt-&gt;fetchAll();

        $totallinks = 0;
        $workinglinks = 0;

        foreach ($result as $row) {
            // First pass: headers only, without following redirects
            $ch = curl_init();
            $originalurl = $row['url'];
            curl_setopt($ch, CURLOPT_URL, $originalurl);
            curl_setopt($ch, CURLOPT_HEADER, 1);
            curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
            curl_setopt($ch, CURLOPT_NOBODY, true);
            curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);
            $output = curl_exec($ch);
            if ($output === FALSE) {
                echo "cURL Error: " . curl_error($ch);
            }
            $urlinfo = curl_getinfo($ch);

            if ($urlinfo['http_code'] == 200) {
                echoappend($file, $row['name'] . ": &lt;b&gt;working!&lt;/b&gt;&lt;br /&gt;");
                $workinglinks++;
            } else if ($urlinfo['http_code'] == 301 || $urlinfo['http_code'] == 302) {
                // Second pass: follow the redirect and fetch the page body
                $redirectch = curl_init();
                curl_setopt($redirectch, CURLOPT_URL, $originalurl);
                curl_setopt($redirectch, CURLOPT_HEADER, 1);
                curl_setopt($redirectch, CURLOPT_RETURNTRANSFER, 1);
                curl_setopt($redirectch, CURLOPT_NOBODY, false);
                curl_setopt($redirectch, CURLOPT_FOLLOWLOCATION, true);
                $redirectoutput = curl_exec($redirectch);

                // Compare the target page's title with the stored link name
                $doc = new DOMDocument();
                @$doc-&gt;loadHTML($redirectoutput);
                $nodes = $doc-&gt;getElementsByTagName('title');
                $title = $nodes-&gt;length &gt; 0 ? $nodes-&gt;item(0)-&gt;nodeValue : '';
                echoappend($file, $row['name'] . ": &lt;b&gt;redirect ... &lt;/b&gt;" . $title . " ... ");
                if (strpos(strtolower($title), strtolower($row['name'])) === false) {
                    echoappend($file, "FAIL&lt;br /&gt;");
                } else {
                    $header = curl_getinfo($redirectch);
                    echoappend($file, $header['url']);
                    echoappend($file, "SUCCESS&lt;br /&gt;");
                }
                curl_close($redirectch);
            } else {
                echoappend($file, $row['name'] . ": &lt;b&gt;FAIL code&lt;/b&gt;" . $urlinfo['http_code'] . "&lt;br /&gt;");
            }
            curl_close($ch);
            $totallinks++;
        }

        echoappend($file, '&lt;br /&gt;');
        echoappend($file, $site . ": " . $workinglinks . "/" . $totallinks . " links working. &lt;br /&gt;&lt;br /&gt;");
    }

    $conn = null;
    echo '&lt;br /&gt;&lt;b&gt;connection closed&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;';
} catch (PDOException $e) {
    echo 'ERROR: ' . $e-&gt;getMessage();
}
</code></pre>
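One likely cause of the slowness is that the script checks 16,000 links strictly one after another and sets no timeouts, so a single unresponsive host can stall the whole run. Below is a minimal sketch of one possible approach, assuming PHP 7 or later: issue the header-only requests in parallel batches with the `curl_multi_*` functions and cap each request with `CURLOPT_CONNECTTIMEOUT` and `CURLOPT_TIMEOUT`. The function names `classify_status` and `check_links_parallel`, the batch size of 20, and the timeout values are illustrative assumptions, not part of the original script.

```php
<?php
// Hypothetical helper: map an HTTP status code to the three outcomes
// the original script distinguishes.
function classify_status(int $code): string
{
    if ($code === 200) {
        return 'working';
    }
    if ($code === 301 || $code === 302) {
        return 'redirect';
    }
    return 'fail'; // includes 0, which cURL reports on timeouts and connection errors
}

// Hypothetical helper: check URLs in parallel batches using header-only requests.
// Returns an array of url => 'working' | 'redirect' | 'fail'.
function check_links_parallel(array $urls, int $batchSize = 20, int $timeout = 10): array
{
    $results = array();

    // Duplicate URLs would collide as array keys, so check each only once.
    foreach (array_chunk(array_unique($urls), $batchSize) as $batch) {
        $mh = curl_multi_init();
        $handles = array();

        foreach ($batch as $url) {
            $ch = curl_init();
            curl_setopt($ch, CURLOPT_URL, $url);
            curl_setopt($ch, CURLOPT_NOBODY, true);          // headers only
            curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
            curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);
            curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);     // assumed value, in seconds
            curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);     // cap on the whole request
            curl_multi_add_handle($mh, $ch);
            $handles[$url] = $ch;
        }

        // Drive all transfers in this batch until none are still running.
        do {
            $status = curl_multi_exec($mh, $running);
            if ($running) {
                curl_multi_select($mh, 1.0); // wait for activity instead of spinning
            }
        } while ($running && $status === CURLM_OK);

        foreach ($handles as $url => $ch) {
            $code = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
            $results[$url] = classify_status($code);
            curl_multi_remove_handle($mh, $ch);
            curl_close($ch);
        }
        curl_multi_close($mh);
    }

    return $results;
}
```

In the script above, this could be fed the `url` column of each `$stmt->fetchAll()` result, leaving the slower full-page title comparison only for the entries that come back as `'redirect'`. If the early aborts come from PHP's `max_execution_time` limit rather than from the remote hosts, calling `set_time_limit(0)` or running the script from the command line may also be worth trying.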