Multithreading or parallel processing in PHP
<p>I'm working with GoDaddy auction domains. GoDaddy provides a way to download the domain listings, and I have a cron job that downloads them and inserts them into my database table. Downloading and inserting takes only a few seconds. The listing contains 34,000 domains (records) in total.</p>
<p>Next, I need to update the page rank for each of the 34,000 domains in the database. I have a PHP API for fetching the page rank live. The GoDaddy download doesn't include page rank, so I have to fetch and update it separately.</p>
<p>The problem is that fetching the page rank live and then writing it to the database takes far too long for 34,000 domains.</p>
<p>I recently ran an experiment with a cron job to update the page rank of the domains in the database. It took 4 hours to update just 13,383 of the 34,000 domains, because each page rank has to be fetched first and then written to the database. This was all running on a dedicated server.</p>
<p>Is there any way to speed this up for a large number of domains? The only way I can think of is to do the work in parallel.
</p>
<p>Would it be possible to have 100 tasks fetching the page rank and updating the database simultaneously?</p>
<p>In case you need the code:</p>
<pre><code>$sql = "SELECT domain from auctions";
$mozi_get = runQuery($sql);

while ($results = mysql_fetch_array($mozi_get)) {
    /* PAGERANK API */
    if ($results['domain'] != 'Featured Listings') {
        //echo $results['domain']."&lt;br /&gt;";

        // Page rank of the www host
        try {
            $url = new SEOstats("http://www." . trim($results['domain']));
            $rank = $url-&gt;Google_Page_Rank();
            if (!is_integer($rank)) {
                //$rank='0';
            }
        } catch (SEOstatsException $e) {
            $rank = '0';
        }

        // Page rank of the bare (non-www) host
        try {
            $url = new SEOstats(trim("http://" . $results['domain']));
            $rank_non = $url-&gt;Google_Page_Rank();
            if (!is_integer($rank_non)) {
                //$rank_non='0';
            }
        } catch (SEOstatsException $e) {
            $rank_non = '0';
        }

        $sql = "UPDATE auctions set rank='" . $rank . "', rank_non='" . $rank_non . "' WHERE domain='" . $results['domain'] . "'";
        runQuery($sql);
        echo $sql . "&lt;br /&gt;";
    }
}
</code></pre>
<p>Here is my updated code using pthreads:</p>
<pre><code>&lt;?php
set_time_limit(0);
require_once("database.php");
include 'src/class.seostats.php';

function get_page_rank($domain) {
    try {
        $url = new SEOstats("http://www." . trim($domain));
        $rank = $url-&gt;Google_Page_Rank();
        if (!is_integer($rank)) {
            $rank = '0';
        }
    } catch (SEOstatsException $e) {
        $rank = '0';
    }
    return $rank;
}

class Ranking extends Worker {
    public function run() {}
}

class Domain extends Stackable {
    public $name;
    public $ranking;

    public function __construct($name) {
        $this-&gt;name = $name;
    }

    public function run() {
        $this-&gt;ranking = get_page_rank($this-&gt;name);
        /* now write the Domain to database or whatever */
        $sql = "UPDATE auctions set rank = '" . $this-&gt;ranking . "' WHERE domain = '" . $this-&gt;name . "'";
        runQuery($sql);
    }
}

/* start some workers */
$workers = array();
while (@$worker++ &lt; 8) {
    $workers[$worker] = new Ranking();
    $workers[$worker]-&gt;start();
}

/* select auctions and start processing */
$domains = array();
$sql = "SELECT domain from auctions"; // RETURNS 55369 RECORDS
$domain_result = runQuery($sql);
while ($results = mysql_fetch_array($domain_result)) {
    $domains[$results['domain']] = new Domain($results['domain']);
    $workers[array_rand($workers)]-&gt;stack($domains[$results['domain']]);
}

/* shutdown all workers (forcing all processing to finish) */
foreach ($workers as $worker)
    $worker-&gt;shutdown();

/* we now have ranked domains in memory and database */
var_dump($domains);
var_dump(count($domains));
?&gt;
</code></pre>
<p>Any help would be highly appreciated. Thanks.</p>
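For a sense of scale, the figures quoted in the question (13,383 domains in 4 hours, 34,000 domains in total, 8 workers in the pthreads attempt, 100 tasks as the goal) can be turned into a rough estimate. The sketch below is back-of-envelope arithmetic only: `estimate_seconds` is a hypothetical helper, not part of the original code, and it assumes that throughput scales linearly with the number of parallel tasks, which ignores any rate limiting by the page rank service and any contention on the database.

```php
<?php
// Hypothetical helper: ideal-case run time for $total domains with $workers
// parallel tasks, extrapolated from a measured serial run.
function estimate_seconds(int $done, float $elapsedSeconds, int $total, int $workers): float
{
    $perDomain = $elapsedSeconds / $done;   // measured serial cost per domain
    return $perDomain * $total / $workers;  // assumes perfect linear scaling
}

// Figures from the question: 13383 domains took 4 hours, 34000 domains in total.
foreach ([1, 8, 100] as $workers) {
    $seconds = estimate_seconds(13383, 4 * 3600, 34000, $workers);
    printf("%3d worker(s): about %.1f minutes\n", $workers, $seconds / 60);
}
```

Under that assumption, a serial run over all 34,000 domains would take about 10 hours, 8 workers would take a little over an hour, and 100 parallel tasks would take roughly 6 minutes. Real numbers will likely be worse if the remote service throttles requests.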