
<p>When I hear that someone uses curl_multi_exec, it usually turns out they just load it with, say, 100 URLs, wait until they all complete, process them all, and then start over with the next 100... Blame me, I was doing that too, but then I found out that it is possible to remove and add handles to curl_multi while a transfer is still in progress, and it really saves a lot of time, especially if you reuse already-open connections. I wrote a small library to handle a queue of requests with callbacks; I'm not posting the full version here, of course ("small" is still quite a bit of code), but here is a simplified version of the main loop to give you the general idea:</p> <pre><code>public function launch() {
    $channels = $freeChannels = array_fill(0, $this-&gt;maxConnections, NULL);
    $activeJobs = array();
    $running = 0;
    do {
        // pick jobs for free channels:
        while ( !(empty($freeChannels) || empty($this-&gt;jobQueue)) ) {
            // take a free channel, (re)init the curl handle and let
            // the queued object set its options
            $chId = key($freeChannels);
            if (empty($channels[$chId])) {
                $channels[$chId] = curl_init();
            }
            $job = array_pop($this-&gt;jobQueue);
            $job-&gt;init($channels[$chId]);
            curl_multi_add_handle($this-&gt;master, $channels[$chId]);
            $activeJobs[$chId] = $job;
            unset($freeChannels[$chId]);
        }
        $pending = count($activeJobs);

        // launch them:
        if ($pending &gt; 0) {
            // poke it while it wants:
            while (($mrc = curl_multi_exec($this-&gt;master, $running)) == CURLM_CALL_MULTI_PERFORM);
            // wait for some activity, don't eat CPU:
            curl_multi_select($this-&gt;master);
            while ($running &lt; $pending &amp;&amp; ($info = curl_multi_info_read($this-&gt;master))) {
                // some connection(s) finished; locate the job and run its response handler:
                $pending--;
                $chId = array_search($info['handle'], $channels);
                $content = curl_multi_getcontent($channels[$chId]);
                curl_multi_remove_handle($this-&gt;master, $channels[$chId]);
                $freeChannels[$chId] = NULL; // free up this channel
                if ( !array_key_exists($chId, $activeJobs) ) {
                    // impossible, but...
                    continue;
                }
                $activeJobs[$chId]-&gt;onComplete($content);
                unset($activeJobs[$chId]);
            }
        }
    } while ( ($running &gt; 0 &amp;&amp; $mrc == CURLM_OK) || !empty($this-&gt;jobQueue) );
}
</code></pre> <p>In my version, the $jobs are instances of a separate class, not controllers or models. They only handle setting the cURL options, parsing the response, and calling a given onComplete callback. With this structure, a new request starts as soon as something in the pool finishes.</p> <p>Of course, it doesn't really help if the processing takes significant time as well as the retrieval... and it isn't true parallel handling. But I still hope it helps. :)</p> <p>P.S. It did the trick for me. :) A job that once took 8 hours now completes in 3-4 minutes using a pool of 50 connections. Can't describe that feeling. :) I didn't really expect it to work as planned, because with PHP it rarely works exactly as it is supposed to... That was like "ok, hope it finishes in at least an hour... Wha... Wait... Already?! 8-O"</p>
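<p>For illustration, a minimal job class along these lines might look like the sketch below. The class name, the <code>$url</code> property, the constructor, and the specific cURL options are my assumptions, not the original library's API; the only contract the pool above actually relies on is the <code>init($ch)</code> and <code>onComplete($content)</code> methods:</p> <pre><code>class HttpJob {
    private $url;
    private $callback;

    public function __construct($url, callable $callback) {
        $this-&gt;url = $url;
        $this-&gt;callback = $callback;
    }

    // called by the pool: configure the (possibly reused) curl handle for this request
    public function init($ch) {
        curl_setopt($ch, CURLOPT_URL, $this-&gt;url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return body instead of printing it
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, 30);
    }

    // called by the pool once the transfer finishes
    public function onComplete($content) {
        call_user_func($this-&gt;callback, $content);
    }
}
</code></pre> <p>Queuing then amounts to something like <code>$pool-&gt;jobQueue[] = new HttpJob($url, $parseFn);</code> followed by <code>$pool-&gt;launch();</code> — parsing lives in the callback, so the pool never needs to know anything about the responses.</p>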