Curl (php script) downloads incomplete file
<p>I am using a PHP script to download an XML file from an external URL with curl, but curl sometimes fails to download the complete file. The problem happens even more often when I run the script on my host server through cron.</p>

<p>This is the script:</p>

<pre><code>&lt;?php
header('Content-type:text/html; charset=utf-8');

//initialize downloading xml file tries
$xml_dl_attempts = 0;

//set filename of output xml file
$findex = 0;
while(file_exists("xml".$findex.".xml")) {
    $findex++;
}
$filename = "xml".$findex.".xml";

//filename for log file
$logfilename = "log.txt";

//Open (append) logfile for write.
$logfileout = fopen($logfilename, 'a');
fwrite($logfileout, "Starting attempts to download the xml file at ".date("H:i:s Y-m-d")."\r\n");

//Attempt to download xml file 8 times
do {
    //Sleep 3 seconds before retrying download
    if($xml_dl_attempts &gt; 0 ) sleep(3);

    //Increase number of download attempts
    $xml_dl_attempts++;

    //Write to logfile
    fwrite($logfileout, date("H:i:s Y-m-d").": Download attempt number ".$xml_dl_attempts.": ");

    //Download xml file using curl
    $ch = curl_init();
    $url = 'http://www.opap.gr/web/services/rs/betting/availableBetGames/sport/program/4100/0/sport-1.xml?localeId=el_GR';
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, false);
    curl_setopt($ch, CURLOPT_BINARYTRANSFER, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    set_time_limit(300);
    curl_setopt($ch, CURLOPT_TIMEOUT, 300);
    $outfile = fopen($filename, 'w');
    if (!$outfile) {
        exit;
    }
    curl_setopt($ch, CURLOPT_FILE, $outfile);
    if(curl_exec($ch)==false) {
        fwrite($logfileout, "curl_error: ".curl_error($ch));
    }
    fclose($outfile);
    curl_close($ch);

    //Clear errors
    libxml_use_internal_errors(true);
    libxml_clear_errors();

    //Parse xml file
    $xml = simplexml_load_file($filename);

    //Check for errors
    if($err = libxml_get_last_error()) {
        fwrite($logfileout, "failed\r\n");
    }
} while($err !== false &amp;&amp; $xml_dl_attempts &lt; 8); //repeat if xml was not completely downloaded

//Check if the download succeeded
if(!$err) {
    fwrite($logfileout, "successfull\r\n");
}
fwrite($logfileout, "End.\r\n");
fclose($logfileout);
?&gt;
</code></pre>

<p>As you can see, I check whether the SimpleXML parser reports an error while parsing the downloaded XML file. If an error occurs, I repeat the process, up to a limit of 8 attempts. I also write a log file.</p>

<p>Here is the log file for a whole day:</p>

<pre><code>Starting attempts to download the xml file at 18:35:00 2012-09-25
18:35:00 2012-09-25: Download attempt number : failed
18:35:03 2012-09-25: Download attempt number : failed
18:35:07 2012-09-25: Download attempt number : successfull
End.
Starting attempts to download the xml file at 19:35:00 2012-09-25
19:35:00 2012-09-25: Download attempt number 1: failed
19:35:03 2012-09-25: Download attempt number 2: failed
19:35:06 2012-09-25: Download attempt number 3: failed
19:35:10 2012-09-25: Download attempt number 4: failed
19:35:13 2012-09-25: Download attempt number 5: failed
19:35:16 2012-09-25: Download attempt number 6: failed
19:35:20 2012-09-25: Download attempt number 7: failed
19:35:23 2012-09-25: Download attempt number 8: successfull
End.
Starting attempts to download the xml file at 20:35:00 2012-09-25
20:35:00 2012-09-25: Download attempt number 1: failed
20:35:04 2012-09-25: Download attempt number 2: failed
20:35:08 2012-09-25: Download attempt number 3: successfull
End.
Starting attempts to download the xml file at 21:35:00 2012-09-25
21:35:00 2012-09-25: Download attempt number 1: failed
21:35:04 2012-09-25: Download attempt number 2: failed
21:35:07 2012-09-25: Download attempt number 3: failed
21:35:11 2012-09-25: Download attempt number 4: successfull
End.
Starting attempts to download the xml file at 22:35:00 2012-09-25
22:35:00 2012-09-25: Download attempt number 1: failed
22:35:04 2012-09-25: Download attempt number 2: failed
22:35:07 2012-09-25: Download attempt number 3: successfull
End.
Starting attempts to download the xml file at 23:35:00 2012-09-25
23:35:00 2012-09-25: Download attempt number 1: failed
23:35:03 2012-09-25: Download attempt number 2: failed
23:35:07 2012-09-25: Download attempt number 3: failed
23:35:10 2012-09-25: Download attempt number 4: failed
23:35:14 2012-09-25: Download attempt number 5: failed
23:35:17 2012-09-25: Download attempt number 6: failed
23:35:21 2012-09-25: Download attempt number 7: successfull
End.
Starting attempts to download the xml file at 00:35:00 2012-09-26
00:35:00 2012-09-26: Download attempt number 1: successfull
End.
Starting attempts to download the xml file at 01:35:00 2012-09-26
01:35:00 2012-09-26: Download attempt number 1: failed
01:35:04 2012-09-26: Download attempt number 2: failed
01:35:07 2012-09-26: Download attempt number 3: failed
01:35:11 2012-09-26: Download attempt number 4: failed
01:35:14 2012-09-26: Download attempt number 5: failed
01:35:18 2012-09-26: Download attempt number 6: failed
01:35:21 2012-09-26: Download attempt number 7: failed
01:35:30 2012-09-26: Download attempt number 8: failed
End.
Starting attempts to download the xml file at 02:35:00 2012-09-26
02:35:00 2012-09-26: Download attempt number 1: failed
02:35:03 2012-09-26: Download attempt number 2: failed
02:35:07 2012-09-26: Download attempt number 3: failed
02:35:10 2012-09-26: Download attempt number 4: failed
02:35:13 2012-09-26: Download attempt number 5: failed
02:35:17 2012-09-26: Download attempt number 6: failed
02:35:20 2012-09-26: Download attempt number 7: failed
02:35:24 2012-09-26: Download attempt number 8: failed
End.
Starting attempts to download the xml file at 03:35:00 2012-09-26
03:35:00 2012-09-26: Download attempt number 1: failed
03:35:04 2012-09-26: Download attempt number 2: failed
03:35:07 2012-09-26: Download attempt number 3: failed
03:35:10 2012-09-26: Download attempt number 4: failed
03:35:14 2012-09-26: Download attempt number 5: failed
03:35:17 2012-09-26: Download attempt number 6: failed
03:35:21 2012-09-26: Download attempt number 7: failed
03:35:30 2012-09-26: Download attempt number 8: failed
End.
Starting attempts to download the xml file at 04:35:00 2012-09-26
04:35:00 2012-09-26: Download attempt number 1: failed
04:35:03 2012-09-26: Download attempt number 2: failed
04:35:07 2012-09-26: Download attempt number 3: failed
04:35:10 2012-09-26: Download attempt number 4: failed
04:35:14 2012-09-26: Download attempt number 5: failed
04:35:17 2012-09-26: Download attempt number 6: failed
04:35:21 2012-09-26: Download attempt number 7: failed
04:35:24 2012-09-26: Download attempt number 8: successfull
End.
Starting attempts to download the xml file at 05:35:00 2012-09-26
05:35:00 2012-09-26: Download attempt number 1: failed
05:35:04 2012-09-26: Download attempt number 2: failed
05:35:08 2012-09-26: Download attempt number 3: failed
05:35:11 2012-09-26: Download attempt number 4: failed
05:35:15 2012-09-26: Download attempt number 5: failed
05:35:18 2012-09-26: Download attempt number 6: failed
05:35:22 2012-09-26: Download attempt number 7: failed
05:35:25 2012-09-26: Download attempt number 8: failed
End.
Starting attempts to download the xml file at 06:35:00 2012-09-26
06:35:00 2012-09-26: Download attempt number 1: failed
06:35:03 2012-09-26: Download attempt number 2: failed
06:35:07 2012-09-26: Download attempt number 3: failed
06:35:10 2012-09-26: Download attempt number 4: failed
06:35:14 2012-09-26: Download attempt number 5: failed
06:35:17 2012-09-26: Download attempt number 6: failed
06:35:21 2012-09-26: Download attempt number 7: failed
06:35:24 2012-09-26: Download attempt number 8: failed
End.
Starting attempts to download the xml file at 07:35:00 2012-09-26
07:35:00 2012-09-26: Download attempt number 1: failed
07:35:04 2012-09-26: Download attempt number 2: failed
07:35:07 2012-09-26: Download attempt number 3: failed
07:35:11 2012-09-26: Download attempt number 4: failed
07:35:14 2012-09-26: Download attempt number 5: failed
07:35:18 2012-09-26: Download attempt number 6: failed
07:35:21 2012-09-26: Download attempt number 7: failed
07:35:24 2012-09-26: Download attempt number 8: failed
End.
Starting attempts to download the xml file at 08:35:00 2012-09-26
08:35:00 2012-09-26: Download attempt number 1: failed
08:35:03 2012-09-26: Download attempt number 2: failed
08:35:06 2012-09-26: Download attempt number 3: failed
08:35:10 2012-09-26: Download attempt number 4: failed
08:35:13 2012-09-26: Download attempt number 5: failed
08:35:16 2012-09-26: Download attempt number 6: failed
08:35:20 2012-09-26: Download attempt number 7: failed
08:35:23 2012-09-26: Download attempt number 8: failed
End.
Starting attempts to download the xml file at 09:35:00 2012-09-26
09:35:00 2012-09-26: Download attempt number 1: failed
09:35:04 2012-09-26: Download attempt number 2: failed
09:35:07 2012-09-26: Download attempt number 3: successfull
End.
Starting attempts to download the xml file at 10:35:00 2012-09-26
10:35:00 2012-09-26: Download attempt number 1: failed
10:35:03 2012-09-26: Download attempt number 2: failed
10:35:06 2012-09-26: Download attempt number 3: failed
10:35:10 2012-09-26: Download attempt number 4: failed
10:35:13 2012-09-26: Download attempt number 5: failed
10:35:17 2012-09-26: Download attempt number 6: failed
10:35:20 2012-09-26: Download attempt number 7: successfull
End.
Starting attempts to download the xml file at 11:35:00 2012-09-26
11:35:00 2012-09-26: Download attempt number 1: failed
11:35:03 2012-09-26: Download attempt number 2: failed
11:35:07 2012-09-26: Download attempt number 3: successfull
End.
Starting attempts to download the xml file at 12:35:00 2012-09-26
12:35:00 2012-09-26: Download attempt number 1: failed
12:35:04 2012-09-26: Download attempt number 2: failed
12:35:07 2012-09-26: Download attempt number 3: failed
12:35:11 2012-09-26: Download attempt number 4: failed
12:35:14 2012-09-26: Download attempt number 5: failed
12:35:17 2012-09-26: Download attempt number 6: failed
12:35:21 2012-09-26: Download attempt number 7: successfull
End.
Starting attempts to download the xml file at 13:35:00 2012-09-26
13:35:00 2012-09-26: Download attempt number 1: failed
13:35:03 2012-09-26: Download attempt number 2: successfull
End.
Starting attempts to download the xml file at 14:35:00 2012-09-26
14:35:00 2012-09-26: Download attempt number 1: failed
14:35:03 2012-09-26: Download attempt number 2: failed
14:35:07 2012-09-26: Download attempt number 3: failed
14:35:10 2012-09-26: Download attempt number 4: successfull
End.
Starting attempts to download the xml file at 15:35:00 2012-09-26
15:35:00 2012-09-26: Download attempt number 1: failed
15:35:03 2012-09-26: Download attempt number 2: failed
15:35:07 2012-09-26: Download attempt number 3: failed
15:35:10 2012-09-26: Download attempt number 4: failed
15:35:13 2012-09-26: Download attempt number 5: failed
15:35:17 2012-09-26: Download attempt number 6: failed
15:35:20 2012-09-26: Download attempt number 7: failed
15:35:24 2012-09-26: Download attempt number 8: failed
End.
Starting attempts to download the xml file at 16:35:00 2012-09-26
16:35:00 2012-09-26: Download attempt number 1: failed
16:35:03 2012-09-26: Download attempt number 2: failed
16:35:07 2012-09-26: Download attempt number 3: successfull
End.
</code></pre>

<p>Sometimes the script gets the complete file after a few attempts, and other times it fails on all 8. Also note that curl_exec doesn't return an error when the XML is incomplete.</p>

<p>Unfortunately, the server hosting the XML doesn't support range requests, so I cannot simply resume an incomplete download. I could raise the attempt limit, say to 50, but a failed attempt still downloads some data: for a 1 MB XML file, if the script fails 30 times while downloading 500 KB each time, it will have downloaded 16 MB of data for one successful attempt. I want to run this script every hour, so I believe this is going to hurt my server's bandwidth.</p>

<p>Why does curl fail to download the complete file? Are there options that make it behave like a browser, which eventually always gets the file?</p>

<p>Thanks.</p>
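<p>For illustration, here is a minimal sketch of how the script could tell a truncated transfer from a complete one without relying only on the XML parse, along with the bandwidth arithmetic from the question. It assumes the server sends a Content-Length header, which the question does not confirm. The function names transferLooksComplete, downloadToFile and totalTransferKb are hypothetical, and nothing here is known to fix the underlying truncation.</p>

```php
<?php
// Sketch only: the function names below are hypothetical and not part of
// the original script.

/**
 * Compare the size announced by the server with the bytes received.
 * curl_getinfo() reports -1 for CURLINFO_CONTENT_LENGTH_DOWNLOAD when the
 * server sent no Content-Length, so a negative value means "unknown".
 */
function transferLooksComplete(float $expected, float $received): bool
{
    if ($expected < 0) {
        return true; // size unknown: rely on the XML parse check instead
    }
    return $received >= $expected;
}

/** Download $url into $path; true only if curl succeeded and sizes match. */
function downloadToFile(string $url, string $path): bool
{
    $out = fopen($path, 'w');
    if ($out === false) {
        return false;
    }
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FILE, $out);        // write the body straight to the file
    curl_setopt($ch, CURLOPT_FAILONERROR, true); // treat HTTP status >= 400 as a failure
    curl_setopt($ch, CURLOPT_TIMEOUT, 300);
    $ok = curl_exec($ch);
    $expected = (float) curl_getinfo($ch, CURLINFO_CONTENT_LENGTH_DOWNLOAD);
    $received = (float) curl_getinfo($ch, CURLINFO_SIZE_DOWNLOAD);
    fclose($out);
    return $ok !== false && transferLooksComplete($expected, $received);
}

/** Total KB transferred when every failed attempt still costs bandwidth. */
function totalTransferKb(int $failedAttempts, int $kbPerFailure, int $fileKb): int
{
    return $failedAttempts * $kbPerFailure + $fileKb;
}
```

<p>With the figures from the question (30 failed attempts at 500 KB each, plus one full 1000 KB download), totalTransferKb(30, 500, 1000) gives 16000 KB, which matches the 16 MB estimate. The sketch sets only CURLOPT_FILE rather than combining it with CURLOPT_RETURNTRANSFER, so the body has a single destination.</p>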
 
