Note that there are some explanatory texts on larger screens.

plurals
  1. POMySQL inserting new rows even after the function has stopped
    primarykey
    data
    text
    <p><strong>Since the functions are very long, you can actually skip the function as they only extract specific data from HTML tags please try to help by providing logic to a problem on this scale</strong> I am building a website which can dig into a specific website and extract nearly the full database. The actual function is very long and is inserting into 4 different tables with an estimated combined 60k rows. My problem is even after the function has finished executing, I can see new rows being added into my database. I am guessing the MySQL insert speed is lower than my iteration speed. </p> <p><strong>Controller</strong></p> <pre><code>public function start() { // set the loop running for ever ini_set('MAX_EXECUTION_TIME', -1); set_time_limit(0); $lastId = $this-&gt;Kpkt_model-&gt;last_temp_id(); if($lastId == 0 OR $lastId == '' OR $lastId == '0') { $lastId = 0; } else { $lastId = $this-&gt;Kpkt_model-&gt;last_temp_id(); } // add 1 to start the process $i = $lastId+1; // set the errors to 0 $errors = 0; while($errors &lt; 25) { usleep(100); if($this-&gt;_scrap_all($i) == 'empty') { // its an empty record, add 1 to the errors $errors++; } else { $errors = 0; } $i++; } } function _scrap_all($i) { $url = "mydesiredwebsite.com?PMJU_KOD=$i"; $html = file_get_html($url); // then check if the html element exists to avoid trying to parse non-html $tag = $html-&gt;find('td[class=tdSecondtext1]'); // now we need to remove all the redundant spaces $tag = preg_replace("/[[:blank:]]+/"," ",$tag); // lets sanitize the dirty string from the html special characters $tag = preg_replace("/&amp;#?[a-z0-9]{2,8};/i","",$tag); $name = strip_tags(str_replace("\n","", str_replace("\r","", $tag[0]))); $file = strip_tags(str_replace("\n","", str_replace("\r","", $tag[1]))); $roc = strip_tags(str_replace("\n","", str_replace("\r","", $tag[2]))); $address = strip_tags(str_replace("\n","", str_replace("\r","", $tag[3]))); $city = strip_tags(str_replace("\n","", str_replace("\r","", $tag[4]))); $postcode = strip_tags(str_replace("\n","", str_replace("\r","", $tag[5]))); $district = strip_tags(str_replace("\n","", str_replace("\r","", $tag[6]))); $state = strip_tags(str_replace("\n","", str_replace("\r","", $tag[7]))); $telephone = strip_tags(str_replace("\n","", str_replace("\r","", $tag[8]))); $fax = strip_tags(str_replace("\n","", str_replace("\r","", $tag[9]))); $website = strip_tags(str_replace("\n","", str_replace("\r","", $tag[10]))); $last_update = strip_tags(str_replace("\n","", str_replace("\r","", $tag[11]))); // check to see if this array contains data or has returned from the error page if(strlen($name) &lt; 4) { // here we can assume the project is empty by looking at it's file (unique id) return 'empty'; } else { /* The project file is not empty, proceed to add to database we need to convert this number into a date format */ $date = str_replace('/', '-', $last_update); $last_update = date('Y-m-d', strtotime($date)); // begin insertion $this-&gt;Kpkt_model-&gt;insert_company_temp($name, $file, $roc, $address, $city, $postcode, $district, $state, $telephone, $fax, $website, $last_update); // echo "&lt;span style='color:green'&gt;company #&lt;strong&gt;$i&lt;/strong&gt; added to database!&lt;/span&gt;&lt;br/&gt;"; // lets search the details $links = $html-&gt;find('a[href^=DetailProjek.cfm]'); if (sizeof($links &gt; 0)) { foreach($links as $key=&gt;$link) { // eurika! $anchor = $link-&gt;getAttribute ( 'href' ); // change the spaces to html notation $anchor = str_replace(' ', '%20', $anchor); $url = "mydesiredwebsite.com/$anchor"; $html2 = file_get_html($url); $tag = $html2-&gt;find('td[class=tdSecondtext1]'); // now we need to remove all the redundant spaces $tag = preg_replace("/[[:blank:]]+/"," ",$tag); // lets sanitize the dirty string from the html special characters $tag = preg_replace("/&amp;#?[a-z0-9]{2,8};/i","",$tag); // this is our foreign key $developer = strip_tags(str_replace("\n","", str_replace("\r","", $tag[1]))); // first batch $name = strip_tags(str_replace("\n","", str_replace("\r","", $tag[12]))); $file = strip_tags(str_replace("\n","", str_replace("\r","", $tag[13]))); $lot_no = strip_tags(str_replace("\n","", str_replace("\r","", $tag[14]))); $state = strip_tags(str_replace("\n","", str_replace("\r","", $tag[15]))); $housing_no = strip_tags(str_replace("\n","", str_replace("\r","", $tag[16]))); $bank_name = strip_tags(str_replace("\n","", str_replace("\r","", $tag[17]))); $license_no = strip_tags(str_replace("\n","", str_replace("\r","", $tag[18]))); $license_expire = strip_tags(str_replace("\n","", str_replace("\r","", $tag[19]))); $permit_no = strip_tags(str_replace("\n","", str_replace("\r","", $tag[20]))); $permit_expire = strip_tags(str_replace("\n","", str_replace("\r","", $tag[21]))); $land_status = strip_tags(str_replace("\n","", str_replace("\r","", $tag[22]))); $date1 = str_replace('/', '-', $license_expire); $license_expire = date('Y-m-d', strtotime($date1)); $date2 = str_replace('/', '-', $permit_expire); $permit_expire = date('Y-m-d', strtotime($date2)); $this-&gt;Kpkt_model-&gt;add_project_information_temp( $developer, $name, $file, $lot_no, $state, $housing_no, $bank_name, $license_no, $license_expire, $permit_no, $permit_expire, $land_status ); $project_id = $this-&gt;Kpkt_model-&gt;get_last_id(); /* delete the first 23 items of the array since we dont know how many rows of data are under here, we deduct the company details and primary project details sizeof(array) - 23 / 12 = the number of rows we need to insert MD 30/01/2013 */ $newTags = array_slice($tag, 23); // separate them into batches of 12 $newTags = array_chunk($newTags, 12); // now we iterate through the array and add the details in foreach($newTags AS $tag) { $category = strip_tags(str_replace("\n","", str_replace("\r","", $tag[0]))); $type = strip_tags(str_replace("\n","", str_replace("\r","", $tag[1]))); $storey = strip_tags(str_replace("\n","", str_replace("\r","", $tag[2]))); $floorArea = strip_tags(str_replace("\n","", str_replace("\r","", $tag[3]))); $totalArea = strip_tags(str_replace("\n","", str_replace("\r","", $tag[4]))); $units = strip_tags(str_replace("\n","", str_replace("\r","", $tag[5]))); $tcf = strip_tags(str_replace("\n","", str_replace("\r","", $tag[6]))); $cf = strip_tags(str_replace("\n","", str_replace("\r","", $tag[7]))); $priceMax = strip_tags(str_replace("\n","", str_replace("\r","", $tag[8]))); $priceStandard = strip_tags(str_replace("\n","", str_replace("\r","", $tag[9]))); $priceMin = strip_tags(str_replace("\n","", str_replace("\r","", $tag[10]))); $progressReport = strip_tags(str_replace("\n","", str_replace("\r","", $tag[11]))); $date1 = str_replace('/', '-', $tcf); $tcf = date('Y-m-d', strtotime($date1)); $date2 = str_replace('/', '-', $cf); $cf = date('Y-m-d', strtotime($date2)); $this-&gt;Kpkt_model-&gt;add_project_development_information_temp( $developer, $project_id, $category, $type, $storey, $floorArea, $totalArea, $units, $tcf, $cf, $priceMax, $priceStandard, $priceMin, $progressReport ); } // clean the memory $sellingInfoLinks = $html2-&gt;find('a[href^=LaporanJualRumah.cfm]'); $html-&gt;clear(); unset($html2); unset($tag); unset($tags); if(sizeof($sellingInfoLinks &gt; 0 )) { foreach($sellingInfoLinks AS $key=&gt;$selling) { // now we sift through the selling information $anchor = $selling-&gt;getAttribute ( 'href' ); // change the spaces to html notation $anchor = str_replace(' ', '%20', $anchor); $url = "mydesiredwebsite.com/$anchor"; $html3 = file_get_html($url); $tag = $html3-&gt;find('tr[bgcolor!=#fc6535] td div font'); // now we need to remove all the redundant spaces $tag = preg_replace("/[[:blank:]]+/"," ",$tag); // lets sanitize the dirty string from the html special characters $tag = preg_replace("/&amp;#?[a-z0-9]{2,8};/i","",$tag); // there are 12 items per array $numRows = sizeof($tag) / 12; $tag = array_chunk($tag, 12); foreach ($tag as $value) { // echo '&lt;pre&gt;'; // print_r($value); // echo '&lt;/pre&gt;'; // we break down the selling information into chunks of 12 to insert into the database, each bunch of 12 is one set of data $company_id = $developer; $project_id = $project_id; $roomType = strip_tags(str_replace("\n","", str_replace("\r","", $value[0]))); $levels = strip_tags(str_replace("\n","", str_replace("\r","", $value[1]))); $local = strip_tags(str_replace("\n","", str_replace("\r","", $value[2]))); $chinse = strip_tags(str_replace("\n","", str_replace("\r","", $value[3]))); $indian = strip_tags(str_replace("\n","", str_replace("\r","", $value[4]))); $other = strip_tags(str_replace("\n","", str_replace("\r","", $value[5]))); $foreign = strip_tags(str_replace("\n","", str_replace("\r","", $value[6]))); $totalSold = strip_tags(str_replace("\n","", str_replace("\r","", $value[7]))); $totalUnsold = strip_tags(str_replace("\n","", str_replace("\r","", $value[8]))); $totalPerUnit = strip_tags(str_replace("\n","", str_replace("\r","", $value[9]))); $approvedUnits = strip_tags(str_replace("\n","", str_replace("\r","", $value[10]))); $developedUnits = strip_tags(str_replace("\n","", str_replace("\r","", $value[11]))); //echo sizeof($totalPerUnit); $this-&gt;Kpkt_model-&gt;add_selling_information_temp( $company_id, $project_id, $roomType, $levels, $local, $chinse, $indian, $other, $foreign, $totalSold, $totalUnsold, $totalPerUnit, $approvedUnits, $developedUnits ); # code... } $html3-&gt;clear(); unset($html3); unset($tag); } } } } } // clean the memory $html-&gt;clear(); unset($html); unset($tag); } </code></pre> <p><strong>Model</strong></p> <pre><code>function insert_company_temp($name, $file, $roc, $address, $city, $postcode, $district, $state, $telephone, $fax, $website, $last_update) { $data = array( 'file' =&gt; $file, 'name' =&gt; $name, 'roc' =&gt; $roc, 'address' =&gt; $address, 'city' =&gt; $city, 'postcode' =&gt; $postcode, 'district' =&gt; $district, 'state' =&gt; $state, 'telephone' =&gt; $telephone, 'fax' =&gt; $fax, 'website' =&gt; $website, 'last_update' =&gt; $last_update ); $this-&gt;db-&gt;insert('kpkt_company_temp', $data); //echo $this-&gt;db-&gt;last_query(); } function last_id() { $query = "SELECT MAX(id) AS id FROM kpkt_company"; $res = $this-&gt;db-&gt;query($query); return $res-&gt;row('id'); //echo $this-&gt;db-&gt;last_query(); } function last_temp_id() { $query = "SELECT MAX(id) AS id FROM kpkt_company_temp"; $res = $this-&gt;db-&gt;query($query); return $res-&gt;row('id'); //echo $this-&gt;db-&gt;last_query(); } function add_project_information_temp( $developer, $name, $file, $lot_no, $state, $housing_no, $bank_name, $license_no, $license_expire, $permit_no, $permit_expire, $land_status ) { $data = array( 'developer_id' =&gt; $developer, 'name' =&gt; $name, 'file' =&gt; $file, 'lot_no' =&gt; $lot_no, 'state' =&gt; $state, 'housing_no' =&gt; $housing_no, 'bank_name' =&gt; $bank_name, 'license' =&gt; $license_no, 'license_expire' =&gt; $license_expire, 'permit_no' =&gt; $permit_no, 'permit_expire' =&gt; $permit_expire, 'land_status' =&gt; $land_status ); $this-&gt;db-&gt;insert('kpkt_project_information_temp', $data); } function add_project_development_information_temp( $developer, $project_id, $category, $type, $storey, $floorArea, $totalArea, $units, $tcf, $cf, $priceMax, $priceStandard, $priceMin, $progressReport ) { $data = array( 'developer_id' =&gt; $developer, 'project_id' =&gt; $project_id, 'house_category' =&gt; $category, 'house_type' =&gt; $type, 'levels' =&gt; $storey, 'floor_area' =&gt; $floorArea, 'total_area' =&gt; $totalArea, 'units' =&gt; $units, 'tcf' =&gt; $tcf, 'cf' =&gt; $cf, 'price_max' =&gt; $priceMax, 'price_standard' =&gt; $priceStandard, 'price_min' =&gt; $priceMin, 'progress_report' =&gt; $progressReport ); $this-&gt;db-&gt;insert('kpkt_project_development_information_temp', $data); } function add_selling_information_temp( $company_id, $project_id, $roomType, $levels, $local, $chinese, $indian, $other, $foreign, $totalSold, $totalUnsold, $totalPerUnit, $approvedUnits, $developedUnits ) { $data = array( 'developer_id' =&gt; $company_id, 'project_id' =&gt; $project_id, 'house_type' =&gt; $roomType, 'levels' =&gt; $levels, 'bumi' =&gt; $local, 'chinese' =&gt; $chinese, 'indian' =&gt; $indian, 'other' =&gt; $other, 'foreigner' =&gt; $foreign, 'units_sold' =&gt; $totalSold, 'units_unsold' =&gt; $totalUnsold, 'price_per_unit'=&gt; $totalPerUnit, 'approved_units'=&gt; $approvedUnits, 'developed_units'=&gt; $developedUnits ); $this-&gt;db-&gt;insert('kpkt_selling_information_temp', $data); } </code></pre> <p>So again just to clarify, everything runs very smoothly with no problems, EXCEPT the <code>INSERT</code> statements keep on falling behind so to say. How can I throttle the <code>while</code> in <code>function start()</code>? Thanks </p>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload