
Web spider on prices
    <p>I would like to compare the prices of certain products on a couple of websites, so that I can build a price history for myself before buying a product. If the price stays stable, I usually order the product; if not, I ask why the price keeps going up and down.</p>
    <p>I wanted to write a web crawler in PHP so that this is done automatically, as it would take a lot of time to do it manually.</p>
    <p>So I created a MySQL database in which I store the URLs of all the products I want to follow. After that I use a simple script to output the prices:</p>
<pre><code>&lt;?php
@ini_set("output_buffering", "Off");
@ini_set('implicit_flush', 1);
@ini_set('zlib.output_compression', 0);
@ini_set('max_execution_time', 1200);

$dbhost = 'localhost';
$dbuser = 'salesrep';
$dbpass = 'pas';
$dbname = "spider";
$dbtable = "price_compair";

$conn = mysql_connect($dbhost, $dbuser, $dbpass) or die('Error connecting to mysql');
$selected = mysql_select_db($dbname) or die(mysql_error());
$results = mysql_query("SELECT * FROM $dbtable");
mysql_close($conn);

while ($row = mysql_fetch_array($results, MYSQL_ASSOC)) {
    echo "&lt;td&gt;" . $row['artikel'] . "&lt;/td&gt;";

    if ($row['url_site1'] == "") {
        echo "&lt;td&gt;&amp;nbsp;&lt;/td&gt;";
    } else {
        if (!$fp = fopen($row['url_tx3'], "r")) {
            return false;
        }
        $content = "";
        while (!feof($fp)) {
            $content .= fgets($fp, 1024);
        }
        fclose($fp);
        preg_match_all("/\&amp;euro; (\d+\.\d+)/", $content, $pricesite1, PREG_SET_ORDER);
        $replace1 = array("&amp;euro; ");
        echo "&lt;td&gt;" . str_replace($replace1, "", $pricesite1[1][0]) . "&lt;/td&gt;";
    }

    if ($row['url_site2'] == "") {
        echo "&lt;td&gt;&amp;nbsp;&lt;/td&gt;";
    } else {
        if (!$fp = fopen($row['url_tx3shop'], "r")) {
            return false;
        }
        $content = "";
        if (ob_get_level() == 0) ob_start();
        while (!feof($fp)) {
            $content .= fgets($fp, 1024);
        }
        fclose($fp);
        preg_match_all("/\d+\.\d+\,\d+|(\d+\,\d+)/", $content, $pricesite2, PREG_SET_ORDER);
        $replace2 = array("€ ", ".");
        $out = str_replace(",", ".", str_replace($replace2, "", $pricesite2[1][0]));
        if ($out == "") {
            echo "&lt;td&gt;" . str_replace(",", ".", str_replace($replace2, "", $pricesite2[0][0])) . "&lt;/td&gt;";
        } else {
            echo "&lt;td&gt;" . $out . "&lt;/td&gt;";
        }
    }

    echo "&lt;/tr&gt;";
    ob_flush();
    flush();
}
?&gt;
</code></pre>
    <p>The main problem I came across is that one website uses the literal € sign, while the other uses the <code>&amp;euro;</code> entity in its markup, so finding the price is tricky. Also, the euro sign can come before or after the price. And to make it more difficult, there can be a recommended retail price before the actual price.</p>
    <p>My script works at the moment, but the patterns I pass to <code>preg_match_all</code> are far from perfect. Does anyone have an idea of how to build them so that they work reliably on practically any website?</p>
    <p>Also, is the <code>fgets</code> loop I use a correct way to build the spider?</p>
    <p>I know that there are comparison websites out there that do this for me, but I find it to be a fun PHP project :)</p>