RSS feeds and image extraction in depth
    <p>I have spent some time trying to solve this problem, and this is as far as I've got. Basically, I'm trying to pull images from RSS feeds. I use Magpie to process the feeds, as shown below. This snippet is within a class:</p> <pre><code>/*
 * Collect the src URL of every &lt;img&gt; tag found in $str.
 */
function getImagesUrl($str) {
    $a = array();
    $pos = 0;
    while (true) {
        $pos = strpos($str, "img", $pos);
        if ($pos === false) {
            break; // no more "img" occurrences
        }
        $topos = strpos($str, "&gt;", $pos);
        if ($topos === false) {
            break; // unterminated tag, stop scanning
        }
        $imagetag = substr($str, $pos, ($topos - $pos));
        $url = $this-&gt;getImageUrl($imagetag);
        $pos = $topos;
        array_push($a, $url);
    }
    return $a;
}

/*
 * Get the full URL inside the src attribute of an &lt;img&gt; tag.
 */
function getImageUrl($image) {
    $p = strpos($image, "src=", 0);
    $p += 5; // skip past src="
    $tp = strpos($image, '" ', $p);
    $str = substr($image, $p, ($tp - $p));
    return $str;
}
</code></pre> <p>Using the above functions, I call them this way. So far this outputs the data I'll paste further down:</p> <pre><code>@$rss = fetch_rss($rsso-&gt;url);
if (@$rss) {
    $items = $rss-&gt;items;
    foreach ($items as $item) {
        if (isset($item['title']) &amp;&amp; isset($item['description'])) {
            $hash = md5($this-&gt;es($item['title']) . $this-&gt;es($item['description']));
            $content = $item['content'];
            foreach ($content as $c) {
                // get the images in the content
                $arr = $this-&gt;getImagesUrl($c);
                print_r($arr);
            }
        }
    }
}
</code></pre> <p>Here is an example of the output:</p> <pre><code>Array
(
    [0] =&gt; http://api.tweetmeme.com/imagebutton.gif?url=http://mashable.com/2010/09/25/trailmeme/
    [1] =&gt; http://cdn.mashable.com/wp-content/plugins/wp-digg-this/i/gbuzz-feed.png
    [2] =&gt; http://mashable.com/wp-content/plugins/wp-digg-this/i/fb.jpg
    [3] =&gt; http://mashable.com/wp-content/plugins/wp-digg-this/i/diggme.png
    [4] =&gt; http://ec.mashable.com/wp-content/uploads/2009/01/bizspark2.gif
    [5] =&gt; http://cdn.mashable.com/wp-content/uploads/2010/09/web.png
    [6] =&gt; http://mashable.com/wp-content/uploads/2010/09/Screen-shot-2010-09-24-at-10.51.26-PM.png
    [7] =&gt; http://cdn.mashable.com/wp-content/uploads/2009/02/bizspark.jpg
    [8] =&gt; http://feedads.g.doubleclick.net/~at/lxx00QTjYBaYojpnpnTa6MXUmh4/0/di
    [9] =&gt;
    [10] =&gt; http://feedads.g.doubleclick.net/~at/lxx00QTjYBaYojpnpnTa6MXUmh4/1/di
    [11] =&gt;
    [12] =&gt; http://feeds.feedburner.com/~ff/Mashable?i=0N_mvMwPHYk:j5Pmi_N-JQ8:D7DqB2pKExk
    [13] =&gt;
    [14] =&gt; http://feeds.feedburner.com/~ff/Mashable?i=0N_mvMwPHYk:j5Pmi_N-JQ8:V_sGLiPBpWU
    [15] =&gt;
    [16] =&gt; http://feeds.feedburner.com/~ff/Mashable?i=0N_mvMwPHYk:j5Pmi_N-JQ8:F7zBnMyn0Lo
    [17] =&gt;
    [18] =&gt; http://feeds.feedburner.com/~ff/Mashable?d=qj6IDK7rITs
    [19] =&gt;
    [20] =&gt; http://feeds.feedburner.com/~ff/Mashable?d=_e0tkf89iUM
    [21] =&gt;
    [22] =&gt; http://feeds.feedburner.com/~ff/Mashable?i=0N_mvMwPHYk:j5Pmi_N-JQ8:gIN9vFwOqvQ
    [23] =&gt;
    [24] =&gt; http://feeds.feedburner.com/~ff/Mashable?d=yIl2AUoC8zA
    [25] =&gt;
    [26] =&gt; http://feeds.feedburner.com/~ff/Mashable?d=P0ZAIrC63Ok
    [27] =&gt;
    [28] =&gt; http://feeds.feedburner.com/~ff/Mashable?d=I9og5sOYxJI
    [29] =&gt;
    [30] =&gt; http://feeds.feedburner.com/~ff/Mashable?d=CC-BsrAYo0A
    [31] =&gt;
    [32] =&gt; http://feeds.feedburner.com/~ff/Mashable?i=0N_mvMwPHYk:j5Pmi_N-JQ8:_cyp7NeR2Rw
    [33] =&gt;
    [34] =&gt; http://feeds.feedburner.com/~r/Mashable/~4/0N_mvMwPHYk
)
</code></pre> <p>Is there a way I can filter out the correct image URLs? For example, I would like to strip out URLs that do not end in an image extension such as jpg, png or gif. Secondly, I would like to discard URLs containing words such as bizspark, digg, facebook, tweet or twitter. Has anybody found an easier way of doing this? Please help me out.</p>
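    <p>For illustration, the two filtering rules described above (keep only image extensions, drop URLs containing unwanted words) could be sketched roughly as follows. The function name <code>filterImageUrls</code> and both default lists are assumptions made up for this example, not part of Magpie or of the code above; it relies only on PHP's built-in <code>parse_url</code>, <code>pathinfo</code> and <code>stripos</code>.</p>

```php
<?php
/**
 * Hypothetical helper: keep only URLs that look like real content images.
 * The extension list and the blocklist are illustrative defaults only.
 */
function filterImageUrls(
    array $urls,
    array $allowedExt = array('jpg', 'jpeg', 'png', 'gif'),
    array $blocked = array('bizspark', 'digg', 'facebook', 'tweet', 'twitter')
) {
    $kept = array();
    foreach ($urls as $url) {
        $url = trim((string) $url);
        if ($url === '') {
            continue; // drop the empty entries
        }
        // Inspect the path only, so query strings such as ?url=... are ignored.
        $path = parse_url($url, PHP_URL_PATH);
        if (!is_string($path)) {
            continue; // malformed URL or no path component
        }
        $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
        if (!in_array($ext, $allowedExt, true)) {
            continue; // not an allowed image extension
        }
        // Case-insensitive blocklist check against the whole URL.
        foreach ($blocked as $word) {
            if (stripos($url, $word) !== false) {
                continue 2; // skip this URL entirely
            }
        }
        $kept[] = $url;
    }
    return $kept;
}
```

    <p>Calling <code>filterImageUrls($arr)</code> on the array printed above would leave only the <code>web.png</code> and <code>Screen-shot-2010-09-24-at-10.51.26-PM.png</code> entries; note that <code>fb.jpg</code> is dropped only because its path happens to contain <code>wp-digg-this</code>, so the blocklist would likely need tuning per feed.</p>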