<p>There is no generic, reliable way to achieve what you want.</p>
<p>Bear in mind that websites differ from one another and can change at any moment, so trying to build an infallible algorithm is a waste of time in most situations.</p>
<p>In this case, if you only have a few websites to parse, you can work out the current content layout of each one and parse it with HTML Agility Pack (HAP). For example:</p>
<p><strong>24matins</strong>: There is a div with the class "post-header" whose first <code>&lt;img&gt;</code> is the main article image, so with HAP you could write:</p>
<pre><code>using System;
using HtmlAgilityPack;

var web = new HtmlWeb();
var doc = web.Load("http://www.24matins.fr/the-walking-dead-saison-4-le-deces-de-ce-personnage-ne-sera-pas-anodin-40685");
// SelectSingleNode returns the first match, or null if the page layout has changed
var img = doc.DocumentNode.SelectSingleNode("//div[@class='post-header']/img");
Console.WriteLine(img.Attributes["src"].Value);
</code></pre>
<p><strong>lasemainedansleboulonnais</strong>: There is a single div with the class "illustrations", so:</p>
<pre><code>// Reuses the web, doc and img variables declared in the previous snippet
web = new HtmlWeb();
doc = web.Load("http://www.lasemainedansleboulonnais.fr/actualite/la_une/2013/04/04/article__20_ans_prison_meurtre_de_sa_mere_boulogne.shtml");
img = doc.DocumentNode.SelectSingleNode("//div[@class='illustrations']/img");
Console.WriteLine(img.Attributes["src"].Value);
</code></pre>
<p>I would also suggest using the sites' RSS feeds to get the relevant information. They generally include the article pictures and are more likely to follow a recognizable pattern, as you can see in <a href="http://www.24matins.fr/feed/rss-toutes-actualites" rel="nofollow">www.24matins.fr/feed/rss-toutes-actualites</a>.</p>
<p>Hope it helps.</p>
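The RSS suggestion can be sketched roughly as follows. This is a minimal illustration only: it parses a made-up inline feed rather than the real one, and it assumes each item exposes its picture through an `<enclosure>` element, which is one common convention but not guaranteed for any particular site (some feeds use other elements or embed an image in the description). The class name `RssImageSketch` and the sample URLs are hypothetical.

```csharp
using System;
using System.Xml.Linq;

class RssImageSketch
{
    static void Main()
    {
        // Hypothetical feed content; a real feed could be loaded with XDocument.Load(feedUrl).
        const string sampleRss = @"<rss version=""2.0""><channel>
  <item>
    <title>Example article</title>
    <enclosure url=""http://example.com/article.jpg"" type=""image/jpeg"" />
  </item>
  <item>
    <title>Article without a picture</title>
  </item>
</channel></rss>";

        var feed = XDocument.Parse(sampleRss);

        foreach (var item in feed.Descendants("item"))
        {
            var title = (string)item.Element("title");
            // Assumption: the picture is in <enclosure url="...">; the cast yields null when it is absent.
            var image = (string)item.Element("enclosure")?.Attribute("url");
            Console.WriteLine($"{title}: {image ?? "(no image found)"}");
        }
    }
}
```

Before relying on this, inspect the actual feed to see which element carries the image and adjust the lookup accordingly.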