  1. The variable returns null all the time
<p>My program is a web crawler. I'm trying to download images from a website. In my web crawler I did:</p>
<pre><code>try
{
    HtmlAgilityPack.HtmlDocument doc = TimeOut.getHtmlDocumentWebClient(mainUrl, false, "", 0, "", "");
    if (doc == null)
    {
        if (wccfg.downloadcontent == true)
        {
            retwebcontent.retrieveImages(mainUrl);
        }
        failed = true;
        wccfg.failedUrls++;
        failed = false;
    }
</code></pre>
<p>For example, when doc is null, mainUrl contains:</p>
<pre><code>http://members.tripod.com/~VanessaWest/bundybowman2.jpg
</code></pre>
<p>Execution then jumps to the retrieveImages method in the other class:</p>
<pre><code>namespace GatherLinks
{
    class RetrieveWebContent
    {
        HtmlAgilityPack.HtmlDocument doc;
        string imgg;
        int images;

        public RetrieveWebContent()
        {
            images = 0;
        }

        public List&lt;string&gt; retrieveImages(string address)
        {
            try
            {
                doc = new HtmlAgilityPack.HtmlDocument();
                System.Net.WebClient wc = new System.Net.WebClient();
                List&lt;string&gt; imgList = new List&lt;string&gt;();
                doc.Load(wc.OpenRead(address));
                HtmlNodeCollection imgs = doc.DocumentNode.SelectNodes("//img[@src]");
                if (imgs == null) return new List&lt;string&gt;();

                foreach (HtmlNode img in imgs)
                {
                    if (img.Attributes["src"] == null)
                        continue;
                    HtmlAttribute src = img.Attributes["src"];

                    imgList.Add(src.Value);
                    if (src.Value.StartsWith("http") || src.Value.StartsWith("https") || src.Value.StartsWith("www"))
                    {
                        images++;
                        string[] arr = src.Value.Split('/');
                        imgg = arr[arr.Length - 1];
                        //imgg = Path.GetFileName(new Uri(src.Value).LocalPath);
                        //wc.DownloadFile(src.Value, @"d:\MyImages\" + imgg);
                        wc.DownloadFile(src.Value, "d:\\MyImages\\" + Guid.NewGuid() + ".jpg");
                    }
                }
                return imgList;
            }
            catch
            {
                Logger.Write("There Was Problem Downloading The Image: " + imgg);
                return null;
            }
        }
    }
}
</code></pre>
<p>Stepping through with a breakpoint, after this line executes:</p>
<pre><code>HtmlNodeCollection imgs = doc.DocumentNode.SelectNodes("//img[@src]");
</code></pre>
<p>the variable imgs is null. The null check on the next line then returns an empty list, so nothing is downloaded.</p>
<p>How can I solve this so that it downloads the image from <a href="http://members.tripod.com/~VanessaWest/bundybowman2.jpg" rel="nofollow">http://members.tripod.com/~VanessaWest/bundybowman2.jpg</a>?</p>
<p>EDIT:</p>
<pre><code>public List&lt;string&gt; retrieveImages(string address)
{
    try
    {
        doc = new HtmlAgilityPack.HtmlDocument();
        System.Net.WebClient wc = new System.Net.WebClient();
        List&lt;string&gt; imgList = new List&lt;string&gt;();
        doc.Load(wc.OpenRead(address));
        string t = doc.DocumentNode.InnerText;
        HtmlNodeCollection imgs = doc.DocumentNode.SelectNodes("//img//[@src]");
        if (imgs == null) return new List&lt;string&gt;();

        foreach (HtmlNode img in imgs)
        {
            if (img.Attributes["src"] == null)
                continue;
            HtmlAttribute src = img.Attributes["src"];
            wc.DownloadFile(src.Value, "d:\\MyImages\\" + Guid.NewGuid() + ".jpg");
            imgList.Add(src.Value);
            if (src.Value.StartsWith("http") || src.Value.StartsWith("https") || src.Value.StartsWith("www"))
            {
                images++;
                string[] arr = src.Value.Split('/');
                imgg = arr[arr.Length - 1];
                //imgg = Path.GetFileName(new Uri(src.Value).LocalPath);
                //wc.DownloadFile(src.Value, @"d:\MyImages\" + imgg);
                wc.DownloadFile(src.Value, "d:\\MyImages\\" + Guid.NewGuid() + ".jpg");
            }
        }
        return imgList;
    }
    catch
    {
        Logger.Write("There Was Problem Downloading The Image: " + imgg);
        return null;
    }
}
</code></pre>
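One possible explanation, offered as an assumption rather than a confirmed diagnosis: the example address ends in .jpg, so it may point at the image file itself and not at an HTML page. In that case there would be no img elements for SelectNodes to find, and a null result would be expected. The sketch below shows a hypothetical helper, LooksLikeImage (not part of the original crawler), that checks the URL's file extension so the caller can decide whether to parse the address as HTML or download it directly. The list of extensions is illustrative only.

```csharp
using System;
using System.IO;

static class ImageUrlCheck
{
    // Hypothetical helper: returns true when the URL path ends in a
    // common image extension. This is a heuristic; it does not inspect
    // the server's Content-Type header.
    public static bool LooksLikeImage(string address)
    {
        Uri uri;
        if (!Uri.TryCreate(address, UriKind.Absolute, out uri))
            return false;

        // AbsolutePath excludes the query string and fragment.
        string ext = Path.GetExtension(uri.AbsolutePath).ToLowerInvariant();
        return ext == ".jpg" || ext == ".jpeg" || ext == ".png" || ext == ".gif";
    }

    static void Main()
    {
        string address = "http://members.tripod.com/~VanessaWest/bundybowman2.jpg";

        if (LooksLikeImage(address))
        {
            // Assumed handling: save the address itself, for example with
            // WebClient.DownloadFile(address, targetPath), instead of
            // loading it into an HtmlDocument.
            Console.WriteLine("Direct image link: download it as a file.");
        }
        else
        {
            Console.WriteLine("Not an image link: parse it as HTML.");
        }
    }
}
```

If this assumption holds, retrieveImages could call such a check on address before doc.Load and fall back to the existing //img[@src] query only for pages that are actually HTML.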