  1. Unexpected behaviour while using HttpWebRequest on a form to obtain a table for scraping
<p>I am trying to scrape a website written in PHP to extract some information from a particular table. Here is the scenario.</p>
<p>On the landing page there is a form that takes queries from the user and searches for results based on them. If I ignore those fields and click "Submit", it produces the whole result set (which is what I am interested in). Previously I did not know about the HttpWebRequest class and was simply passing the URL to the HtmlWeb.Load(URL) method in the HtmlAgilityPack library, which obviously was not the way to go.</p>
<p>Then I searched for HttpWebRequest and found an example like this:</p>
<pre><code>Imports System.IO
Imports System.Net
Imports System.Text

' Cookie container shared by the request and its response
Dim cookies As New CookieContainer
Dim postData As String = "postData obtained using the Live HTTP Headers plugin in Firefox"
Dim encoding As New UTF8Encoding
Dim byteData As Byte() = encoding.GetBytes(postData)

' Build the POST request that submits the form
Dim postRequest As HttpWebRequest = DirectCast(WebRequest.Create("URL"), HttpWebRequest)
postRequest.Method = "POST"
postRequest.KeepAlive = True
postRequest.CookieContainer = cookies
postRequest.ContentType = "application/x-www-form-urlencoded"
postRequest.ContentLength = byteData.Length
postRequest.Referer = "Referer Page"
postRequest.UserAgent = "Mozilla/5.0 (Windows; U; Windows NT 6.1; ru; rv:1.9.2.3) Gecko/20100401 Firefox/4.0 (.NET CLR 3.5.30729)"

' Write the form data into the request body
Dim postreqstream As Stream = postRequest.GetRequestStream()
postreqstream.Write(byteData, 0, byteData.Length)
postreqstream.Close()

' Send the request and read the whole response body into a string
Dim postresponse As HttpWebResponse
postresponse = DirectCast(postRequest.GetResponse(), HttpWebResponse)
cookies.Add(postresponse.Cookies)
Dim postreqreader As New StreamReader(postresponse.GetResponseStream())
Dim thepage As String = postreqreader.ReadToEnd
</code></pre>
<p>Now when I output the thepage variable to a browser in a VB form, I can see the page that I want (containing the tables). At this point I simply passed the URL of that page to HtmlAgilityPack like so:</p>
<pre><code>Dim web As New HtmlAgilityPack.HtmlWeb()
Dim htmlDoc As HtmlAgilityPack.HtmlDocument = web.Load("URL")
Dim tabletag As HtmlNodeCollection = htmlDoc.DocumentNode.SelectNodes("//table")
Dim tablenode As HtmlNode = htmlDoc.DocumentNode.SelectSingleNode("//table[@summary='List of services']")
If tabletag IsNot Nothing Then
    Console.WriteLine("YES")
End If
</code></pre>
<p>But the tabletag variable is Nothing. Where am I going wrong? Also, is there any way to get the URL straight from the HttpWebResponse so I can pass it to the web.Load method?</p>
<p>Thank you.</p>
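<p>For illustration, here is a minimal sketch of one way to parse the HTML that is already in the thepage string, instead of downloading the page a second time by URL. It assumes the HtmlAgilityPack package is referenced; the module name TableScraper and the function ExtractRows are hypothetical names chosen for this example, and the XPath is the one from the question.</p>
<pre><code>Imports System
Imports System.Collections.Generic
Imports HtmlAgilityPack

' Hypothetical helper module, not part of HtmlAgilityPack
Module TableScraper
    ' Parses HTML that is already in memory (for example the thepage string)
    ' and returns the trimmed text of each row of the target table.
    Function ExtractRows(ByVal html As String) As List(Of String)
        Dim rows As New List(Of String)

        ' LoadHtml parses a string directly, so no second web request is made
        Dim htmlDoc As New HtmlAgilityPack.HtmlDocument()
        htmlDoc.LoadHtml(html)

        Dim tablenode As HtmlNode = htmlDoc.DocumentNode.SelectSingleNode("//table[@summary='List of services']")
        If tablenode Is Nothing Then Return rows

        ' SelectNodes returns Nothing when there are no matches, so check first
        Dim trNodes As HtmlNodeCollection = tablenode.SelectNodes(".//tr")
        If trNodes Is Nothing Then Return rows

        For Each tr As HtmlNode In trNodes
            rows.Add(tr.InnerText.Trim())
        Next
        Return rows
    End Function
End Module
</code></pre>
<p>With the code from the question, this could be called as TableScraper.ExtractRows(thepage). The final URL of the response is available as postresponse.ResponseUri, but passing it to web.Load issues a fresh GET request without the posted form data or cookies, which is a likely reason the table is missing from that second download.</p>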
 
