HttpWebRequest Timeouts After Ten Consecutive Requests
<p>I'm writing a web crawler for a specific site. The application is a VB.Net Windows Forms application that is <em>not</em> using multiple threads - each web request is consecutive. However, after ten successful page retrievals every subsequent request times out.</p>
<p>I have reviewed the similar questions already posted here on SO, and have implemented the recommended techniques in my GetPage routine, shown below:</p>
<pre><code>' Requires Imports System.IO and Imports System.Net at the top of the file.
Public Function GetPage(ByVal url As String) As String
    Dim result As String = String.Empty
    Dim uri As New Uri(url)

    ' Raise the per-host connection limit for this endpoint.
    Dim sp As ServicePoint = ServicePointManager.FindServicePoint(uri)
    sp.ConnectionLimit = 100

    Dim request As HttpWebRequest = DirectCast(WebRequest.Create(uri), HttpWebRequest)
    request.KeepAlive = False
    request.Timeout = 15000

    Try
        Using response As HttpWebResponse = DirectCast(request.GetResponse(), HttpWebResponse)
            Using dataStream As Stream = response.GetResponseStream()
                Using reader As New StreamReader(dataStream)
                    If response.StatusCode &lt;&gt; HttpStatusCode.OK Then
                        Throw New Exception("Got response status code: " &amp; response.StatusCode.ToString())
                    End If
                    result = reader.ReadToEnd()
                End Using
            End Using
            ' Redundant with the enclosing Using block, which already disposes the response.
            response.Close()
        End Using
    Catch ex As Exception
        Dim msg As String = "Error reading page """ &amp; url &amp; """. " &amp; ex.Message
        Logger.LogMessage(msg, LogOutputLevel.Diagnostics)
    End Try

    Return result
End Function
</code></pre>
<p>Have I missed something? Am I failing to close or dispose of an object that I should be? It seems strange that it always happens after ten consecutive requests.</p>
<p>Notes: </p>
<ol>
<li><p>In the constructor for the class in which this method resides I have the following:</p> <p><code>ServicePointManager.DefaultConnectionLimit = 100</code></p></li>
<li><p>If I set KeepAlive to true, the timeouts begin after five requests.</p></li>
<li><p>All the requests are for pages in the same domain.</p></li>
</ol>
<p><strong>EDIT</strong></p>
<p>I added a delay of between two and seven seconds between web requests so that I do not appear to be "hammering" the site or attempting a DoS attack. However, the problem still occurs.</p>
 
