HTTP request over TCP dropping data?
<p>I am writing a DownloadString function to retrieve HTML data (since WebClient lacks quite a bit of speed =/).</p>
<p>Here's what I have so far:</p>
<pre><code>public static string DownloadString(string url)
{
    TcpClient client = new TcpClient();
    client.Client.ReceiveTimeout = 5;
    string dns = UrlToDNS(url);
    byte[] buffer = new byte[51200];
    client.Client.Connect(dns, 80);
    string getVal = url.Substring(url.IndexOf(dns) + dns.Length);
    string HTTPHeader = "GET " + getVal + " HTTP/1.1\nHost: " + dns + "\nConnection: close\nUser-Agent: Pastebin API 0.1\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\nAccept-Charset: ISO-8859-1,UTF-8;q=0.7,*;q=0.7\nCache-Control: no-cache\nAccept-Language: en;q=0.7,en-us;q=0.3\n\n";
    client.Client.Send(s2b(HTTPHeader));
    client.Client.Receive(buffer);
    return b2s(buffer);
}

private static string b2s(byte[] ba)
{
    string ret = "";
    foreach (byte b in ba)
        ret += Convert.ToChar(b);
    return ret;
}
</code></pre>
<p>(s2b is not shown; it is not needed here, since the HTTP server returns OK.)</p>
<p>However, when I run the code (with <a href="http://www.google.com/" rel="nofollow">http://www.google.com/</a> as a test), it seems that some of the data is dropped or never read:</p>
<pre><code>HTTP/1.1 200 OK
Date: Sat, 20 Aug 2011 15:18:28 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
Set-Cookie: PREF=ID=3714446c9ffb56bf:FF=0:TM=1313853508:LM=1313853508:S=mu1XpTcwqFTwgwJM; expires=Mon, 19-Aug-2013 15:18:28 GMT; path=/; domain=.google.com
Set-Cookie: NID=50=B8YKlYj7eK84obqC5YO10AKF9jJNcQ5w4NkzidRL9of0Sc24EpbWeP-w7HVfm-eBCfE2NX2QMZAfEBpsqsgjhWqylFUIXU-bs6ObkLQbXJ59sa_daivfBLYJkQvq_WH; expires=Sun, 19-Feb-2012 15:18:28 GMT; path=/; domain=.google.com; HttpOnly
Server: gws
X-XSS-Protection: 1; mode=block
Connection: close

&lt;!doctype html&gt;&lt;html&gt;&lt;head&gt;&lt;meta http-equiv="content-type" content="text/html; charset=ISO-8859-1"&gt;&lt;meta name="description" content="Search the world&amp;#39;s information, including webpages, images, videos and more. Google has many special features to help you find exactly what you&amp;#39;re looking for."&gt;&lt;meta name="robots" content="noodp"&gt;&lt;title&gt;Google&lt;/title&gt;&lt;script&gt;window.google={kEI:"RNBPTvPcI5C_gQeywpHfBg",getEI:function(a){var b;while(a&amp;&amp;!(a.getAttribute&amp;&amp;(b=a.getAttribute("eid"))))a=a.parentNode;return b||google.kEI},kEXPI:"28936,29049,29774,30465,30542,31760",kCSI:{e
</code></pre>
<p>To add another complication, the amount of dropped data varies from run to run. I haven't gotten consistent results: sometimes only a small amount is lost, and sometimes (as in the example) a larger amount.</p>
<p>Any ideas on what is causing it? (Or is there a better method of retrieving the source code of a web page without WebClient?)</p>
<p>(Also, please ignore the fact that the input and output data haven't been sanitized.)</p>
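<p>One likely cause, offered as a sketch and not a definitive fix: a single <code>Receive</code> call returns only the bytes that have arrived so far, so the amount read depends on timing, and <code>ReceiveTimeout</code> is measured in milliseconds. The version below reads in a loop until the server closes the connection. The helper name <code>ReadAll</code>, the separate <code>host</code> and <code>path</code> parameters, and the 5000 ms timeout are assumptions made for illustration. It is written as top-level statements (C# 9 or later); on older compilers the two functions would go inside a class.</p>

```csharp
using System;
using System.IO;
using System.Net.Sockets;
using System.Text;

// Read from the stream until the peer closes the connection.
// Read may return fewer bytes than requested, so loop until it returns 0.
static byte[] ReadAll(Stream stream)
{
    var result = new MemoryStream();
    var chunk = new byte[8192];
    int n;
    while ((n = stream.Read(chunk, 0, chunk.Length)) > 0)
    {
        // Keep only the n bytes actually received in this call.
        result.Write(chunk, 0, n);
    }
    return result.ToArray();
}

// Hypothetical variant of DownloadString: host and path are passed
// separately instead of being derived from a URL.
static string DownloadString(string host, string path)
{
    using (var client = new TcpClient())
    {
        client.ReceiveTimeout = 5000; // milliseconds (assumed value)
        client.Connect(host, 80);

        // HTTP header lines end with CRLF, and a blank line ends the header block.
        string request = "GET " + path + " HTTP/1.1\r\n" +
                         "Host: " + host + "\r\n" +
                         "Connection: close\r\n\r\n";
        byte[] requestBytes = Encoding.ASCII.GetBytes(request);

        NetworkStream stream = client.GetStream();
        stream.Write(requestBytes, 0, requestBytes.Length);

        // "Connection: close" asks the server to close the socket when done,
        // so reading until end of stream collects the whole response.
        byte[] response = ReadAll(stream);
        return Encoding.GetEncoding("ISO-8859-1").GetString(response);
    }
}
```

<p>Calling <code>DownloadString("www.google.com", "/")</code> should return the status line, headers, and body as one string. Separating the headers from the body and decoding a chunked transfer encoding are left out of this sketch.</p>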
 
