Java Apache HttpClient EntityUtils block
<p>I am currently working on a project that uses Apache HttpClient 4.1.2 to retrieve data from a website.</p>
<p>What the application does: it loads a page and then follows the pages it finds until it reaches the end (e.g. go to page 1 -> find 20 more pages -> visit each of those 20 pages). The problem is that it gets stuck while retrieving some random pages and does not continue the crawl.</p>
<p>Here is some code:</p>
<pre><code>DefaultHttpClient mainHttp;
HttpPost post;
HttpResponse response;
HttpEntity entity;
String s;
int curPage = 1;
int index = 0;
boolean ok = true;
...
while (ok) {
    response = mainHttp.execute(post);
    entity = response.getEntity();
    if (entity != null) {
        System.out.println("Enter " + curPage);
        s = EntityUtils.toString(entity);
        System.out.println("Exit " + curPage);
        index = s.indexOf("[" + curPage + "]");
        if (index &gt; 0) {
            parseContent();
        } else {
            ok = false;
        }
    }
}
</code></pre>
<p>The debug window shows something like this:</p>
<pre><code>Enter 1
Exit 1
.
.
.
Enter n
</code></pre>
<p>I am also using an HTTP request analyzer, and I saw that on the page where it gets stuck the data is not retrieved completely (it doesn't reach the <code>&lt;/html&gt;</code> tag, i.e. the end of the page).</p>
<p>What can I do to skip or retry downloading the data in such cases? Can anyone help me?</p>
<p>Thank you!</p>
<p><strong>Later edit:</strong></p>
<p>The actual settings were:</p>
<pre><code>mainHttp.setHttpRequestRetryHandler(new DefaultHttpRequestRetryHandler(1, true));
mainHttp.getParams().setParameter("http.connection-manager.timeout", 15000);
mainHttp.getParams().setParameter("http.socket.timeout", 15000);
mainHttp.getParams().setParameter("http.connection.timeout", 15000);
</code></pre>
<p>where <code>15000</code> is the timeout in milliseconds.</p>
<p>Thank you for your help.</p>
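One possible shape for the "skip or retry" behaviour asked about above is a small wrapper around the download step. This is only a sketch under stated assumptions: `RetryingFetcher` and `fetchWithRetry` are hypothetical names, not part of Apache HttpClient, and the `Callable` stands in for whatever performs `mainHttp.execute(post)` followed by `EntityUtils.toString(entity)`. It also assumes that a stalled read eventually ends, either with an exception (for example a socket timeout) or with a truncated body. It cannot help if the call never returns at all.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Hypothetical helper for illustration; not part of Apache HttpClient. */
final class RetryingFetcher {

    private RetryingFetcher() {}

    /**
     * Calls {@code fetch} up to {@code maxAttempts} times and returns the first
     * body that contains a closing {@code </html>} tag. Returns {@code null} when
     * every attempt fails or yields a truncated page, so the caller can skip it.
     */
    static String fetchWithRetry(Callable<String> fetch, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                String body = fetch.call();
                if (body != null && body.contains("</html>")) {
                    return body; // page looks complete
                }
                // Truncated or empty body: fall through and try again.
            } catch (IOException e) {
                // I/O failures, including SocketTimeoutException: try again.
            } catch (Exception e) {
                // Anything else is treated as a programming error, not a retry case.
                throw new IllegalStateException("Unexpected failure while fetching", e);
            }
        }
        return null; // give up on this page
    }
}
```

In the crawl loop, the body of the `Callable` would be the existing execute-and-read step, and a `null` result would mean "move on to the next page" instead of blocking the whole crawl.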