Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    text
    copied!<p>Your specific example is seldom solved with multi-threading. As many have said, this class of problems is IO-bound, meaning the processor has very little work to do, and spends most of it's time waiting for some data to arrive over the wire and to process that, and similarly it has to wait for disk buffers to flush so that it can put more of the recently downloaded data on disk. </p> <p>The method to performance is through the select() facility, or an equivalent system call. The basic process is to open a number of sockets (for the web crawler downloads) and file handles (for storing them to disk). Next you set all of the different sockets and fh to non-blocking mode, meaning that instead of making your program wait until data is available to read after issuing a request, it returns right away with a special code (usually EAGAIN) to indicate that no data is ready. If you looped through all of the sockets in this way you would be polling, which works well, but is still a waste of cpu resources because your reads and writes will almost always return with EAGAIN. </p> <p>To get around this, all of the sockets and fp's will be collected into a 'fd_set', which is passed to the select system call, then your program will block, waiting on ANY of the sockets, and will awaken your program when there's some data on any of the streams to process. </p> <hr> <p>The other common case, compute bound work, is without a doubt best addressed with some sort of true parallelism (as apposed to the asynchronous concurrency presented above) to access the resources of multiple cpu's. In the case that your cpu bound task is running on a single threaded archetecture, definately avoid any concurrency, as the overhead will actually slow your task down. </p>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload