<h1>How to implement an IIS-like threadpool on a worker-server</h1>
<p><strong>EDIT</strong> I realised my question was not stated clearly enough and have edited it heavily.<br> This is a bit of an open-ended question, so apologies in advance.</p>

<p><em>In a nutshell, I want to implement IIS-style asynchronous request processing in an Azure worker role</em>.</p>

<p>It may be very simple or it may be insanely hard - I am looking for pointers on where to research.</p>

<p>While my implementation will use Azure Workers and Service Bus Queues, the general principle is applicable to any scenario where a worker process is listening for incoming requests and then servicing them.</p>

<h2>What IIS does</h2>

<p>In IIS there is a fixed-size threadpool. If you deal with all requests synchronously, the maximum number of requests you can deal with in parallel == maxthreads. However, if you have to do slow external I/O to serve requests, this is highly inefficient, because you can end up with the server being idle yet have all threads tied up waiting for external I/O to complete.</p>

<p>From <a href="http://msdn.microsoft.com/en-us/library/ee728598.aspx" rel="nofollow">MSDN</a>:</p>

<blockquote>
<p>On the Web server, the .NET Framework maintains a pool of threads that are used to service ASP.NET requests. When a request arrives, a thread from the pool is dispatched to process that request. If the request is processed synchronously, the thread that processes the request is blocked while the request is being processed, and that thread cannot service another request.</p>
<p>This might not be a problem, because the thread pool can be made large enough to accommodate many blocked threads. However, the number of threads in the thread pool is limited. In large applications that process multiple simultaneous long-running requests, all available threads might be blocked. This condition is known as thread starvation. When this condition is reached, the Web server queues requests.
If the request queue becomes full, the Web server rejects requests with an HTTP 503 status (Server Too Busy).</p>
</blockquote>

<p>In order to overcome this issue, IIS has some clever logic that allows you to deal with requests asynchronously:</p>

<blockquote>
<p>When an asynchronous action is invoked, the following steps occur:</p>
<ol>
<li><p>The Web server gets a thread from the thread pool (the worker thread) and schedules it to handle an incoming request. This worker thread initiates an asynchronous operation.</p></li>
<li><p>The worker thread is returned to the thread pool to service another Web request.</p></li>
<li><p>When the asynchronous operation is complete, it notifies ASP.NET.</p></li>
<li><p>The Web server gets a worker thread from the thread pool (which might be a different thread from the thread that started the asynchronous operation) to process the remainder of the request, including rendering the response.</p></li>
</ol>
</blockquote>

<p>The important point here is that when the asynchronous request returns, the continuation is scheduled to run on the <em>same</em> pool of threads that serves the initial incoming requests. This means the system is limiting how much work it is doing concurrently, and this is what I would like to replicate.</p>

<h2>What I want to do</h2>

<p>I want to create a Worker role which will listen for incoming work requests on Azure Service Bus Queues and also potentially on TCP sockets. Like IIS, I want to have a maximum threadpool size and I want to limit how much <em>actual work</em> the worker is doing in parallel; if the worker is busy serving existing requests - whether new incoming ones or the callbacks from previous async calls - I don't want to pick up any new incoming requests until some threads have been freed up.</p>

<p>It is not a problem to limit how many jobs I <em>start</em> concurrently - that is easy to control; the hard part is limiting how many I am <em>actually working on</em> concurrently.
</p>

<p>Let's assume a threadpool of 100 threads.</p>

<ul>
<li><p>100 requests to send an email come in, and each email takes 5 seconds to send to the SMTP server. If I limit my server to only process 100 requests at the same time, then my server will be unable to do anything else for 5 seconds, while the CPU is completely idle. So I don't really mind starting to send 1,000 or 10,000 emails at the same time, because 99% of the "request process time" will be spent waiting for external I/O and my server will still be very quiet. That particular scenario I could deal with by just keeping on accepting incoming requests with no limit (or only limiting the <em>start</em> of the request until I fire off the async call; as soon as BeginSend is called, I'll return and start serving another request).</p></li>
<li><p>Now, imagine instead that I have a type of request that goes to the database to read some data, does some heavy calculation on it and then writes that back to the database. There are two database requests there that should be made asynchronous, but 90% of the request processing time will be spent on my worker. So, if I follow the same logic as above - keep starting async calls and just let the continuations grab whatever thread they need to carry on - then I will end up with a server that is very overloaded.</p></li>
</ul>

<p><em>Somehow, what IIS does is make sure that when an async call returns, the continuation uses the same fixed-size thread pool. This means that if I fire off a lot of async calls and they then return and start using my threads, IIS will not accept new requests until those continuations have finished. And that is perfect, because it ensures a sensible load on the server, especially when I have multiple load-balanced servers and a queue system that the servers pick work from.</em></p>

<p>I have this sneaky suspicion that this might be very simple to do; there is just something basic I am missing. Or maybe it is insanely hard.</p>
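<p>To make the principle concrete, here is a toy sketch of what I am after, in Python rather than my actual .NET code (the queue, the simulated I/O and all the names are invented for illustration, not Azure or IIS APIs). A semaphore counts free "slots" on a fixed-size pool; the dispatcher only pulls a new request off the queue when a slot is free, and an async completion must also win a slot before its continuation runs on that same pool - so continuations and new requests compete for one fixed budget:</p>

```python
import queue
import threading
import time
from concurrent.futures import ThreadPoolExecutor

POOL_SIZE = 4                            # the fixed-size "worker threadpool"
slots = threading.Semaphore(POOL_SIZE)   # counts free slots on that pool
done = threading.Semaphore(0)            # counts fully finished requests
pool = ThreadPoolExecutor(max_workers=POOL_SIZE)
requests = queue.Queue()
results = []

def begin_io(data, callback):
    # Simulated async I/O: it runs off the pool (like BeginSend), then the
    # continuation must win a slot before it is scheduled on the SAME pool.
    def io():
        time.sleep(0.01)                 # external latency; no pool thread held
        slots.acquire()                  # compete with new requests for a slot
        pool.submit(callback, data)
    threading.Thread(target=io).start()

def handle(req):
    # Runs on a pool thread: fire the async I/O, then release the slot so
    # the thread can go back to serving other requests.
    def continuation(data):
        try:
            results.append(data * 2)     # CPU-bound epilogue, still on the pool
        finally:
            slots.release()
            done.release()
    begin_io(req, continuation)
    slots.release()

def dispatch(n):
    # Only pull a new request off the queue when a slot is free.
    for _ in range(n):
        req = requests.get()
        slots.acquire()
        pool.submit(handle, req)

for i in range(10):
    requests.put(i)
dispatch(10)
for _ in range(10):                      # wait for every continuation
    done.acquire()
pool.shutdown(wait=True)
print(sorted(results))                   # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
```

<p>Note the deliberate asymmetry: slots are only ever acquired off the pool (in the dispatcher or in the I/O completion thread), never on a pool thread, so the pool can never deadlock waiting for its own slots. In .NET I imagine <code>SemaphoreSlim</code> plus a custom <code>TaskScheduler</code> would play these roles, but that is exactly the part I am unsure about.</p>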
 
