# Architecture of processing incoming requests in a service
I'm designing a server daemon for a project that has to accept a large number of simultaneous requests and process them asynchronously. I'm aware of the sheer scale of such a project, but I'm serious about it and am trying to produce a clear design and plan before going further.

Here's a list of my goals:

- Scalability - must be able to parallelize the architecture onto multiple processor cores or even multiple servers.
- Ability to cope with a huge number of parallel connections.
- A single request that takes a long time to process must not block anything else.
- Request-to-response turnaround time must be minimal.
- Built on the .NET Framework (I'll be writing this in C#).

My proposed architecture and flow is rather complicated, so here's a chart of my initial design:

![Architecture Flow Chart](https://i.stack.imgur.com/pxIqp.png)

(and [here it is on tinypic](http://i39.tinypic.com/2lwm8oh.png) in case it resizes badly)

The idea is that requests come in via the network (though I haven't yet decided whether TCP or UDP is best) and are passed immediately to a high-speed load balancer. The load balancer selects a request queue (RQ) to place each request in, using a weighted random number generator; the weights are derived from the size of each queue. The reason for using a weighted RNG, rather than simply picking the least busy queue, is that it prevents an empty but blocked queue (due to a hung request) from locking up the whole server. If all RQs exceed a certain size, the load balancer drops the request and places a "server too busy" response into the output queue (OPQ) - *this part isn't shown in the diagram*. (A sketch of the selection logic appears below.)

Each queue corresponds to a thread whose affinity is set to one CPU core on the server (also sketched below). These threads form the parallel request processor, which consumes requests from each queue. Requests are categorized into one of three types:

1. **Immediate** - processed immediately, as the name suggests.

2. **Deferrable** - considered low priority. They are processed immediately during low load, or placed into the deferred request queue (DRQ) if load is high. The load balancer fetches these deferred requests from the DRQ, marks them as immediate, then places them back into appropriate RQs.

3. **Timed** - placed into the timed request queue (TRQ) along with their target timestamp (sketched below). These requests are often generated as a result of another request, rather than being explicitly sent in by a client. When a request's timestamp is exceeded, the next available request processor thread consumes and processes it.

When a request is processed, data may be fetched from a key/value cache in memory, from a key/value cache on disk, or from a dedicated SQL database server. The cached values will be BSON, indexed by string keys. I'm thinking of using `Dictionary<TKey,TValue>` for the in-memory cache (see the note below) and a B-tree (or similar) for the disk cache.

The response is created when processing is complete, and it is placed into the output queue (OPQ). A loop then consumes responses from the OPQ and transmits them back to the client over the network.

If the OPQ reaches 80% of its maximum size, a quarter of the request processor threads are halted. If it reaches 90%, half of the request processor threads are halted. If it reaches its maximum size, all request processor threads are halted. This will be achieved with a semaphore, which should also prevent individual request processor threads from getting blocked and leaving stale requests. (One way to realize this is sketched below.)
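To make the load-balancing step concrete, here is a minimal sketch of the weighted selection, assuming one dedicated load-balancer thread (so the non-thread-safe `Random` is acceptable) and a placeholder `Request` type; all names here are illustrative, not part of the design above. Each queue's weight is its remaining capacity, so a backed-up queue is chosen less and less often, and never once it's full:

```csharp
using System;
using System.Collections.Concurrent;

// Placeholder for whatever the real request type ends up being.
class Request { }

// Sketch: choose an RQ with probability proportional to its remaining
// capacity. A full queue gets weight zero and is never chosen; if every
// queue is full, Select() returns null and the caller should place a
// "server too busy" response on the OPQ instead.
class WeightedQueueSelector
{
    private readonly ConcurrentQueue<Request>[] _queues;
    private readonly int _maxQueueSize;
    private readonly Random _rng = new Random(); // fine if only the balancer thread calls Select()

    public WeightedQueueSelector(ConcurrentQueue<Request>[] queues, int maxQueueSize)
    {
        _queues = queues;
        _maxQueueSize = maxQueueSize;
    }

    public ConcurrentQueue<Request> Select()
    {
        var weights = new int[_queues.Length];
        int total = 0;
        for (int i = 0; i < _queues.Length; i++)
        {
            weights[i] = Math.Max(0, _maxQueueSize - _queues[i].Count);
            total += weights[i];
        }
        if (total == 0) return null; // all queues full

        int ticket = _rng.Next(total); // uniform in [0, total)
        for (int i = 0; i < _queues.Length; i++)
        {
            if (ticket < weights[i]) return _queues[i];
            ticket -= weights[i];
        }
        return null; // unreachable: the ticket always lands in some bucket
    }
}
```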
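Pinning a thread to one core is slightly awkward, since .NET exposes no direct affinity API for managed threads. One known workaround, sketched here under the assumption of a default CLR host on Windows, is to resolve the underlying OS thread and set its affinity mask:

```csharp
using System;
using System.Diagnostics;
using System.Runtime.InteropServices;
using System.Threading;

// Sketch of pinning the calling thread to a single core.
// BeginThreadAffinity asks the CLR host not to migrate the managed
// thread to a different OS thread, so the affinity mask stays valid.
static class ThreadPinning
{
    [DllImport("kernel32.dll")]
    private static extern uint GetCurrentThreadId();

    public static void PinCurrentThreadToCore(int coreIndex)
    {
        Thread.BeginThreadAffinity();
        uint osThreadId = GetCurrentThreadId();
        foreach (ProcessThread pt in Process.GetCurrentProcess().Threads)
        {
            if (pt.Id == osThreadId)
            {
                // Single-bit mask; assumes fewer than 31 cores for simplicity.
                pt.ProcessorAffinity = new IntPtr(1 << coreIndex);
                break;
            }
        }
    }
}
```

Whether hard affinity actually wins anything is worth benchmarking; the Windows scheduler is already reasonably good at keeping busy threads on the same core.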
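For the TRQ, one simple option is to keep requests ordered by due time and let idle processor threads poll for anything that has fallen due. A minimal sketch (reusing the placeholder `Request` type from the first sketch, with a plain lock for simplicity; names are illustrative):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sketch of the timed request queue (TRQ): requests ordered by target
// timestamp. A Queue per timestamp handles requests due at the same instant.
class TimedRequestQueue
{
    private readonly object _lock = new object();
    private readonly SortedDictionary<DateTime, Queue<Request>> _due =
        new SortedDictionary<DateTime, Queue<Request>>();

    public void Schedule(Request request, DateTime dueUtc)
    {
        lock (_lock)
        {
            Queue<Request> bucket;
            if (!_due.TryGetValue(dueUtc, out bucket))
                _due[dueUtc] = bucket = new Queue<Request>();
            bucket.Enqueue(request);
        }
    }

    // Called by an available processor thread; returns a request whose
    // due time has passed, or null if nothing is due yet.
    public Request TryDequeueDue(DateTime nowUtc)
    {
        lock (_lock)
        {
            if (_due.Count == 0) return null;
            var earliest = _due.First();            // smallest key = soonest due time
            if (earliest.Key > nowUtc) return null; // nothing has fallen due
            var request = earliest.Value.Dequeue();
            if (earliest.Value.Count == 0) _due.Remove(earliest.Key);
            return request;
        }
    }
}
```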
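One note on the in-memory cache: a plain `Dictionary<TKey,TValue>` isn't safe for concurrent writers, so with several processor threads touching it you'd need external locking; `ConcurrentDictionary` (available since .NET 4) handles that internally. A minimal sketch, with the BSON documents stored as raw byte arrays purely for illustration:

```csharp
using System.Collections.Concurrent;

// Sketch of the in-memory cache tier: string keys mapping to serialized
// BSON documents (a BsonDocument type from whichever BSON library is
// chosen would work equally well as the value type).
class BsonMemoryCache
{
    private readonly ConcurrentDictionary<string, byte[]> _entries =
        new ConcurrentDictionary<string, byte[]>();

    public void Put(string key, byte[] bsonDocument)
    {
        _entries[key] = bsonDocument; // atomic insert-or-overwrite
    }

    public bool TryGet(string key, out byte[] bsonDocument)
    {
        return _entries.TryGetValue(key, out bsonDocument);
    }
}
```

A real cache would also need an eviction policy (size cap, LRU, or TTL) so the working set doesn't grow without bound, plus a miss path that falls through to the disk tier and then to the SQL server.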
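Finally, one way the semaphore-based throttling could work, sketched under the assumption that each processor thread holds a permit while working and a monitor thread periodically samples the OPQ fill level; the 80%/90%/100% thresholds match those above, and all names are illustrative:

```csharp
using System.Threading;

// Sketch of the OPQ throttle: every processor thread holds a permit
// while it works; a monitor thread confiscates permits as the OPQ
// fills, idling a matching fraction of threads, and hands them back
// as the queue drains.
class OutputQueueThrottle
{
    private readonly SemaphoreSlim _permits;
    private readonly int _totalThreads;
    private int _confiscated; // touched only by the monitor thread

    public OutputQueueThrottle(int processorThreadCount)
    {
        _totalThreads = processorThreadCount;
        _permits = new SemaphoreSlim(processorThreadCount, processorThreadCount);
    }

    // Processor threads wrap each unit of work in these two calls.
    public void BeginWork() { _permits.Wait(); }
    public void EndWork()   { _permits.Release(); }

    // Monitor thread calls this periodically with the OPQ fill ratio.
    public void Adjust(double opqFillRatio)
    {
        int target;
        if (opqFillRatio >= 1.0)      target = _totalThreads;     // halt all
        else if (opqFillRatio >= 0.9) target = _totalThreads / 2; // halt half
        else if (opqFillRatio >= 0.8) target = _totalThreads / 4; // halt a quarter
        else                          target = 0;

        // Grab free permits until enough threads are idled; Wait(0) never
        // blocks, so permits held by busy threads are simply picked up on
        // a later Adjust() call once those threads finish.
        while (_confiscated < target && _permits.Wait(0))
            _confiscated++;

        // Return permits as the OPQ drains.
        while (_confiscated > target)
        {
            _permits.Release();
            _confiscated--;
        }
    }
}
```

Because permits are only confiscated between requests, no thread is ever stopped mid-request; a busy thread simply fails to obtain a new permit once it finishes its current one.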
What I'm looking for are suggestions in a few areas:

- Are there any major flaws in this architecture that I've missed?
- Is there anything I should consider changing for performance reasons?
- Would TCP or UDP be more appropriate for requests? The "proof of delivery" that TCP gives would be very useful, but the lightweight nature of UDP is appealing too.
- Are there any special considerations when dealing with 100k+ simultaneous connections on a Windows server? I know Linux's TCP stack copes well, but I'm not so sure about Windows.
- Are there any other questions I should be asking? Have I forgotten to consider anything?

I know this was a lot to read, and probably quite a lot to ask too, so thank you for your time.

**Updated version of the diagram [here](http://i43.tinypic.com/w6t7r4.png).**