Note that there are some explanatory texts on larger screens.

plurals
  1. PODistributed algorithm design
    text
    copied!<p>I've been reading Introduction to Algorithms and started to get a few ideas and questions popping up in my head. The one that's baffled me most is how you would approach designing an algorithm to schedule items/messages in a queue that is distributed.</p> <p>My thoughts have lead me to browsing Wikipedia on topics such as Sorting,Message queues,Sheduling, Distributed hashtables, to name a few.</p> <p><strong>The scenario:</strong> Say you wanted to have a system that queued messages (strings or some serialized object for example). A key feature of this system is to avoid any single point of failure. The system had to be distributed across multiple nodes within some cluster and had to consistently (or as best as possible) even the work load of each node within the cluster to avoid hotspots.</p> <p>You want to avoid the use of a master/slave design for replication and scaling (no single point of failure). The system totally avoids writing to disc and maintains in memory data structures. </p> <p>Since this is meant to be a queue of some sort the system should be able to use varying scheduling algorithms (FIFO,Earliest deadline,round robin etc...) to determine which message should be returned on the next request regardless of which node in the cluster the request is made to.</p> <p><strong>My initial thoughts</strong> I can imagine how this would work on a single machine but when I start thinking about how you'd distribute something like this questions like:</p> <p>How would I hash each message?</p> <p>How would I know which node a message was sent to?</p> <p>How would I schedule each item so that I can determine which message and from which node should be returned next?</p> <p>I started reading about distributed hash tables and how projects like Apache Cassandra use some sort of consistent hashing to distribute data but then I thought, since the query won't supply a key I need to know where the next item is and just supply it... This lead into reading about peer to peer protocols and how they approach the synchronization problem across nodes.</p> <p>So my question is, how would you approach a problem like the one described above, or is this too far fetched and is simply a stupid idea...? </p> <p>Just an overview, pointers,different approaches, pitfalls and benefits of each. The technologies/concepts/design/theory that may be appropriate. Basically anything that could be of use in understanding how something like this may work.</p> <p>And if you're wondering, no I'm not intending to implement anything like this, its just popped into my head while reading (It happens, I get distracted by wild ideas when I read a good book).</p> <p><strong>UPDATE</strong></p> <p>Another interesting point that would become an issue is <a href="http://wiki.apache.org/cassandra/DistributedDeletes" rel="nofollow">distributed deletes</a>.I know systems like Cassandra have tackled this by implementing <a href="http://wiki.apache.org/cassandra/HintedHandoff" rel="nofollow">HintedHandoff</a>,<a href="http://wiki.apache.org/cassandra/ReadRepair" rel="nofollow">Read Repair</a> and <a href="http://wiki.apache.org/cassandra/AntiEntropy" rel="nofollow">AntiEntropy</a> and it seems to work work well but are there any other (viable and efficient) means of tackling this?</p>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload