Let me acknowledge this up front: the following code is written in C#, not C, but the idea it illustrates carries over directly. Most of the content comes from a draft Microsoft document on parallel patterns.

To do the best static partitioning possible, you need to be able to accurately predict ahead of time how long all the iterations will take. That is rarely feasible, so a more dynamic partitioning is needed, one where the system can adapt to changing workloads quickly. We can address this by shifting to the other end of the partitioning trade-off spectrum, with as much load-balancing as possible.

To do that, rather than pushing a fixed set of indices to each thread, we have the threads compete for iterations. We keep a pool of the remaining iterations to be processed, which initially contains all of them. Until every iteration has been processed, each thread goes to the pool, removes an iteration value, processes it, and repeats. In this manner we achieve, in a greedy fashion, an approximation of the optimal load-balancing (the true optimum could only be achieved with a priori knowledge of exactly how long each iteration would take). If one thread gets stuck on a particularly long iteration, the other threads compensate by processing work from the pool in the meantime. Of course, even with this scheme you can still end up with a far-from-optimal partitioning (for example, if one thread happens to get stuck with several pieces of work significantly larger than the rest), but without knowing how much processing time a given piece of work will require, there is little more that can be done.

Here is an example implementation that takes load-balancing to this extreme. The pool of iteration values is maintained as a single integer representing the next available iteration, and the threads involved in the processing "remove items" by atomically incrementing this integer:

```csharp
// Requires: using System; using System.Threading;
public static void MyParallelFor(
    int inclusiveLowerBound, int exclusiveUpperBound, Action<int> body)
{
    // Get the number of processors, initialize the number of remaining
    // work items, and set the starting point for the iteration.
    int numProcs = Environment.ProcessorCount;
    int remainingWorkItems = numProcs;
    int nextIteration = inclusiveLowerBound;

    using (ManualResetEvent mre = new ManualResetEvent(false))
    {
        // Create each of the work items.
        for (int p = 0; p < numProcs; p++)
        {
            ThreadPool.QueueUserWorkItem(delegate
            {
                // Atomically claim the next iteration index until the
                // range is exhausted.
                int index;
                while ((index = Interlocked.Increment(
                           ref nextIteration) - 1) < exclusiveUpperBound)
                {
                    body(index);
                }

                // The last worker to finish signals completion.
                if (Interlocked.Decrement(ref remainingWorkItems) == 0)
                    mre.Set();
            });
        }

        // Wait for all workers to complete.
        mre.WaitOne();
    }
}
```
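
For reference, here is a minimal usage sketch. The `Demo` class, the iteration range, and the `Thread.Sleep` delay are illustrative assumptions, not part of the original; it assumes the `MyParallelFor` method above is placed in the same class as `Main`:

```csharp
using System;
using System.Threading;

static class Demo
{
    // Paste the MyParallelFor method from above into this class.

    static void Main()
    {
        MyParallelFor(0, 100, i =>
        {
            // Simulate iterations of varying cost so there is real
            // load imbalance for the dynamic partitioning to absorb.
            Thread.Sleep(i % 10);
            Console.WriteLine(
                "Iteration {0} ran on thread {1}",
                i, Thread.CurrentThread.ManagedThreadId);
        });

        Console.WriteLine("All iterations completed.");
    }
}
```

Because iterations are handed out one index at a time, every claim costs an interlocked operation; that overhead is the price of sitting at the maximal load-balancing end of the partitioning spectrum.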
 
