<p>I threw together a quick-and-dirty benchmark in C# using a prime generator as a test. The test generates primes up to a constant limit (I chose 500000) using a simple Sieve of Eratosthenes implementation and repeats the test 800 times, parallelized over a specific number of threads, using either the .NET <code>ThreadPool</code> or standalone threads.</p>
<p>The test was run on a quad-core Q6600 running Windows Vista (x64). It does not use the Task Parallel Library, just simple threads. It was run for the following scenarios:</p>
<ul>
<li>Serial execution (no threading)</li>
<li>4 threads (i.e. one per core), using the <code>ThreadPool</code></li>
<li>40 threads using the <code>ThreadPool</code> (to test the efficiency of the pool itself)</li>
<li>4 standalone threads</li>
<li>40 standalone threads, to simulate context-switching pressure</li>
</ul>
<p>The results were:</p>
<pre><code>Test | Threads | ThreadPool | Time
-----+---------+------------+--------
1    | 1       | False      | 00:00:17.9508817
2    | 4       | True       | 00:00:05.1382026
3    | 40      | True       | 00:00:05.3699521
4    | 4       | False      | 00:00:05.2591492
5    | 40      | False      | 00:00:05.0976274
</code></pre>
<p>Conclusions one can draw from this:</p>
<ul>
<li><p>Parallelization isn't perfect (as expected; it never is, no matter the environment), but splitting the load across 4 cores yields about 3.5x the throughput, which is hardly anything to complain about.</p></li>
<li><p>There was negligible difference between 4 and 40 threads using the <code>ThreadPool</code>, which means that no significant expense is incurred with the pool, even when you bombard it with requests.</p></li>
<li><p>There was negligible difference between the <code>ThreadPool</code> and free-threaded versions, which means that the <code>ThreadPool</code> does not have any significant "constant" expense.</p></li>
<li><p>There was negligible difference between the 4-thread and 40-thread free-threaded versions, which means that .NET doesn't perform any worse than one would expect it to with heavy context-switching.</p></li>
</ul>
<p>Do we even need a C++ benchmark to compare to? The results are pretty clear: threads in .NET are not slow. Unless <strong>you</strong>, the programmer, write poor multi-threading code and end up with resource starvation or lock convoys, you really don't need to worry.</p>
<p>With .NET 4.0, the TPL, and the improvements to the <code>ThreadPool</code> (work-stealing queues and all that cool stuff), you have even more leeway to write "questionable" code and still have it run efficiently. You don't get these features out of the box from C++.</p>
<p>For reference, here is the test code:</p>
<pre><code>using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Runtime.CompilerServices;
using System.Threading;

namespace ThreadingTest
{
    class Program
    {
        private static int PrimeMax = 500000;   // upper limit for the sieve
        private static int TestRunCount = 800;  // total sieve runs per test

        static void Main(string[] args)
        {
            Console.WriteLine("Test | Threads | ThreadPool | Time");
            Console.WriteLine("-----+---------+------------+--------");
            RunTest(1, 1, false);
            RunTest(2, 4, true);
            RunTest(3, 40, true);
            RunTest(4, 4, false);
            RunTest(5, 40, false);
            Console.WriteLine("Done!");
            Console.ReadLine();
        }

        static void RunTest(int sequence, int threadCount, bool useThreadPool)
        {
            TimeSpan duration = Time(() =&gt; GeneratePrimes(threadCount, useThreadPool));
            Console.WriteLine("{0} | {1} | {2} | {3}",
                sequence.ToString().PadRight(4),
                threadCount.ToString().PadRight(7),
                useThreadPool.ToString().PadRight(10),
                duration);
        }

        // Measures the wall-clock time taken by the given action.
        static TimeSpan Time(Action action)
        {
            Stopwatch sw = new Stopwatch();
            sw.Start();
            action();
            sw.Stop();
            return sw.Elapsed;
        }

        static void GeneratePrimes(int threadCount, bool useThreadPool)
        {
            // Serial case: run everything on the calling thread.
            if (threadCount == 1)
            {
                TestPrimes(TestRunCount);
                return;
            }

            // Split the runs evenly; 800 divides cleanly by both 4 and 40.
            int testsPerThread = TestRunCount / threadCount;
            int remaining = threadCount;
            using (ManualResetEvent finishedEvent = new ManualResetEvent(false))
            {
                for (int i = 0; i &lt; threadCount; i++)
                {
                    Action testAction = () =&gt;
                    {
                        TestPrimes(testsPerThread);
                        // The last worker to finish signals the waiting thread.
                        if (Interlocked.Decrement(ref remaining) == 0)
                        {
                            finishedEvent.Set();
                        }
                    };

                    if (useThreadPool)
                    {
                        ThreadPool.QueueUserWorkItem(s =&gt; testAction());
                    }
                    else
                    {
                        ThreadStart ts = new ThreadStart(testAction);
                        Thread th = new Thread(ts);
                        th.Start();
                    }
                }
                finishedEvent.WaitOne();
            }
        }

        // Enumerates the lazy sequence so the primes are actually produced;
        // NoOptimization keeps the JIT from discarding the otherwise unused loop.
        [MethodImpl(MethodImplOptions.NoOptimization)]
        static void IteratePrimes(IEnumerable&lt;int&gt; primes)
        {
            int count = 0;
            foreach (int prime in primes) { count++; }
        }

        static void TestPrimes(int testRuns)
        {
            for (int t = 0; t &lt; testRuns; t++)
            {
                var primes = Primes.GenerateUpTo(PrimeMax);
                IteratePrimes(primes);
            }
        }
    }
}
</code></pre>
<p>And here is the prime generator:</p>
<pre><code>using System;
using System.Collections.Generic;
using System.Linq;

namespace ThreadingTest
{
    public class Primes
    {
        public static IEnumerable&lt;int&gt; GenerateUpTo(int maxValue)
        {
            if (maxValue &lt; 2)
                return Enumerable.Empty&lt;int&gt;();

            // primes[i] stays true for as long as i is still a prime candidate.
            bool[] primes = new bool[maxValue + 1];
            for (int i = 2; i &lt;= maxValue; i++)
                primes[i] = true;

            // Cross off multiples of each prime, starting at its square.
            for (int i = 2; i &lt; Math.Sqrt(maxValue + 1) + 1; i++)
            {
                if (primes[i])
                {
                    for (int j = i * i; j &lt;= maxValue; j += i)
                        primes[j] = false;
                }
            }

            // Lazily yields every index in [2, maxValue] that is still marked.
            return Enumerable.Range(2, maxValue - 1).Where(i =&gt; primes[i]);
        }
    }
}
</code></pre>
<p>If you see any obvious flaws in the test, let me know. Barring any serious problems with the test itself, I think the results speak for themselves, and the message is clear:</p>
<p><strong>Don't listen to anyone who makes overly broad and unqualified statements about how the performance of .NET or any other language/environment is "bad" in some particular area, because they are probably talking out of their... rear ends.</strong></p>
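<p>As a sanity check on the "about 3.5x" figure in the conclusions, the speedup and per-core efficiency can be recomputed from the measured times. This is a minimal sketch and not part of the original benchmark: the <code>SpeedupReport</code> class, its method names, and the <code>ThreadingTestAnalysis</code> namespace are hypothetical, and the two timings are copied by hand from tests 1 and 2 in the results table.</p>
<pre><code>using System;

namespace ThreadingTestAnalysis // hypothetical namespace, not in the original code
{
    public static class SpeedupReport
    {
        // Speedup is the serial time divided by the parallel time.
        public static double Speedup(double serialSeconds, double parallelSeconds)
        {
            return serialSeconds / parallelSeconds;
        }

        // Efficiency is the speedup per core; 1.0 would be perfect scaling.
        public static double Efficiency(double speedup, int cores)
        {
            return speedup / cores;
        }

        public static void Main()
        {
            const int cores = 4;                      // the Q6600 is a quad-core CPU
            const double serialSeconds = 17.9508817;  // test 1: no threading
            const double pooledSeconds = 5.1382026;   // test 2: 4 ThreadPool threads

            double speedup = Speedup(serialSeconds, pooledSeconds);
            Console.WriteLine("Speedup:    {0:F2}x", speedup);
            Console.WriteLine("Efficiency: {0:P0}", Efficiency(speedup, cores));
        }
    }
}
</code></pre>
<p>With the numbers above, the speedup comes out just under 3.5x, which is somewhat below 90% efficiency on four cores and matches the first conclusion. Swapping in the times from tests 3 to 5 gives much the same picture.</p>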
 
