Why does adding local variables make .NET code slower?
<p>Why does commenting out the first two lines of this for loop and uncommenting the third result in a 42% speedup?</p>
<pre><code>int count = 0;
for (uint i = 0; i &lt; 1000000000; ++i) {
    var isMultipleOf16 = i % 16 == 0;
    count += isMultipleOf16 ? 1 : 0;
    //count += i % 16 == 0 ? 1 : 0;
}
</code></pre>
<p>Behind the timing difference is vastly different assembly code: 13 vs. 7 instructions in the loop. The platform is Windows 7 running .NET 4.0 x64. Code optimization is enabled, and the test app was run outside VS2010. [<strong>Update:</strong> <a href="https://github.com/breyed/PerfTest" rel="nofollow noreferrer">Repro project</a>, useful for verifying project settings.]</p>
<p>Eliminating the intermediate boolean is a fundamental optimization, one of the simplest in my 1980s-era <a href="http://en.wikipedia.org/wiki/Compilers:_Principles,_Techniques,_and_Tools" rel="nofollow noreferrer">Dragon Book</a>. Why was the optimization not applied, either when generating the CIL or when JIT-compiling the x64 machine code?</p>
<p>Is there a "Really compiler, I would like you to optimize this code, please" switch? While I sympathize with the sentiment that premature optimization is akin to the <a href="http://www.biblegateway.com/passage/?search=1%20Timothy%206:10&amp;version=ESV" rel="nofollow noreferrer">love of money</a>, I can see how frustrating it would be to profile a complex algorithm that had problems like this scattered throughout its routines. You would work through the hotspots but have no hint of the broader warm region that could be vastly improved by hand-tweaking what we normally take for granted from the compiler. I sure hope I'm missing something here.</p>
<p><strong>Update:</strong> Speed differences also occur for x86, but they depend on the order in which methods are just-in-time compiled. See <a href="https://stackoverflow.com/questions/10406796/why-does-jit-order-affect-performance">Why does JIT order affect performance?</a></p>
<p><strong>Assembly code</strong> (as requested):</p>
<pre><code>    var isMultipleOf16 = i % 16 == 0;
00000037  mov   eax,edx
00000039  and   eax,0Fh
0000003c  xor   ecx,ecx
0000003e  test  eax,eax
00000040  sete  cl
    count += isMultipleOf16 ? 1 : 0;
00000043  movzx eax,cl
00000046  test  eax,eax
00000048  jne   0000000000000050
0000004a  xor   eax,eax
0000004c  jmp   0000000000000055
0000004e  xchg  ax,ax
00000050  mov   eax,1
00000055  lea   r8d,[rbx+rax]
</code></pre>
<pre><code>    count += i % 16 == 0 ? 1 : 0;
00000037  mov   eax,ecx
00000039  and   eax,0Fh
0000003c  je    0000000000000042
0000003e  xor   eax,eax
00000040  jmp   0000000000000047
00000042  mov   eax,1
00000047  lea   edx,[rbx+rax]
</code></pre>
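<p>For anyone who wants to reproduce the comparison without the linked project, here is a minimal timing sketch. It is not the repro project's code: the class name <code>LocalVariableTiming</code>, the method names, and the <code>limit</code> parameter are made up for illustration, and it assumes a Release build run outside the debugger. Because results can depend on the order in which methods are JIT-compiled, the numbers it prints should be treated as indicative only.</p>
<pre><code>using System;
using System.Diagnostics;

public static class LocalVariableTiming
{
    // Variant from the question that stores the test in a local first.
    public static int CountWithLocal(uint limit)
    {
        int count = 0;
        for (uint i = 0; i &lt; limit; ++i)
        {
            var isMultipleOf16 = i % 16 == 0;
            count += isMultipleOf16 ? 1 : 0;
        }
        return count;
    }

    // Variant from the question that uses the expression directly.
    public static int CountInline(uint limit)
    {
        int count = 0;
        for (uint i = 0; i &lt; limit; ++i)
        {
            count += i % 16 == 0 ? 1 : 0;
        }
        return count;
    }

    // Hypothetical helper: runs one variant and prints its elapsed time.
    static void Time(string label, Func&lt;uint, int&gt; variant, uint limit)
    {
        variant(1000); // warm-up call so JIT compilation is not timed
        var stopwatch = Stopwatch.StartNew();
        int result = variant(limit);
        stopwatch.Stop();
        Console.WriteLine("{0}: count={1}, {2} ms",
            label, result, stopwatch.ElapsedMilliseconds);
    }

    public static void Main()
    {
        const uint limit = 1000000000; // same iteration count as the question
        Time("with local", CountWithLocal, limit);
        Time("inline", CountInline, limit);
    }
}
</code></pre>
<p>Both variants should report the same count, so any difference in elapsed time comes from the generated code rather than from the work performed.</p>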