<p>Well, the way you're timing things looks pretty nasty to me. It would be much more sensible to just time the whole loop:</p>
<pre><code>var stopwatch = Stopwatch.StartNew();
for (int i = 1; i &lt; 100000000; i++)
{
    Fibo(100);
}
stopwatch.Stop();
Console.WriteLine("Elapsed time: {0}", stopwatch.Elapsed);
</code></pre>
<p>That way you're not at the mercy of tiny timings, floating point arithmetic and accumulated error.</p>
<p>Having made that change, see whether the "non-catch" version is still slower than the "catch" version.</p>
<p>EDIT: Okay, I've tried it myself - and I'm seeing the same result. Very odd. I wondered whether the try/catch was disabling some bad inlining, but using <code>[MethodImpl(MethodImplOptions.NoInlining)]</code> instead didn't help...</p>
<p>Basically you'll need to look at the optimized JITted code under cordbg, I suspect...</p>
<p>EDIT: A few more bits of information:</p>
<ul>
<li>Putting the try/catch around just the <code>n++;</code> line still improves performance, but not by as much as putting it around the whole block</li>
<li>If you catch a specific exception (<code>ArgumentException</code> in my tests) it's still fast</li>
<li>If you print the exception in the catch block it's still fast</li>
<li>If you rethrow the exception in the catch block it's slow again</li>
<li>If you use a finally block instead of a catch block it's slow again</li>
<li>If you use a finally block <em>as well as</em> a catch block, it's fast</li>
</ul>
<p>Weird...</p>
<p>EDIT: Okay, we have disassembly...</p>
<p>This is using the C# 2 compiler and .NET 2 (32-bit) CLR, disassembling with mdbg (as I don't have cordbg on my machine). I still see the same performance effects, even under the debugger. The fast version uses a <code>try</code> block around everything between the variable declarations and the return statement, with just a <code>catch{}</code> handler. Obviously the slow version is the same except without the try/catch.
The calling code (i.e. Main) is the same in both cases, and has the same assembly representation (so it's not an inlining issue).</p>
<p>Disassembled code for fast version:</p>
<pre><code>[0000] push ebp
[0001] mov ebp,esp
[0003] push edi
[0004] push esi
[0005] push ebx
[0006] sub esp,1Ch
[0009] xor eax,eax
[000b] mov dword ptr [ebp-20h],eax
[000e] mov dword ptr [ebp-1Ch],eax
[0011] mov dword ptr [ebp-18h],eax
[0014] mov dword ptr [ebp-14h],eax
[0017] xor eax,eax
[0019] mov dword ptr [ebp-18h],eax
*[001c] mov esi,1
[0021] xor edi,edi
[0023] mov dword ptr [ebp-28h],1
[002a] mov dword ptr [ebp-24h],0
[0031] inc ecx
[0032] mov ebx,2
[0037] cmp ecx,2
[003a] jle 00000024
[003c] mov eax,esi
[003e] mov edx,edi
[0040] mov esi,dword ptr [ebp-28h]
[0043] mov edi,dword ptr [ebp-24h]
[0046] add eax,dword ptr [ebp-28h]
[0049] adc edx,dword ptr [ebp-24h]
[004c] mov dword ptr [ebp-28h],eax
[004f] mov dword ptr [ebp-24h],edx
[0052] inc ebx
[0053] cmp ebx,ecx
[0055] jl FFFFFFE7
[0057] jmp 00000007
[0059] call 64571ACB
[005e] mov eax,dword ptr [ebp-28h]
[0061] mov edx,dword ptr [ebp-24h]
[0064] lea esp,[ebp-0Ch]
[0067] pop ebx
[0068] pop esi
[0069] pop edi
[006a] pop ebp
[006b] ret
</code></pre>
<p>Disassembled code for slow version:</p>
<pre><code>[0000] push ebp
[0001] mov ebp,esp
[0003] push esi
[0004] sub esp,18h
*[0007] mov dword ptr [ebp-14h],1
[000e] mov dword ptr [ebp-10h],0
[0015] mov dword ptr [ebp-1Ch],1
[001c] mov dword ptr [ebp-18h],0
[0023] inc ecx
[0024] mov esi,2
[0029] cmp ecx,2
[002c] jle 00000031
[002e] mov eax,dword ptr [ebp-14h]
[0031] mov edx,dword ptr [ebp-10h]
[0034] mov dword ptr [ebp-0Ch],eax
[0037] mov dword ptr [ebp-8],edx
[003a] mov eax,dword ptr [ebp-1Ch]
[003d] mov edx,dword ptr [ebp-18h]
[0040] mov dword ptr [ebp-14h],eax
[0043] mov dword ptr [ebp-10h],edx
[0046] mov eax,dword ptr [ebp-0Ch]
[0049] mov edx,dword ptr [ebp-8]
[004c] add eax,dword ptr [ebp-1Ch]
[004f] adc edx,dword ptr [ebp-18h]
[0052] mov dword ptr [ebp-1Ch],eax
[0055] mov dword ptr [ebp-18h],edx
[0058] inc esi
[0059] cmp esi,ecx
[005b] jl FFFFFFD3
[005d] mov eax,dword ptr [ebp-1Ch]
[0060] mov edx,dword ptr [ebp-18h]
[0063] lea esp,[ebp-4]
[0066] pop esi
[0067] pop ebp
[0068] ret
</code></pre>
<p>In each case the <code>*</code> shows where the debugger entered in a simple "step-into".</p>
<p>EDIT: Okay, I've now looked through the code and I think I can see how each version works... and I believe the slower version is slower because it uses fewer registers and more stack space. For small values of <code>n</code> that's possibly faster - but when the loop takes up the bulk of the time, it's slower.</p>
<p>Possibly the try/catch block <em>forces</em> more registers to be saved and restored, so the JIT uses those for the loop as well... which happens to improve the performance overall. It's not clear whether it's a reasonable decision for the JIT to <em>not</em> use as many registers in the "normal" code.</p>
<p>EDIT: Just tried this on my x64 machine. The x64 CLR is <em>much</em> faster (about 3-4 times faster) than the x86 CLR on this code, and under x64 the try/catch block doesn't make a noticeable difference.</p>
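<p>For anyone who wants to repeat the comparison, here is a rough sketch of the two variants timed over the whole loop. The body of <code>Fibo</code> isn't shown in this answer, so the loop below is only a hypothetical stand-in, and the names <code>FiboBench</code>, <code>FiboPlain</code>, <code>FiboWithTry</code> and <code>TimeLoop</code> are made up for illustration. The delegate call in <code>TimeLoop</code> adds its own overhead, so treat the numbers as indicative at best.</p>

```csharp
using System;
using System.Diagnostics;

static class FiboBench
{
    // Hypothetical stand-in for the method under test ("slow" shape: no try/catch).
    // The result wraps silently on overflow for large n (default unchecked arithmetic).
    public static long FiboPlain(int n)
    {
        long n1 = 0, n2 = 1, fibo = 0;
        n++;
        for (int i = 1; i < n; i++)
        {
            n1 = n2;
            n2 = fibo;
            fibo = n1 + n2;
        }
        return fibo;
    }

    // Same loop, with a try block around everything between the variable
    // declarations and the return statement, and an empty catch handler.
    public static long FiboWithTry(int n)
    {
        long n1 = 0, n2 = 1, fibo = 0;
        try
        {
            n++;
            for (int i = 1; i < n; i++)
            {
                n1 = n2;
                n2 = fibo;
                fibo = n1 + n2;
            }
        }
        catch { }
        return fibo;
    }

    // Times the whole loop with a single Stopwatch rather than summing tiny timings.
    public static TimeSpan TimeLoop(Func<int, long> fibo, int iterations)
    {
        var stopwatch = Stopwatch.StartNew();
        for (int i = 1; i < iterations; i++)
        {
            fibo(100);
        }
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }

    public static void Main()
    {
        const int iterations = 10000000; // smaller than the 100000000 above, to keep the run short
        Console.WriteLine("Without try/catch: {0}", TimeLoop(FiboPlain, iterations));
        Console.WriteLine("With try/catch:    {0}", TimeLoop(FiboWithTry, iterations));
    }
}
```

<p>Build it in release mode and run it outside the debugger. Whether the two timings differ at all will depend on the runtime and platform; as noted above, I only saw the effect on the 32-bit CLR, not on x64.</p>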
 
