StackOverflow2013


plurals

POWhy the performance difference between C# (quite a bit slower) and Win32/C?
primarykey
Id
1060049
data
AcceptedAnswerId
0
AnswerCount
7
ClosedDate
CommentCount
5
CommunityOwnedDate
CreationDate
2009-06-29T19:29:00.773
FavoriteCount
3
LastActivityDate
2017-01-16T20:38:40.997
LastEditDate
2009-06-29T20:07:04.490
LastEditorUserId
6932
OwnerUserId
130631
ParentId
0
PostTypeId
1
Score
16
ViewCount
1868
LastEditorDisplayName
text
Body
We are looking to migrate a performance-critical application to .NET and find that the C# version is 30% to 100% slower than the Win32/C version, depending on the processor (the difference is more marked on a mobile T7200 processor). I have a very simple code sample that demonstrates this. For brevity I shall show only the C version; the C# is a direct translation:
<pre><code>#include "stdafx.h"
#include "Windows.h"

int array1[100000];
int array2[100000];

int Test();

int main(int argc, char* argv[])
{
    int res = Test();
    return 0;
}

int Test()
{
    int calc, i, k;
    calc = 0;

    for (i = 0; i < 50000; i++) array1[i] = i + 2;
    for (i = 0; i < 50000; i++) array2[i] = 2 * i - 2;

    for (i = 0; i < 50000; i++)
    {
        for (k = 0; k < 50000; k++)
        {
            if (array1[i] == array2[k]) calc = calc - array2[i] + array1[k];
            else calc = calc + array1[i] - array2[k];
        }
    }
    return calc;
}
</code></pre>
If we look at the disassembly in Win32 for the 'else' we have:
<pre><code>35:       else calc = calc + array1[i] - array2[k];
004011A0 jmp Test+0FCh (004011bc)
004011A2 mov eax,dword ptr [ebp-8]
004011A5 mov ecx,dword ptr [ebp-4]
004011A8 add ecx,dword ptr [eax*4+48DA70h]
004011AF mov edx,dword ptr [ebp-0Ch]
004011B2 sub ecx,dword ptr [edx*4+42BFF0h]
004011B9 mov dword ptr [ebp-4],ecx
</code></pre>
(This is a debug build, but bear with me.) The disassembly for the optimised C# version, using the CLR debugger on the optimised exe:
<pre><code>          else calc = calc + pev_tmp[i] - gat_tmp[k];
000000a7 mov eax,dword ptr [ebp-4]
000000aa mov edx,dword ptr [ebp-8]
000000ad mov ecx,dword ptr [ebp-10h]
000000b0 mov ecx,dword ptr [ecx]
000000b2 cmp edx,dword ptr [ecx+4]
000000b5 jb 000000BC
000000b7 call 792BC16C
000000bc add eax,dword ptr [ecx+edx*4+8]
000000c0 mov edx,dword ptr [ebp-0Ch]
000000c3 mov ecx,dword ptr [ebp-14h]
000000c6 mov ecx,dword ptr [ecx]
000000c8 cmp edx,dword ptr [ecx+4]
000000cb jb 000000D2
000000cd call 792BC16C
000000d2 sub eax,dword ptr [ecx+edx*4+8]
000000d6 mov dword ptr [ebp-4],eax
</code></pre>
Many more instructions, presumably the cause of the performance difference. So, three questions really:
<ol>
<li>Am I looking at the correct disassembly for the two programs, or are the tools misleading me?</li>
<li>If the difference in the number of generated instructions is not the cause of the performance difference, what is?</li>
<li>What can we possibly do about it, other than keep all our performance-critical code in a native DLL?</li>
</ol>
Thanks in advance, Steve

P.S. I did receive an invite recently to a joint MS/Intel seminar entitled something like 'Building performance critical native applications'. Hmm...
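In the C# listing, each array access is preceded by a `cmp` of the index against a value loaded from `[ecx+4]`, a `jb` over a `call`, and only then the indexed read at `[ecx+edx*4+8]`. That pattern looks like an array bounds check: compare the index with a length stored next to the data, and call a helper if it is out of range. The C sketch below imitates that shape so the two loop bodies can be compared side by side. It is an illustration under that assumption, not a reproduction of what the CLR does: `IntArray`, `checked_get`, `make_arrays`, `run_checked` and `run_unchecked` are hypothetical names, and the accumulator is widened to `long long` only so the sketch avoids signed overflow.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a managed array: the length travels with the data. */
typedef struct {
    unsigned length;
    int *data;
} IntArray;

/* Read one element, checking the index against the stored length first.
   The unsigned comparison also rejects negative indices. */
static int checked_get(const IntArray *a, int index)
{
    if ((unsigned)index >= a->length) {
        fprintf(stderr, "index %d out of range\n", index);
        abort();
    }
    return a->data[index];
}

/* Allocate both arrays and fill them the same way Test() does. */
static void make_arrays(int n, int **a1, int **a2)
{
    *a1 = malloc((size_t)n * sizeof **a1);
    *a2 = malloc((size_t)n * sizeof **a2);
    if (*a1 == NULL || *a2 == NULL) {
        fprintf(stderr, "out of memory\n");
        exit(EXIT_FAILURE);
    }
    for (int i = 0; i < n; i++) {
        (*a1)[i] = i + 2;
        (*a2)[i] = 2 * i - 2;
    }
}

/* Nested loops with direct indexing, as in the C version. */
long long run_unchecked(int n)
{
    int *a1, *a2;
    long long calc = 0;
    if (n <= 0) return 0;
    make_arrays(n, &a1, &a2);
    for (int i = 0; i < n; i++) {
        for (int k = 0; k < n; k++) {
            if (a1[i] == a2[k]) calc = calc - a2[i] + a1[k];
            else                calc = calc + a1[i] - a2[k];
        }
    }
    free(a1);
    free(a2);
    return calc;
}

/* Same loops, but every element read goes through checked_get(). */
long long run_checked(int n)
{
    int *p1, *p2;
    long long calc = 0;
    if (n <= 0) return 0;
    make_arrays(n, &p1, &p2);
    IntArray a1 = { (unsigned)n, p1 };
    IntArray a2 = { (unsigned)n, p2 };
    for (int i = 0; i < n; i++) {
        for (int k = 0; k < n; k++) {
            if (checked_get(&a1, i) == checked_get(&a2, k))
                calc = calc - checked_get(&a2, i) + checked_get(&a1, k);
            else
                calc = calc + checked_get(&a1, i) - checked_get(&a2, k);
        }
    }
    free(p1);
    free(p2);
    return calc;
}
```

Both functions return the same value for the same `n`; only the work per element read differs. Timing `run_checked(50000)` against `run_unchecked(50000)` gives a rough feel for the cost of the checks, though an optimising C compiler may inline or hoist them, so the numbers would not necessarily match what the JIT produces.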
Tags
<c#><.net><c><performance><winapi>
Title
Why the performance difference between C# (quite a bit slower) and Win32/C?
singulars
PostAcceptedAnswerId
1. This table or related slice is empty.
PostParentId
1. This table or related slice is empty.
PostTypePostTypeId
1. PTQuestion
UserLastEditorUserId
1. USGreg D
UserOwnerUserId
1. USSteve
plurals
PostLinksPostIdRelatedPostId
1. PL
 singulars
 LinkTypeLinkTypeId
 LTLinked
2. PL
 singulars
 LinkTypeLinkTypeId
 LTLinked
PostLinksRelatedPostIdPostId
1. This table or related slice is empty.
PostsAcceptedAnswerId
1. This table or related slice is empty.
PostsParentIdCreationDate
1. PO
 singulars
 PostTypePostTypeId
 PTAnswer
2. PO
 singulars
 PostTypePostTypeId
 PTAnswer
3. PO
 singulars
 PostTypePostTypeId
 PTAnswer
VotesPostIdCreationDate
1. VO
 singulars
 PostPostId
 POWhy the performance difference between C# (quite a bit slower) and Win32/C?
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
2. VO
 singulars
 PostPostId
 POWhy the performance difference between C# (quite a bit slower) and Win32/C?
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
3. VO
 singulars
 PostPostId
 POWhy the performance difference between C# (quite a bit slower) and Win32/C?
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
CommentsPostId
