StackOverflow2013


plurals

PO
primarykey
Id
9283922
data
AcceptedAnswerId
0
AnswerCount
0
ClosedDate
CommentCount
10
CommunityOwnedDate
CreationDate
2012-02-14T20:45:08.077
FavoriteCount
0
LastActivityDate
2012-02-14T21:23:09.073
LastEditDate
2012-02-14T21:23:09.073
LastEditorUserId
922184
OwnerUserId
922184
ParentId
9283717
PostTypeId
2
Score
19
ViewCount
0
LastEditorDisplayName
text
Body
Short Answer: It's a compiler hiccup. x64 optimizer fail.
<hr>
Long Answer: The x86 version is very slow if SSE2 is disabled, but I'm able to reproduce the results with SSE2 enabled in x86. If you dive into the assembly of the inner-most loop, the x64 version has two extra memory copies at the end.
x86:
<pre><code>$LL71@main:
    movsd   xmm2, QWORD PTR [eax-8]
    movsd   xmm0, QWORD PTR [eax-16]
    movsd   xmm3, QWORD PTR [eax]
    movapd  xmm1, xmm0
    mulsd   xmm0, QWORD PTR __real@3fa60418a0000000
    movapd  xmm7, xmm2
    mulsd   xmm2, QWORD PTR __real@3f95810620000000
    mulsd   xmm7, xmm5
    mulsd   xmm1, xmm4
    addsd   xmm1, xmm7
    movapd  xmm7, xmm3
    mulsd   xmm3, QWORD PTR __real@3fdcccccc0000000
    mulsd   xmm7, xmm6
    add     eax, 24                 ; 00000018H
    addsd   xmm1, xmm7
    addsd   xmm0, xmm2
    movq    QWORD PTR [ecx], xmm1
    addsd   xmm0, xmm3
    movq    QWORD PTR [ecx+8], xmm0
    lea     edx, DWORD PTR [eax-16]
    add     ecx, 16                 ; 00000010H
    cmp     edx, esi
    jne     SHORT $LL71@main
</code></pre>
x64:
<pre><code>$LL175@main:
    movsdx  xmm3, QWORD PTR [rdx-8]
    movsdx  xmm5, QWORD PTR [rdx-16]
    movsdx  xmm4, QWORD PTR [rdx]
    movapd  xmm2, xmm3
    mulsd   xmm2, xmm6
    movapd  xmm0, xmm5
    mulsd   xmm0, xmm7
    addsd   xmm2, xmm0
    movapd  xmm1, xmm4
    mulsd   xmm1, xmm8
    addsd   xmm2, xmm1
    movsdx  QWORD PTR r$109492[rsp], xmm2
    mulsd   xmm5, xmm9
    mulsd   xmm3, xmm10
    addsd   xmm5, xmm3
    mulsd   xmm4, xmm11
    addsd   xmm5, xmm4
    movsdx  QWORD PTR r$109492[rsp+8], xmm5
    mov     rcx, QWORD PTR r$109492[rsp]
    mov     QWORD PTR [rax], rcx
    mov     rcx, QWORD PTR r$109492[rsp+8]
    mov     QWORD PTR [rax+8], rcx
    add     rax, 16
    add     rdx, 24
    lea     rcx, QWORD PTR [rdx-16]
    cmp     rcx, rbx
    jne     SHORT $LL175@main
</code></pre>
The x64 version has a lot more (unexplained) moves at the end of the loop. It computes both results into a stack temporary (<code>r$109492[rsp]</code>) and then copies them to the destination through <code>rcx</code>, which looks like some sort of memory-to-memory data copy.
<h1>EDIT:</h1>
It turns out that the x64 optimizer isn't able to optimize out the following copy:
<pre><code>(*i2) = r;
</code></pre>
This is why the inner loop has two extra memory copies.
If you change the loop to this:
<pre><code>std::for_each(m.begin(), m.end(),
    [&](const Vector& v)
    {
        i2->x = Dot(axisX, v);
        i2->y = Dot(axisY, v);
        ++i2;
    });
</code></pre>
This eliminates the copies. Now the x64 version is just as fast as the x86 version:
<pre><code>x86: 0.0249423
x64: 0.0249348
</code></pre>
Lesson Learned: Compilers aren't perfect.
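For reference, here is a minimal sketch of the two loop shapes side by side. The question's own type definitions are not reproduced in this answer, so `Vector2`, the field layout of `Vector`, the body of `Dot`, and the two wrapper function names are assumptions made for illustration only. Both variants compute the same values; the only difference is whether the result goes through a temporary `r` first.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical types: stand-ins for the question's definitions.
struct Vector  { double x, y, z; };
struct Vector2 { double x, y; };

// Assumed to be a plain 3-component dot product.
inline double Dot(const Vector& a, const Vector& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Variant A: fill a temporary `r`, then copy it through the iterator.
// This is the `(*i2) = r;` form that produced the extra moves on x64.
std::vector<Vector2> ProjectViaTemporary(const std::vector<Vector>& m,
                                         const Vector& axisX,
                                         const Vector& axisY) {
    std::vector<Vector2> out(m.size());
    auto i2 = out.begin();
    std::for_each(m.begin(), m.end(), [&](const Vector& v) {
        Vector2 r;
        r.x = Dot(axisX, v);
        r.y = Dot(axisY, v);
        (*i2) = r;   // struct copy from the temporary
        ++i2;
    });
    return out;
}

// Variant B: write each member directly into the destination element.
std::vector<Vector2> ProjectDirect(const std::vector<Vector>& m,
                                   const Vector& axisX,
                                   const Vector& axisY) {
    std::vector<Vector2> out(m.size());
    auto i2 = out.begin();
    std::for_each(m.begin(), m.end(), [&](const Vector& v) {
        i2->x = Dot(axisX, v);
        i2->y = Dot(axisY, v);
        ++i2;
    });
    return out;
}
```

Whether a given compiler emits the extra copies for Variant A depends on the compiler version and flags, so it is worth checking the generated assembly rather than assuming either outcome.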
Tags
Title
singulars
PostAcceptedAnswerId
1. This table or related slice is empty.
PostParentId
1. POWhy c++ program compiled for x64 platform is slower than compiled for x86?
 singulars
 PostTypePostTypeId
 PTQuestion
PostTypePostTypeId
1. PTAnswer
UserLastEditorUserId
1. USMysticial
UserOwnerUserId
1. USMysticial
plurals
PostLinksPostIdRelatedPostId
1. This table or related slice is empty.
PostLinksRelatedPostIdPostId
1. This table or related slice is empty.
PostsAcceptedAnswerId
1. POWhy c++ program compiled for x64 platform is slower than compiled for x86?
 singulars
 PostTypePostTypeId
 PTQuestion
PostsParentIdCreationDate
1. This table or related slice is empty.
VotesPostIdCreationDate
1. VO
 singulars
 PostPostId
 PO
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
2. VO
 singulars
 PostPostId
 PO
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
3. VO
 singulars
 PostPostId
 PO
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
CommentsPostId

Querying!

Guidance

A row detail

Detail views are divided into sections. All the information in the data section comes from columns in the selected row. The other sections display data from other, related rows.

Related rows are linked in either a to-one or a to-many fashion. The caption of a to-many relation links to a list view of the related table, filtered to the matching rows.

Try moving around until you find a non-empty to-many entry and click on the label to get to one. You can move back to the root by clicking on the database name in the header.