StackOverflow2013


plurals

PO
primarykey
Id
11904057
data
AcceptedAnswerId
0
AnswerCount
0
ClosedDate
CommentCount
3
CommunityOwnedDate
CreationDate
2012-08-10T14:45:03.087
FavoriteCount
0
LastActivityDate
2012-08-10T14:45:03.087
LastEditDate
LastEditorUserId
0
OwnerUserId
97337
ParentId
11902468
PostTypeId
2
Score
3
ViewCount
0
LastEditorDisplayName
text
Body
The first thing I would suspect is alignment. You may want to experiment with: <pre><code>__attribute__ ((aligned (16))) float ...[maxsize]; </code></pre> Or make sure that <code>maxsize</code> is a multiple of 16. That could definitely cause a 10% hit if in one configuration you're aligned and in another you're not. Vector operations can be extremely sensitive to this. The next major issue you may have is a huge stack (assuming <code>maxsize</code> is fairly large). ARM can deal with numbers less than 4k much more efficiently than it can deal with numbers larger than 4k (because it can only deal with 12-bit immediate values). So depending on how the compiler has optimized it, pushing <code>amparray</code> way down on the stack could lead to more complicated math to access it. When small twiddly things lead to big performance changes, I always recommend pulling up the assembly (Product>Generate Output>Assembly) and seeing what changes in the compiler output. I also highly recommend <a href="http://www.coranac.com/tonc/text/asm.htm" rel="nofollow">Whirlwind Tour of ARM Assembly</a> to get you started understanding what you're looking at. (Make sure you set the output to "For Archiving" so you see the optimized result.) You should also do a few more things: <ul> <li>Try rewriting this routine as simple C instead of using Accelerate. Yes, I know Accelerate is always faster, except it's not. All those function calls are quite expensive, and the compiler can often better vectorize simple multiplication and addition than Accelerate can, in my experience. This is particularly true if your stride is 1, your vectors are not enormous, and you're on a 1-2 core device like an iPad. The moment you have code that handles a stride (if you don't need a stride), it's more complicated (slower) than the code you would have written by hand. 
In my experience, Accelerate does seem to be very good at ramps and transcendentals (cosines of big tables, for example), but not nearly so good at simple vector and matrix math.</li> <li>If this code really matters to you, I've found that hand-writing the assembly can definitely outpace the compiler. I'm not even that good at ARM assembler, and I've been able to beat the compiler by 2x on simple matrix math (and the compiler crushed Accelerate). I'm particularly talking about your loop here that seems to be doing just adds and multiplies. Handwriting the assembly is a pain of course, and you then have to maintain a C version for the assembler, but when it really matters it's really fast.</li> </ul>
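To make the first two suggestions concrete, here is a minimal sketch of what I mean by aligned buffers plus a plain stride-1 loop. The names (`MAXSIZE`, `amp`, `gain`, `out`, `mul_add`, `is_aligned16`) are hypothetical stand-ins, not taken from your code. I'm assuming GCC/Clang attribute syntax, and I've put the buffers at file scope rather than on the stack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAXSIZE 1024  /* hypothetical size, chosen as a multiple of 16 */

/* 16-byte aligned buffers at file scope, so they do not inflate the stack frame. */
static float amp[MAXSIZE]  __attribute__((aligned(16)));
static float gain[MAXSIZE] __attribute__((aligned(16)));
static float out[MAXSIZE]  __attribute__((aligned(16)));

/* Returns nonzero if p sits on a 16-byte boundary. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p % 16) == 0;
}

/* Plain stride-1 multiply-add: dst[i] = a[i] * b[i] + c.
 * restrict promises the compiler the buffers do not overlap,
 * which helps it vectorize the loop. */
static void mul_add(const float *restrict a, const float *restrict b,
                    float c, float *restrict dst, size_t n)
{
    /* Debug-time check of the alignment assumption. */
    assert(is_aligned16(a) && is_aligned16(b) && is_aligned16(dst));

    for (size_t i = 0; i < n; i++) {
        dst[i] = a[i] * b[i] + c;
    }
}
```

You would call it as `mul_add(amp, gain, 1.0f, out, MAXSIZE);`. Whether this beats the Accelerate version on your device is something only a measurement and a look at the generated assembly can tell you.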
Tags
Title
singulars
PostAcceptedAnswerId
1. This table or related slice is empty.
PostParentId
1. POWhy does order of array declaration affect performance so much?
 singulars
 PostTypePostTypeId
 PTQuestion
PostTypePostTypeId
1. PTAnswer
UserLastEditorUserId
1. This table or related slice is empty.
UserOwnerUserId
1. USRob Napier
plurals
PostLinksPostIdRelatedPostId
1. This table or related slice is empty.
PostLinksRelatedPostIdPostId
1. This table or related slice is empty.
PostsAcceptedAnswerId
1. POWhy does order of array declaration affect performance so much?
 singulars
 PostTypePostTypeId
 PTQuestion
PostsParentIdCreationDate
1. This table or related slice is empty.
VotesPostIdCreationDate
1. VO
 singulars
 PostPostId
 PO
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTAcceptedByOriginator
2. VO
 singulars
 PostPostId
 PO
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
3. VO
 singulars
 PostPostId
 PO
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
CommentsPostId
