StackOverflow2013


plurals

PO
primarykey
Id
7735362
data
AcceptedAnswerId
0
AnswerCount
0
ClosedDate
CommentCount
11
CommunityOwnedDate
CreationDate
2011-10-12T04:44:14.717
FavoriteCount
0
LastActivityDate
2011-10-13T05:09:35.443
LastEditDate
2011-10-13T05:09:35.443
LastEditorUserId
6210
OwnerUserId
6210
ParentId
7734693
PostTypeId
2
Score
7
ViewCount
0
LastEditorDisplayName
text
Body
EDIT: Why is a block-oriented approach faster? We take advantage of the CPU's data cache by making sure the entire block fits into the cache, whether we iterate over the block by row or by column. For example, if you have a cache line of 32 bytes and an <code>int</code> is 4 bytes, an 8x8 <code>int</code> matrix fits into 8 cache lines. Assuming you have a big enough data cache, you can iterate over that matrix either by row or by column and be guaranteed not to thrash the cache. Another way to think about it: if your matrix fits in the cache, you can traverse it any way you want.
If you have a matrix that is much bigger, say 512x512, then you need to tune your matrix traversal so that you don't thrash the cache. For example, if you traverse the matrix in the opposite order of its memory layout (by column for a row-major C array), you will miss the cache on almost every element you visit. A block-oriented approach ensures that each cache miss brings in data you will finish visiting before the CPU has to evict that cache line. In other words, a block-oriented approach tuned to the cache line size will ensure you don't thrash the cache.
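To make the traversal-order argument concrete, here is a minimal sketch of a toy model that counts how often consecutive accesses land on a different cache line. The names <code>LINE_BYTES</code>, <code>line_of</code> and <code>line_switches</code> are made up for illustration, the 32-byte line size is just the figure from the example above, and the matrix is assumed to be row-major and to start on a cache line boundary. A line switch is not necessarily a miss: for an 8x8 block the column walk switches lines on every access but only ever touches 8 lines, which is why it stays cheap once those lines are cached.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed cache line size in bytes (the 32-byte figure from the example). */
enum { LINE_BYTES = 32 };

/* Index of the cache line holding element (row, col) of a row-major
   int matrix with n columns, assuming the matrix starts on a line boundary. */
static size_t line_of(size_t row, size_t col, size_t n) {
    return (row * n + col) * sizeof(int) / LINE_BYTES;
}

/* Walk an n x n matrix by row (by_column == 0) or by column (by_column != 0)
   and count how many accesses touch a different line than the previous one. */
size_t line_switches(size_t n, int by_column) {
    assert(LINE_BYTES % sizeof(int) == 0);
    size_t switches = 0;
    size_t prev = (size_t)-1; /* sentinel: no line visited yet */
    for (size_t a = 0; a < n; a++) {
        for (size_t b = 0; b < n; b++) {
            size_t line = by_column ? line_of(b, a, n) : line_of(a, b, n);
            if (line != prev) {
                switches++;
                prev = line;
            }
        }
    }
    return switches;
}
```

In this model, <code>line_switches(512, 0)</code> changes line once per cache line of data, while <code>line_switches(512, 1)</code> changes line on every single element, which is the thrashing case described above.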
So, if you are trying to optimize for the cache line size of the machine you are running on, you can iterate over the matrix in block form and ensure you only visit each matrix element once:
<pre><code>#include <stdlib.h>  /* abs */

/* block_size must evenly divide 512 */
int sum_diagonal_difference(int array[512][512], int block_size)
{
    int i, j, block_i, block_j, result = 0;

    // sum the blocks above the diagonal
    for (block_i = 0; block_i < 512; block_i += block_size)
        for (block_j = block_i + block_size; block_j < 512; block_j += block_size)
            for (i = 0; i < block_size; i++)
                for (j = 0; j < block_size; j++)
                    result += abs(array[block_i + i][block_j + j] - array[block_j + j][block_i + i]);

    // the blocks below the diagonal mirror the ones above, so double the sum
    result += result;

    // sum the blocks on the diagonal
    for (int block_offset = 0; block_offset < 512; block_offset += block_size)
    {
        for (i = 0; i < block_size; ++i)
        {
            for (j = i + 1; j < block_size; ++j)
            {
                int value = abs(array[block_offset + i][block_offset + j] - array[block_offset + j][block_offset + i]);
                result += value + value;
            }
        }
    }

    return result;
}
</code></pre>
You should experiment with various values for <code>block_size</code>. On my machine, <code>8</code> led to the biggest speedup (2.5x) compared to a <code>block_size</code> of 1 (and ~5x compared to the original iteration over the entire matrix). The <code>block_size</code> should ideally be <code>cache_line_size_in_bytes/sizeof(int)</code>.
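When experimenting with <code>block_size</code>, it helps to confirm that the blocked traversal still returns the same value as a plain double loop. The sketch below assumes the original function sums <code>abs(a[i][j] - a[j][i])</code> over every pair of indices, which is what the blocked version appears to compute; the question's code is not shown here, so treat that as an assumption. The names <code>sum_difference_naive</code>, <code>sum_difference_blocked</code>, <code>block_size_for</code> and <code>fill_matrix</code> are hypothetical, and the blocked function is a compact restatement of the one above.

```c
#include <assert.h>
#include <stdlib.h>

#define N 512

/* Reference version: plain double loop (assumed to match the original). */
int sum_difference_naive(int a[N][N]) {
    int result = 0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            result += abs(a[i][j] - a[j][i]);
    return result;
}

/* Blocked version; bs must evenly divide N. */
int sum_difference_blocked(int a[N][N], int bs) {
    assert(bs > 0 && N % bs == 0);
    int result = 0;
    /* Blocks above the diagonal, each pair visited once. */
    for (int bi = 0; bi < N; bi += bs)
        for (int bj = bi + bs; bj < N; bj += bs)
            for (int i = 0; i < bs; i++)
                for (int j = 0; j < bs; j++)
                    result += abs(a[bi + i][bj + j] - a[bj + j][bi + i]);
    result += result; /* mirrored blocks below the diagonal */
    /* Blocks on the diagonal: upper triangle only, counted twice. */
    for (int b = 0; b < N; b += bs)
        for (int i = 0; i < bs; i++)
            for (int j = i + 1; j < bs; j++)
                result += 2 * abs(a[b + i][b + j] - a[b + j][b + i]);
    return result;
}

/* Hypothetical helper: number of ints per cache line of the given size. */
int block_size_for(size_t cache_line_bytes) {
    return (int)(cache_line_bytes / sizeof(int));
}

static int matrix[N][N];

/* Deterministic, asymmetric test data with small values (no overflow). */
void fill_matrix(void) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            matrix[i][j] = (i * 31 + j * 17) % 100;
}
```

After calling <code>fill_matrix()</code>, comparing <code>sum_difference_blocked(matrix, bs)</code> against <code>sum_difference_naive(matrix)</code> for each candidate <code>bs</code> gives a cheap correctness check to run before timing anything. The cache line size passed to <code>block_size_for</code> has to come from your own machine's documentation or tooling; nothing here detects it.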
Tags
Title
singulars
PostAcceptedAnswerId
1. This table or related slice is empty.
PostParentId
1. POImprove C function performance with cache locality?
 singulars
 PostTypePostTypeId
 PTQuestion
PostTypePostTypeId
1. PTAnswer
UserLastEditorUserId
1. USMSN
UserOwnerUserId
1. USMSN
plurals
PostLinksPostIdRelatedPostId
1. This table or related slice is empty.
PostLinksRelatedPostIdPostId
1. This table or related slice is empty.
PostsAcceptedAnswerId
1. POImprove C function performance with cache locality?
 singulars
 PostTypePostTypeId
 PTQuestion
PostsParentIdCreationDate
1. This table or related slice is empty.
VotesPostIdCreationDate
1. VO
 singulars
 PostPostId
 PO
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
2. VO
 singulars
 PostPostId
 PO
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
3. VO
 singulars
 PostPostId
 PO
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTAcceptedByOriginator
CommentsPostId
