    <p>With gcc-4.7, compiling with <code>gcc -std=c99 -O2 -S -D_GNU_SOURCE -fverbose-asm tcache.c</code> shows that the compiler optimizes aggressively enough to remove the <code>for</code> loop entirely (because <code>sum</code> is never used).</p>
    <p><sub>I had to fix your source code first: some <code>#include</code>-s are missing, and <code>i</code> is not declared in the second function, so your example doesn't even compile as it is.</sub></p>
    <p>Make <code>sum</code> a global variable, or pass it somehow to the caller (perhaps with a global <code>int globalsum;</code> and putting <code>globalsum=sum;</code> after the loop).</p>
    <p>I am also not sure you are right to clear the array with <code>memset</code>. I could imagine a clever-enough compiler understanding that you are summing all zeros.</p>
    <p>Finally, your code has extremely regular behavior with good locality: once in a while a cache miss happens, the entire cache line is loaded, and that data is good for many iterations. Some clever optimizations (e.g. <code>-O3</code> or better) might even generate suitable <code>prefetch</code> instructions. This access pattern is optimal for caches: with a 32-word L1 cache line, a cache miss happens only once every 32 iterations, so its cost is well amortized.</p>
    <p>Storing the data in a linked list instead will make cache behavior worse. Conversely, in some real programs, carefully adding a <a href="http://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html" rel="nofollow">__builtin_prefetch</a> at a few well-chosen places may improve performance by more than 10% (but adding too many of them will <em>decrease</em> performance).</p>
    <p>In real life, the processor spends the majority of its time waiting for some cache (and this is difficult to measure, because the waiting counts as CPU time, not idle time). Remember that during an L3 cache miss, the time needed to load data from your RAM module is the time needed to execute <em>hundreds</em> of machine instructions!</p>
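    <p>The suggestion about <code>globalsum</code> and about not summing zeros could be sketched roughly as follows. This is only a minimal illustration, not your original <code>tcache.c</code>: the function names <code>fill_array</code> and <code>sum_array</code>, the <code>seed</code> parameter and the <code>% 7</code> fill pattern are assumptions made up for the example, and a sufficiently aggressive compiler may still simplify the loop if it can see constant inputs.</p>

```c
#include <stddef.h>

/* Global sink, as suggested in the answer: storing the result here makes it
   observable outside the function, so the summing loop is not dead code. */
long globalsum;

/* Hypothetical helper: fill the array with values that depend on a
   caller-supplied seed instead of clearing it with memset, so the data
   being summed is not trivially all zeros. */
void fill_array(int *a, size_t n, unsigned seed)
{
    for (size_t i = 0; i < n; i++)
        a[i] = (int)((i + seed) % 7);
}

/* Hypothetical helper: sequential sum with good locality; the result is
   published through globalsum after the loop. */
void sum_array(const int *a, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    globalsum = sum;
}
```

    <p>Calling <code>fill_array</code> with a seed taken from the command line or the clock, then timing <code>sum_array</code>, should give the compiler much less room to fold the whole computation away.</p>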