Write a program to get CPU cache sizes and levels
    <p>I want to write a program that measures my CPU's cache sizes (L1, L2, L3). I know the general idea:</p>
<ol>
<li>Allocate a big array.</li>
<li>Repeatedly access a portion of it, using a different portion size each time.</li>
</ol>
<p>So I wrote a little program. Here's my code:</p>
<pre><code>#include &lt;cstdio&gt;
#include &lt;time.h&gt;
#include &lt;sys/mman.h&gt;

const int KB = 1024;
const int MB = 1024 * KB;
const int data_size = 32 * MB;
const int repeats = 64 * MB;
const int steps = 8 * MB;
const int times = 8;

// current wall-clock time in nanoseconds
long long clock_time() {
    struct timespec tp;
    clock_gettime(CLOCK_REALTIME, &amp;tp);
    return (long long)(tp.tv_nsec + (long long)tp.tv_sec * 1000000000ll);
}

int main() {
    // allocate memory and lock
    // (fd is -1 for portability with MAP_ANONYMOUS)
    void* map = mmap(NULL, (size_t)data_size, PROT_READ | PROT_WRITE,
                     MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
    if (map == MAP_FAILED) {
        return 0;
    }
    int* data = (int*)map;

    // write all to avoid paging on demand
    for (int i = 0; i &lt; data_size / sizeof(int); i++) {
        data[i]++;
    }

    // working-set sizes to test, in bytes
    int steps[] = { 1*KB, 4*KB, 8*KB, 16*KB, 24 * KB, 32*KB, 64*KB, 128*KB,
                    128*KB*2, 128*KB*3, 512*KB, 1 * MB, 2 * MB, 3 * MB, 4 * MB,
                    5 * MB, 6 * MB, 7 * MB, 8 * MB, 9 * MB};
    for (int i = 0; i &lt;= sizeof(steps) / sizeof(int) - 1; i++) {
        double totalTime = 0;
        for (int k = 0; k &lt; times; k++) {
            int size_mask = steps[i] / sizeof(int) - 1;
            long long start = clock_time();
            for (int j = 0; j &lt; repeats; j++) {
                ++data[ (j * 16) &amp; size_mask ];
            }
            long long end = clock_time();
            totalTime += (end - start) / 1000000000.0;
        }
        printf("%d time: %lf\n", steps[i] / KB, totalTime);
    }
    munmap(map, (size_t)data_size);
    return 0;
}
</code></pre>
<p>However, the results are weird (first column is the size in KB, second is the total time in seconds):</p>
<pre><code>1 time: 1.989998
4 time: 1.992945
8 time: 1.997071
16 time: 1.993442
24 time: 1.994212
32 time: 2.002103
64 time: 1.959601
128 time: 1.957994
256 time: 1.975517
384 time: 1.975143
512 time: 2.209696
1024 time: 2.437783
2048 time: 7.006168
3072 time: 5.306975
4096 time: 5.943510
5120 time: 2.396078
6144 time: 4.404022
7168 time: 4.900366
8192 time: 8.998624
9216 time: 6.574195
</code></pre>
<p>My CPU is an Intel(R) Core(TM) i3-2350M. L1 cache: 32K (for data), L2 cache: 256K, L3 cache: 3072K. The timings don't seem to follow any rule, and I can't work out the cache sizes or cache levels from them. Could anybody give some help? Thanks in advance.</p>
<p><strong>Update:</strong> Following @Leeor's advice, I used <code>j*64</code> instead of <code>j*16</code>. New results:</p>
<pre><code>1 time: 1.996282
4 time: 2.002579
8 time: 2.002240
16 time: 1.993198
24 time: 1.995733
32 time: 2.000463
64 time: 1.968637
128 time: 1.956138
256 time: 1.978266
384 time: 1.991912
512 time: 2.192371
1024 time: 2.262387
2048 time: 3.019435
3072 time: 2.359423
4096 time: 5.874426
5120 time: 2.324901
6144 time: 4.135550
7168 time: 3.851972
8192 time: 7.417762
9216 time: 2.272929
10240 time: 3.441985
11264 time: 3.094753
</code></pre>
<p>There are two peaks, at 4096K and 8192K. Still weird.</p>
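One property of the indexing expression `(j * 16) & size_mask` that may be worth checking, offered as a sketch rather than a diagnosis of the timings: `size - 1` is only a contiguous wrap-around mask when the size is a power of two, and several entries in `steps[]` (24 KB, 384 KB, 3 MB, 5 MB and so on) are not. The helper names below (`is_power_of_two`, `distinct_with_mask`, `distinct_with_modulo`) are hypothetical and not part of the original program; they only count how many distinct array elements each wrapping scheme visits for a given element count and stride.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// True when n is a nonzero power of two, the only case in which
// (n - 1) has all low bits set and works as a wrap-around mask.
bool is_power_of_two(std::size_t n) {
    return n != 0 && (n & (n - 1)) == 0;
}

// Counts the distinct element indices produced by the question's scheme,
// (j * stride) & (elems - 1), for j in [0, iterations).
std::size_t distinct_with_mask(std::size_t elems, std::size_t stride,
                               std::size_t iterations) {
    assert(elems > 0);
    std::set<std::size_t> seen;
    for (std::size_t j = 0; j < iterations; ++j) {
        seen.insert((j * stride) & (elems - 1));
    }
    return seen.size();
}

// Same count, but wrapping with modulo, which is valid for any elems > 0.
std::size_t distinct_with_modulo(std::size_t elems, std::size_t stride,
                                 std::size_t iterations) {
    assert(elems > 0);
    std::set<std::size_t> seen;
    for (std::size_t j = 0; j < iterations; ++j) {
        seen.insert((j * stride) % elems);
    }
    return seen.size();
}
```

Assuming a 4-byte `int`, the 24 KB step corresponds to 6144 elements. Calling `distinct_with_mask(6144, 16, 4096)` yields 256 distinct indices, the same as for the 16 KB step (4096 elements), whereas `distinct_with_modulo(6144, 16, 4096)` yields 384. So for non-power-of-two sizes the masked loop touches fewer cache lines than the label suggests, which could be one reason the printed sizes do not line up with the cache levels.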