Note that there are some explanatory texts on larger screens.

plurals
  1. POC++: Improving cache performance in a 3d array
    text
    copied!<p>I don't know how to optimize cache performance at a really low level, thinking about cache-line size or associativity. That's not something you can learn overnight. Considering my program will run on many different systems and architectures, I don't think it would be worth it anyway. But still, there are probably some steps I can take to reduce cache misses in general.</p> <p>Here is a description of my problem:</p> <p>I have a 3d array of integers, representing values at points in space, like [x][y][z]. Each dimension is the same size, so it's like a cube. From that I need to make another 3d array, where each value in this new array is a function of 7 parameters: the corresponding value in the original 3d array, plus the 6 indices that "touch" it in space. I'm not worried about the edges and corners of the cube for now.</p> <p>Here is what I mean in C++ code:</p> <pre><code>void process3DArray (int input[LENGTH][LENGTH][LENGTH], int output[LENGTH][LENGTH][LENGTH]) { for(int i = 1; i &lt; LENGTH-1; i++) for (int j = 1; j &lt; LENGTH-1; j++) for (int k = 1; k &lt; LENGTH-1; k++) //The for loops start at 1 and stop before LENGTH-1 //or other-wise I'll get out-of-bounds errors //I'm not concerned with the edges and corners of the //3d array "cube" at the moment. { int value = input[i][j][k]; //I am expecting crazy cache misses here: int posX = input[i+1] [j] [k]; int negX = input[i-1] [j] [k]; int posY = input[i] [j+1] [k]; int negY = input[i] [j-1] [k]; int posZ = input[i] [j] [k+1]; int negZ = input[i] [j] [k-1]; output [i][j][k] = process(value, posX, negX, posY, negY, posZ, negZ); } } </code></pre> <p>However, it seems like if LENGTH is large enough, I'll get tons of cache misses when I'm fetching the parameters for <code>process</code>. Is there a cache-friendlier way to do this, or a better way to represent my data other than a 3d array?</p> <p>And if you have the time to answer these extra questions, do I have to consider the value of LENGTH? Like it's different whether LENGTH is 20 vs 100 vs 10000. Also, would I have to do something else if I used something other than integers, like maybe a 64-byte struct?</p> <p>@ ildjarn:</p> <p>Sorry, I did not think that the code that generates the arrays I am passing into <code>process3DArray</code> mattered. But if it does, I would like to know why.</p> <pre><code>int main() { int data[LENGTH][LENGTH][LENGTH]; for(int i = 0; i &lt; LENGTH; i++) for (int j = 0; j &lt; LENGTH; j++) for (int k = 0; k &lt; LENGTH; k++) data[i][j][k] = rand() * (i + j + k); int result[LENGTH][LENGTH][LENGTH]; process3DArray(data, result); } </code></pre>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload