Why is my program slow when looping over exactly 8192 elements?
<p>Here is an extract from the program in question. The matrix <code>img[][]</code> has size SIZE×SIZE and is initialized as:</p>
<p><code>img[j][i] = 2 * j + i</code></p>
<p>Then a matrix <code>res[][]</code> is built, in which each field is the average of the 9 fields around it in the <code>img</code> matrix. The border is left at 0 for simplicity.</p>
<pre><code>for(i=1;i&lt;SIZE-1;i++)
    for(j=1;j&lt;SIZE-1;j++)
    {
        res[j][i]=0;
        for(k=-1;k&lt;2;k++)
            for(l=-1;l&lt;2;l++)
                res[j][i] += img[j+l][i+k];
        res[j][i] /= 9;
    }
</code></pre>
<p>That's all there is to the program. For completeness' sake, here is what comes before. No code comes after. As you can see, it's just initialization.</p>
<pre><code>#define SIZE 8192
float img[SIZE][SIZE]; // input image
float res[SIZE][SIZE]; // result of mean filter
int i,j,k,l;
for(i=0;i&lt;SIZE;i++)
    for(j=0;j&lt;SIZE;j++)
        img[j][i] = (2*j+i)%8196;
</code></pre>
<p>Basically, this program is slow when SIZE is a multiple of 2048, e.g. the execution times:</p>
<pre><code>SIZE = 8191: 3.44 secs
SIZE = 8192: 7.20 secs
SIZE = 8193: 3.18 secs
</code></pre>
<p>The compiler is GCC. As far as I know, this is caused by memory management, but I don't know much about that subject, which is why I'm asking here.</p>
<p>A fix would also be nice, but if someone could just explain these execution times I'd already be happy.</p>
<p>I already know about malloc/free, but the problem is not the amount of memory used, only the execution time, so I don't see how that would help.</p>
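<p>For anyone who wants to reproduce the measurements, here is a minimal sketch of the program above with the size passed in as a parameter, so 8191, 8192 and 8193 can be timed from one binary. Several things in it are assumptions that are not part of the original code: the arrays are heap-allocated flat buffers instead of global 2D arrays, timing uses <code>clock()</code>, and the helper names (<code>init_image</code>, <code>mean_filter_by_column</code>, <code>mean_filter_by_row</code>, <code>time_filter</code>) are made up for illustration. <code>mean_filter_by_row</code> is not from the question either; it performs the same arithmetic with the two outer loops swapped so that the inner loop walks along a row, which may be worth timing for comparison.</p>

```c
#include <stdlib.h>
#include <time.h>

/* Fill an n x n image (flat, row-major) with the pattern from the question. */
static void init_image(float *img, int n) {
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            img[(size_t)j * n + i] = (float)((2 * j + i) % 8196);
}

/* Loop order from the question: i (column) outer, j (row) inner. */
static void mean_filter_by_column(const float *img, float *res, int n) {
    for (int i = 1; i < n - 1; i++)
        for (int j = 1; j < n - 1; j++) {
            size_t at = (size_t)j * n + i;
            res[at] = 0;
            for (int k = -1; k < 2; k++)
                for (int l = -1; l < 2; l++)
                    res[at] += img[(size_t)(j + l) * n + (i + k)];
            res[at] /= 9;
        }
}

/* Assumed variant (not in the question): same arithmetic, but j (row)
   outer and i (column) inner, so consecutive iterations touch
   neighbouring addresses. */
static void mean_filter_by_row(const float *img, float *res, int n) {
    for (int j = 1; j < n - 1; j++)
        for (int i = 1; i < n - 1; i++) {
            size_t at = (size_t)j * n + i;
            res[at] = 0;
            for (int k = -1; k < 2; k++)
                for (int l = -1; l < 2; l++)
                    res[at] += img[(size_t)(j + l) * n + (i + k)];
            res[at] /= 9;
        }
}

typedef void (*filter_fn)(const float *, float *, int);

/* Run one filter on an n x n image and return the CPU seconds it took,
   or -1.0 if n < 3 or an allocation fails. */
static double time_filter(filter_fn filter, int n) {
    if (n < 3)
        return -1.0;
    float *img = malloc((size_t)n * n * sizeof *img);
    float *res = calloc((size_t)n * n, sizeof *res); /* border stays 0 */
    if (!img || !res) {
        free(img);
        free(res);
        return -1.0;
    }
    init_image(img, n);
    clock_t start = clock();
    filter(img, res, n);
    double secs = (double)(clock() - start) / CLOCKS_PER_SEC;
    /* Read one result so the compiler cannot discard the filter call. */
    volatile float sink = res[(size_t)n + 1];
    (void)sink;
    free(img);
    free(res);
    return secs;
}
```

<p>Calling <code>time_filter(mean_filter_by_column, n)</code> and <code>time_filter(mean_filter_by_row, n)</code> from a <code>main</code> for <code>n</code> = 8191, 8192 and 8193 gives a side-by-side comparison. The absolute numbers will depend on the machine and the GCC optimization flags, so treat them as indicative only.</p>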