Note that there are some explanatory texts on larger screens.

plurals
  1. POManually optimize a nested loop
    primarykey
    data
    text
    <p>I'm working on a homework assignment where I must manually optimize a nested loop (my program will be compiled with optimizations disabled). The goal of the assignment is to run the entire program in less than 6 seconds (extra credit for less than 4.5 seconds).</p> <p>I'm only allowed to change a small block of code, and the starting point is such:</p> <pre><code> for (j=0; j &lt; ARRAY_SIZE; j++) { sum += array[j]; } </code></pre> <p>Where <code>ARRAY_SIZE</code> is 9973. This loop is contained within another loop that is run 200,000 times. This particular version runs in 16 seconds.</p> <p>What I've done so far is change the implementation to unroll the loop and use pointers as my iterator:</p> <p>(These declarations are not looped over 200,000 times)</p> <pre><code> register int unroll_length = 16; register int *unroll_end = array + (ARRAY_SIZE - (ARRAY_SIZE % unroll_length)); register int *end = array + (ARRAY_SIZE -1); register int *curr_end; curr_end = end; while (unroll_end != curr_end) { sum += *curr_end; curr_end--; } do { sum += *curr_end + *(curr_end-1) + *(curr_end-2) + *(curr_end-3) + *(curr_end-4) + *(curr_end-5) + *(curr_end-6) + *(curr_end-7) + *(curr_end-8) + *(curr_end-9) + *(curr_end-10) + *(curr_end-11) + *(curr_end-12) + *(curr_end-13) + *(curr_end-14) + *(curr_end-15); } while ((curr_end -= unroll_length) != array); sum += *curr_end; </code></pre> <p>Using these techniques, I was able to get the execution down to 5.5 seconds, which will give me full credit. However; I sure do want to earn the extra credit, but I'm also curious what additional optimizations I can make that I might be overlooking?</p> <p>Edit #1 (Adding outer loop)</p> <pre><code> srand(time(NULL)); for(j = 0; j &lt; ARRAY_SIZE; j++) { x = rand() / (int)(((unsigned)RAND_MAX + 1) / 14); array[j] = x; checksum += x; } for (i = 0; i &lt; N_TIMES; i++) { // inner loop goes here if (sum != checksum) printf("Checksum error!\n"); sum = 0; } </code></pre>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload