<p>The first two cases in your question are essentially the same. Things would really change in the following two cases:</p>
<p>CASE 1:</p>
<pre><code>for(int i = 0; i &lt; 10; i++)
{
    for(int j = 0; j &lt; 1000; j++)
    {
        b[i] += a[i]*a[j];
    }
}
</code></pre>
<p>Here you are accessing the matrix "a" as follows: a[0]*a[0], a[0]*a[1], a[0]*a[2], ... In most architectures, matrix structures are stored in memory like: a[0]*a[0], a[1]*a[0], a[2]*a[0] (first column of the first row followed by the second column of the first row, ...). Imagine your cache could only store 5 elements and your matrix is 6x6. The first "pack" of elements stored in the cache would be a[0]*a[0] to a[4]*a[0]. Your first access would cause no cache miss, because a[0][0] is in the cache, but the second one would: a[0][1] is not in the cache! The hardware would then load the pack of elements a[0][1] to a[4][1] into the cache. Then you do the third access, a[0]*a[2], which is out of the cache again. Another cache miss!</p>
<p>As you can conclude, case 1 is not a good solution for the problem. It causes lots of cache misses that we can avoid by changing the code to the following:</p>
<p>CASE 2:</p>
<pre><code>for(int i = 0; i &lt; 10; i++)
{
    for(int j = 0; j &lt; 1000; j++)
    {
        b[i] += a[i]*a[j];
    }
}
</code></pre>
<p>Here, as you can see, we are accessing the matrix in the order it is stored in memory. Consequently it is much better (faster) than case 1.</p>
<p>As for the third piece of code you posted, about loop tiling: loop tiling and loop unrolling are optimizations that in most cases the compiler does automatically. <a href="https://stackoverflow.com/questions/5444303/loop-unrolling-vs-loop-tiling">Here's a very interesting post on Stack Overflow explaining these two techniques.</a></p>
<p>Hope it helps! (Sorry about my English, I'm not a native speaker.)</p>
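<p>To make the point about traversal order concrete, here is a minimal sketch in C. It rests on assumptions that are not in the answer's code: a genuinely two-dimensional 6x6 <code>int</code> matrix (the size of the imagined example) stored in C's row-major layout. The names <code>ROWS</code>, <code>COLS</code>, <code>sum_row_major</code> and <code>sum_column_major</code> are hypothetical and chosen only for illustration. Both functions compute the same sum; only the order in which memory is touched differs.</p>

```c
#include <stddef.h>

/* Assumed dimensions, matching the 6x6 example in the text. */
#define ROWS 6
#define COLS 6

/* Inner loop walks along a row: in C's row-major layout these
   elements are adjacent in memory, so consecutive accesses tend
   to fall in a cache line that has already been loaded. */
long sum_row_major(int m[ROWS][COLS])
{
    long total = 0;
    for (size_t i = 0; i < ROWS; i++) {
        for (size_t j = 0; j < COLS; j++) {
            total += m[i][j];
        }
    }
    return total;
}

/* Inner loop walks down a column: each step jumps COLS elements
   ahead in memory, so on a large matrix consecutive accesses are
   more likely to miss the cache. */
long sum_column_major(int m[ROWS][COLS])
{
    long total = 0;
    for (size_t j = 0; j < COLS; j++) {
        for (size_t i = 0; i < ROWS; i++) {
            total += m[i][j];
        }
    }
    return total;
}
```

<p>On a matrix this small both versions will probably fit in cache on real hardware, so any timing difference would only be expected to show up once the matrix is much larger than the cache.</p>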