Note that there are some explanatory texts on larger screens.

plurals
  1. PORules for using the restrict keyword in C?
    text
    copied!<p>I'm trying to understand when and when not to use the <code>restrict</code> keyword in C and in what situations it provides a tangible benefit.</p> <p>After reading, "<a href="http://cellperformance.beyond3d.com/articles/2006/05/demystifying-the-restrict-keyword.html" rel="nofollow noreferrer">Demystifying The Restrict Keyword</a>", ( which provides some rules of thumb on usage ), I get the impression that when a function is passed pointers, it has to account for the possibility that the data pointed to might overlap (alias) with any other arguments being passed into the function. Given a function:</p> <pre><code>foo(int *a, int *b, int *c, int n) { for (int i = 0; i&lt;n; ++i) { b[i] = b[i] + c[i]; a[i] = a[i] + b[i] * c[i]; } } </code></pre> <p>the compiler has to reload <code>c</code> in the second expression, because maybe <code>b</code> and <code>c</code> point to the same location. It also has to wait for <code>b</code> to be stored before it can load <code>a</code> for the same reason. It then has to wait for <code>a</code> to be stored and must reload <code>b</code> and <code>c</code> at the beginning of the next loop. If you call the function like this:</p> <pre><code>int a[N]; foo(a, a, a, N); </code></pre> <p>then you can see why the compiler has to do this. Using <code>restrict</code> effectively tells the compiler that you will never do this, so that it can drop the redundant load of <code>c</code> and load <code>a</code> before <code>b</code> is stored.</p> <p><a href="https://stackoverflow.com/questions/1965487/does-the-restrict-keyword-provide-significant-anti-aliasing-benefits-in-gcc-g/1966649#1966649">In a different SO post, Nils Pipenbrinck, provides a working example of this scenario demonstrating the performance benefit.</a></p> <p>So far I've gathered that it's a good idea to use <code>restrict</code> on pointers you pass into functions which won't be inlined. Apparently if the code is inlined the compiler can figure out that the pointers don't overlap.</p> <p><strong>Now here's where things start getting fuzzy for me.</strong> </p> <p>In Ulrich Drepper's paper, "<a href="http://lwn.net/Articles/250967/" rel="nofollow noreferrer">What every programmer should know about memory</a>" he makes the statement that, "unless restrict is used, all pointer accesses are potential sources of aliasing," and he gives a specific code example of a submatrix matrix multiply where he uses <code>restrict</code>. </p> <p>However, when I compile his example code either with or without <code>restrict</code> I get identical binaries in both cases. I'm using <code>gcc version 4.2.4 (Ubuntu 4.2.4-1ubuntu4)</code></p> <p>The thing I can't figure out in the following code is whether it needs to be rewritten to make more extensive use of <code>restrict</code>, or if the alias analysis in GCC is just so good that it's able to figure out that none of the arguments alias each other. <strong>For purely educational purposes, how can I make using or not using <code>restrict</code> matter in this code - and why?</strong></p> <p>For <code>restrict</code> compiled with:</p> <pre><code>gcc -DCLS=$(getconf LEVEL1_DCACHE_LINESIZE) -DUSE_RESTRICT -Wextra -std=c99 -O3 matrixMul.c -o matrixMul </code></pre> <p>Just remove <code>-DUSE_RESTRICT</code> to not use <code>restrict</code>.</p> <pre><code>#include &lt;stdlib.h&gt; #include &lt;stdio.h&gt; #include &lt;emmintrin.h&gt; #ifdef USE_RESTRICT #else #define restrict #endif #define N 1000 double _res[N][N] __attribute__ ((aligned (64))); double _mul1[N][N] __attribute__ ((aligned (64))) = { [0 ... (N-1)] = { [0 ... (N-1)] = 1.1f }}; double _mul2[N][N] __attribute__ ((aligned (64))) = { [0 ... (N-1)] = { [0 ... (N-1)] = 2.2f }}; #define SM (CLS / sizeof (double)) void mm(double (* restrict res)[N], double (* restrict mul1)[N], double (* restrict mul2)[N]) __attribute__ ((noinline)); void mm(double (* restrict res)[N], double (* restrict mul1)[N], double (* restrict mul2)[N]) { int i, i2, j, j2, k, k2; double *restrict rres; double *restrict rmul1; double *restrict rmul2; for (i = 0; i &lt; N; i += SM) for (j = 0; j &lt; N; j += SM) for (k = 0; k &lt; N; k += SM) for (i2 = 0, rres = &amp;res[i][j], rmul1 = &amp;mul1[i][k]; i2 &lt; SM; ++i2, rres += N, rmul1 += N) for (k2 = 0, rmul2 = &amp;mul2[k][j]; k2 &lt; SM; ++k2, rmul2 += N) for (j2 = 0; j2 &lt; SM; ++j2) rres[j2] += rmul1[k2] * rmul2[j2]; } int main (void) { mm(_res, _mul1, _mul2); return 0; } </code></pre>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload