<p>The first obvious thing would be to use the <code>restrict</code> keyword.</p>
<p>As it stands, <code>a</code> and <code>b</code> are aliasable, and the compiler, which has to assume the worst possible case, treats them as if they <em>are</em> aliased. A compiler that cannot prove otherwise will not auto-vectorize this, as it would be wrong to do so.</p>
<p>Worse, not only can the compiler not vectorize such a loop; if you also store (luckily not the case in your example), it <em>also</em> must re-load values each time. Always be clear about aliasing, as it greatly affects what the compiler can do.</p>
<p>Next, if you can live with the reduced precision, use <code>float</code> instead of <code>double</code> and pad to 4 floats even if one is unused. This is a more "natural" data layout for the majority of CPUs (this is somewhat platform specific, but 4 floats is a good guess for most platforms -- 3 doubles, a.k.a. 1.5 SIMD registers on "typical" CPUs, is not optimal anywhere).</p>
<p>(For a hand-written SIMD implementation (which is harder than you think), first and foremost make sure your data is aligned. Next, look up the latencies of the instructions on the target machine and issue the longest ones first. For example, on pre-Prescott Intel CPUs it makes sense to first shuffle each component into a register and then multiply it with itself, even though that uses 3 multiplies instead of one, because shuffles have a long latency. On later models a shuffle takes a single cycle, so that would be a total anti-optimization.<br> Which again shows that leaving it to the compiler is not such a bad idea.)</p>
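<p>As a rough illustration of the two suggestions above, here is a minimal sketch that assumes the loop computes squared distances between 3D points. The type <code>vec4f</code>, the function <code>squared_distances</code> and the output array <code>out</code> are hypothetical names, not taken from the question; only <code>a</code>, <code>b</code> and <code>restrict</code> come from the text.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical padded vector: x, y, z plus one unused float,
   so each element is exactly 4 floats wide. */
typedef struct {
    float x, y, z, pad;
} vec4f;

/* Assumption: float is 4 bytes, so 4 floats fill one 128-bit SIMD register. */
static_assert(sizeof(vec4f) == 4 * sizeof(float), "vec4f must be 4 floats");

/* restrict is a promise from the caller that out, a and b never overlap.
   With that promise the compiler no longer has to assume that a store to
   out[i] might change a[] or b[], so it need not re-load them and is free
   to vectorize the loop. Breaking the promise is undefined behavior. */
static void squared_distances(float *restrict out,
                              const vec4f *restrict a,
                              const vec4f *restrict b,
                              size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        float dx = a[i].x - b[i].x;
        float dy = a[i].y - b[i].y;
        float dz = a[i].z - b[i].z;
        out[i] = dx * dx + dy * dy + dz * dz;
    }
}
```

<p>Whether a given compiler actually vectorizes this depends on the compiler, its flags and the target, so it is worth checking the generated assembly or the vectorization report rather than assuming.</p>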