Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    text
    copied!<p>If you use SSE instructions, you're obviously limited to processors that support these. That means x86, dating back to the Pentium 2 or so (can't remember exactly when they were introduced, but it's a long time ago)</p> <p>SSE2, which, as far as I can recall, is the one that offers integer operations, is somewhat more recent (Pentium 3? Although the first AMD Athlon processors didn't support them)</p> <p>In any case, you have two options for using these instructions. Either write the entire block of code in assembly (probably a bad idea. That makes it virtually impossible for the compiler to optimize your code, and it's very hard for a human to write efficient assembler).</p> <p>Alternatively, use the intrinsics available with your compiler (if memory serves, they're usually defined in xmmintrin.h)</p> <p>But again, the performance may not improve. SSE code poses additional requirements of the data it processes. Mainly, the one to keep in mind is that data must be aligned on 128-bit boundaries. There should also be few or no dependencies between the values loaded into the same register (a 128 bit SSE register can hold 4 ints. Adding the first and the second one together is not optimal. But adding all four ints to the corresponding 4 ints in another register will be fast)</p> <p>It may be tempting to use a library that wraps all the low-level SSE fiddling, but that might also ruin any potential performance benefit.</p> <p>I don't know how good SSE's integer operation support is, so that may also be a factor that can limit performance. SSE is mainly targeted at speeding up floating point operations.</p>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload