    <p>Sorry, I don't know of a tutorial.</p>
    <p>Your best bet (IMHO) is to use SSE via the "intrinsic" functions Intel provides to wrap (generally) single SSE instructions. These are made available via a set of include files named *mmintrin.h, e.g. xmmintrin.h covers the original SSE instruction set.</p>
    <p>Being familiar with the contents of Intel's Optimization <a href="http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-optimization-manual.html" rel="nofollow noreferrer">Reference Manual</a> is a good idea (see section 4.3.1.2 for an example of intrinsics), and the SIMD sections are essential reading. The instruction set reference manuals are pretty helpful too, in that each instruction's documentation includes the "intrinsic" function it corresponds to.</p>
    <p><em>Do</em> spend some time inspecting the assembler the compiler produces from intrinsics (you'll learn a lot) and on profiling/performance measurement (you'll avoid wasting time SSE-ing code for little return on the effort).</p>
    <p><strong>Update 2011-05-31:</strong> There is some very nice coverage of intrinsics and vectorization in Agner Fog's <a href="http://agner.org/optimize/" rel="nofollow noreferrer">optimization PDFs</a> (<a href="https://stackoverflow.com/questions/695222/code-optimization-bibles/695293#695293">thanks</a>), although it's a bit spread about (e.g. section 12 of the <a href="http://agner.org/optimize/optimizing_cpp.pdf" rel="nofollow noreferrer">first one</a> and section 5 of the <a href="http://agner.org/optimize/optimizing_assembly.pdf" rel="nofollow noreferrer">second one</a>). These aren't exactly tutorial material (in fact there's a "these manuals are not for beginners" warning), but they do rightly treat SIMD (whether used via asm, intrinsics or compiler vectorization) as just one part of the larger optimization toolbox.</p>
    <p><strong>Update 2012-10-04:</strong> A <a href="http://www.linuxjournal.com/content/introduction-gcc-compiler-intrinsics-vector-processing" rel="nofollow noreferrer">nice little Linux Journal article</a> on gcc vector intrinsics deserves a mention here. It is more general than just SSE (it covers PPC and ARM extensions too). There's a good collection of references on the <a href="http://www.linuxjournal.com/content/introduction-gcc-compiler-intrinsics-vector-processing?page=0,4" rel="nofollow noreferrer">last page</a>, which drew my attention to Intel's <a href="http://software.intel.com/sites/default/files/m/9/4/c/8/e/18072-347603.pdf" rel="nofollow noreferrer">"intrinsics manual"</a>.</p>
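<p>To give a flavour of what the intrinsics look like, here is a minimal sketch, assuming an x86 target with SSE and a compiler that ships xmmintrin.h. The function name add_floats is hypothetical, made up for this illustration. It uses the unaligned load/store intrinsics so that it makes no assumption about how the arrays are aligned, and falls back to plain scalar code for any leftover elements.</p>

```c
#include <stddef.h>     /* size_t */
#include <xmmintrin.h>  /* original SSE intrinsics: __m128, _mm_add_ps, ... */

/* Hypothetical helper (not from any library): out[i] = a[i] + b[i]. */
void add_floats(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;

    /* Main loop: four floats per iteration, one 128-bit register each. */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);   /* unaligned load of 4 floats */
        __m128 vb = _mm_loadu_ps(b + i);
        __m128 vsum = _mm_add_ps(va, vb);  /* maps to the ADDPS instruction */
        _mm_storeu_ps(out + i, vsum);      /* unaligned store of 4 floats */
    }

    /* Scalar tail for the remaining 0 to 3 elements. */
    for (; i < n; ++i)
        out[i] = a[i] + b[i];
}
```

<p>Compiling a file like this with something along the lines of gcc -O2 -S and reading the generated assembler is a cheap way to see which instructions each intrinsic turned into. Whether this beats what the compiler's own vectorizer does for such a simple loop is exactly the kind of thing worth measuring before committing to it.</p>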