<p>Actually, GCC will use the carry automatically if you write your code carefully...</p>

<p>I compiled this code with <code>gcc -O2 -Wall -Werror -S</code>:</p>

<pre><code>void increment128_1(unsigned long &amp;hiWord, unsigned long &amp;loWord)
{
    const unsigned long hiAdd=0x0000062DE49B5241;
    const unsigned long loAdd=0x85DC198BCDD714BA;

    loWord += loAdd;
    if (loWord &lt; loAdd) ++hiWord; // test_and_add_carry
    hiWord += hiAdd;
}

void increment128_2(unsigned long &amp;hiWord, unsigned long &amp;loWord)
{
    const unsigned long hiAdd=0x0000062DE49B5241;
    const unsigned long loAdd=0x85DC198BCDD714BA;

    loWord += loAdd;
    hiWord += hiAdd;
    hiWord += (loWord &lt; loAdd); // test_and_add_carry
}
</code></pre>

<p>This is the assembly for increment128_1:</p>

<pre><code>        .cfi_startproc
        movabsq $-8801131483544218438, %rax
        addq    (%rsi), %rax
        movabsq $-8801131483544218439, %rdx
        cmpq    %rdx, %rax
        movq    %rax, (%rsi)
        ja      .L5
        movq    (%rdi), %rax
        addq    $1, %rax
.L3:
        movabsq $6794178679361, %rdx
        addq    %rdx, %rax
        movq    %rax, (%rdi)
        ret
</code></pre>

<p>...and this is the assembly for increment128_2:</p>

<pre><code>        movabsq $-8801131483544218438, %rax
        addq    %rax, (%rsi)
        movabsq $6794178679361, %rax
        addq    (%rdi), %rax
        movabsq $-8801131483544218439, %rdx
        movq    %rax, (%rdi)
        cmpq    %rdx, (%rsi)
        setbe   %dl
        movzbl  %dl, %edx
        leaq    (%rdx,%rax), %rax
        movq    %rax, (%rdi)
        ret
</code></pre>

<p>Note the lack of conditional branches in the second version.</p>

<p>[edit]</p>

<p>Also, references are often bad for performance, because GCC has to worry about aliasing... It is often better to just pass things by value. Consider:</p>

<pre><code>struct my_uint128_t {
    unsigned long hi;
    unsigned long lo;
};

my_uint128_t increment128_3(my_uint128_t x)
{
    const unsigned long hiAdd=0x0000062DE49B5241;
    const unsigned long loAdd=0x85DC198BCDD714BA;

    x.lo += loAdd;
    x.hi += hiAdd + (x.lo &lt; loAdd);
    return x;
}
</code></pre>

<p>Assembly:</p>

<pre><code>        .cfi_startproc
        movabsq $-8801131483544218438, %rdx
        movabsq $-8801131483544218439, %rax
        movabsq $6794178679362, %rcx
        addq    %rsi, %rdx
        cmpq    %rdx, %rax
        sbbq    %rax, %rax
        addq    %rcx, %rax
        addq    %rdi, %rax
        ret
</code></pre>

<p>This is actually the tightest code of the three.</p>

<p>...OK so none of them actually used the carry automatically :-). But they do avoid the conditional branch, which I bet is the slow part (with this <code>loAdd</code> the carry occurs on about half of the additions, so the branch is hard to predict).</p>

<p>[edit 2]</p>

<p>And one more, which I stumbled across while doing a little searching. Did you know GCC has built-in support for 128-bit integers?</p>

<pre><code>typedef unsigned long my_uint128_t __attribute__ ((mode(TI)));

my_uint128_t increment128_4(my_uint128_t x)
{
    const my_uint128_t hiAdd=0x0000062DE49B5241;
    const unsigned long loAdd=0x85DC198BCDD714BA;

    return x + (hiAdd &lt;&lt; 64) + loAdd;
}
</code></pre>

<p>The assembly for this one is about as good as it gets:</p>

<pre><code>        .cfi_startproc
        movabsq $-8801131483544218438, %rax
        movabsq $6794178679361, %rdx
        pushq   %rbx
        .cfi_def_cfa_offset 16
        addq    %rdi, %rax
        adcq    %rsi, %rdx
        popq    %rbx
        .cfi_offset 3, -16
        .cfi_def_cfa_offset 8
        ret
</code></pre>

<p>(Not sure where the push/pop of <code>rbx</code> came from, but this is still not bad.)</p>

<p>All of these are with GCC 4.5.2, by the way.</p>
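<p>For anyone who wants to convince themselves that the <code>(x.lo &lt; loAdd)</code> carry test is correct, here is a minimal sketch that compares it against the compiler's native 128-bit type. It assumes GCC or Clang on a 64-bit target where <code>unsigned __int128</code> is available. The names <code>U128</code>, <code>add_manual</code>, <code>add_native</code> and <code>self_check</code> are made up for this illustration and do not appear in the code above; only the two constants are taken from it.</p>

```cpp
#include <cassert>
#include <cstdint>

// Constants copied from the answer; everything else here is illustrative.
constexpr std::uint64_t kHiAdd = 0x0000062DE49B5241ULL;
constexpr std::uint64_t kLoAdd = 0x85DC198BCDD714BAULL;

// Hypothetical helper type: a 128-bit value as two 64-bit halves.
struct U128 {
    std::uint64_t hi;
    std::uint64_t lo;
};

// Manual carry, same idea as increment128_3: after the low-word add,
// the sum wrapped around exactly when it is smaller than the addend.
inline U128 add_manual(U128 x) {
    x.lo += kLoAdd;
    x.hi += kHiAdd + (x.lo < kLoAdd);
    return x;
}

// Reference result using the GCC/Clang unsigned __int128 extension
// (assumption: the target supports it, as x86-64 does).
inline U128 add_native(U128 x) {
    unsigned __int128 v = (static_cast<unsigned __int128>(x.hi) << 64) | x.lo;
    v += (static_cast<unsigned __int128>(kHiAdd) << 64) | kLoAdd;
    return U128{static_cast<std::uint64_t>(v >> 64),
                static_cast<std::uint64_t>(v)};
}

// Compare both versions on a few inputs, including ones that carry.
inline void self_check() {
    const U128 inputs[] = {
        {0, 0},
        {0, 0xFFFFFFFFFFFFFFFFULL},           // low word wraps, carry expected
        {0xFFFFFFFFFFFFFFFFULL, ~kLoAdd},     // low word lands on all ones, no carry
        {0xFFFFFFFFFFFFFFFFULL, ~kLoAdd + 1}, // low word wraps to 0, carry expected
    };
    for (const U128& in : inputs) {
        const U128 a = add_manual(in);
        const U128 b = add_native(in);
        assert(a.hi == b.hi && a.lo == b.lo);
    }
}
```

<p>Calling <code>self_check()</code> from a test or from <code>main</code> runs the comparison. It only checks correctness and says nothing about which version generates better code, which still depends on the compiler version and flags.</p>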
 
