Effectiveness of GCC optimization on bit operations
<p>Here are two ways to set an individual bit in C on x86-64:</p>
<pre><code>inline void SetBitC(long *array, int bit) {
   //Pure C version
   *array |= 1&lt;&lt;bit;
}

inline void SetBitASM(long *array, int bit) {
   // Using inline x86 assembly
   asm("bts %1,%0" : "+r" (*array) : "g" (bit));
}
</code></pre>
<p>Using GCC 4.3 with <code>-O3 -march=core2</code> options, the C version takes about <strong>90% more time</strong> when used with a constant <code>bit</code>. (Both versions compile to exactly the same assembly code, except that the C version uses an <code>or [1&lt;&lt;num],%rax</code> instruction instead of a <code>bts [num],%rax</code> instruction)</p>
<p>When used with a variable <code>bit</code>, the C version performs better but is still significantly slower than the inline assembly.</p>
<p>Resetting, toggling and checking bits have similar results.</p>
<p>Why does GCC optimize so poorly for such a common operation? Am I doing something wrong with the C version?</p>
<p><strong>Edit:</strong> Sorry for the long wait, here is the code I used to benchmark. It actually started as a simple programming problem...</p>
<pre><code>int main() {
   // Get the sum of all integers from 1 to 2^28 with bit 11 always set
   unsigned long i,j,c=0;
   for (i=1; i&lt;(1&lt;&lt;28); i++) {
      j = i;
      SetBit(&amp;j, 10);
      c += j;
   }
   printf("Result: %lu\n", c);
   return 0;
}

gcc -O3 -march=core2 -pg test.c
./a.out
gprof
with ASM: 101.12 0.08 0.08 main
with C:   101.12 0.16 0.16 main
</code></pre>
<p><code>time ./a.out</code> also gives similar results.</p>
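The post mentions that resetting, toggling and checking bits behave similarly but only shows the set operation. As a minimal sketch of what pure C versions of all four might look like, the helpers below build the mask with `1UL << bit` so that it has the full width of `unsigned long`. In `SetBitC`, the literal `1` is an `int`, so `1<<bit` is evaluated as an `int` shift before being widened. The names `bit_mask`, `set_bit`, `clear_bit`, `toggle_bit` and `test_bit` are hypothetical and not from the original post, and this sketch makes no claim about how GCC 4.3 would compile or time them.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Number of bits in the word type used below (64 on x86-64 Linux). */
#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical helper: a one-bit mask as wide as unsigned long.
   Shifting by WORD_BITS or more is undefined, so it is rejected. */
static inline unsigned long bit_mask(unsigned bit)
{
    assert(bit < WORD_BITS);
    return 1UL << bit;
}

/* Set the given bit (OR with the mask). */
static inline void set_bit(unsigned long *word, unsigned bit)
{
    *word |= bit_mask(bit);
}

/* Reset the given bit (AND with the inverted mask). */
static inline void clear_bit(unsigned long *word, unsigned bit)
{
    *word &= ~bit_mask(bit);
}

/* Toggle the given bit (XOR with the mask). */
static inline void toggle_bit(unsigned long *word, unsigned bit)
{
    *word ^= bit_mask(bit);
}

/* Check whether the given bit is set. */
static inline bool test_bit(const unsigned long *word, unsigned bit)
{
    return (*word & bit_mask(bit)) != 0;
}
```

With these helpers, the benchmark loop's `SetBit(&j, 10)` would correspond to `set_bit(&j, 10)`, which sets bit index 10 (the eleventh bit) of `j`.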