How does GCC optimize out an unused variable incremented inside a loop?
<p>I wrote this simple C program:</p>
<pre class="lang-c prettyprint-override"><code>int main()
{
    int i;
    int count = 0;
    for (i = 0; i &lt; 2000000000; i++) {
        count = count + 1;
    }
}
</code></pre>
<p>I wanted to see how the gcc compiler optimizes this loop (clearly, adding <em>1</em> 2000000000 times should become "add <em>2000000000</em> one time"). So:</p>
<p><strong>gcc test.c</strong> and then <code>time</code> on <code>a.out</code> gives:</p>
<pre><code>real    0m7.717s
user    0m7.710s
sys     0m0.000s
</code></pre>
<p><strong>gcc -O2 test.c</strong> and then <code>time</code> on <code>a.out</code> gives:</p>
<pre><code>real    0m0.003s
user    0m0.000s
sys     0m0.000s
</code></pre>
<p>Then I generated the assembly for both with <code>gcc -S</code>. The first one seems quite clear:</p>
<pre><code>    .file   "test.c"
    .text
.globl main
    .type   main, @function
main:
.LFB0:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    movq    %rsp, %rbp
    .cfi_offset 6, -16
    .cfi_def_cfa_register 6
    movl    $0, -8(%rbp)
    movl    $0, -4(%rbp)
    jmp     .L2
.L3:
    addl    $1, -8(%rbp)
    addl    $1, -4(%rbp)
.L2:
    cmpl    $1999999999, -4(%rbp)
    jle     .L3
    leave
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE0:
    .size   main, .-main
    .ident  "GCC: (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2"
    .section        .note.GNU-stack,"",@progbits
</code></pre>
<p>L3 does the additions; L2 compares <code>-4(%rbp)</code> with <code>1999999999</code> and jumps back to L3 if <code>i &lt; 2000000000</code>.</p>
<p><strong>Now the optimized one:</strong></p>
<pre><code>    .file   "test.c"
    .text
    .p2align 4,,15
.globl main
    .type   main, @function
main:
.LFB0:
    .cfi_startproc
    rep
    ret
    .cfi_endproc
.LFE0:
    .size   main, .-main
    .ident  "GCC: (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2"
    .section        .note.GNU-stack,"",@progbits
</code></pre>
<p>I can't understand at all what's going on there! I've got little knowledge of assembly, but I expected something like</p>
<pre><code>addl $2000000000, -8(%rbp)
</code></pre>
<p>I even tried <strong>gcc -c -g -Wa,-a,-ad -O2 test.c</strong> to see the C code together with the assembly it was converted to, but the result was no clearer than the previous one.</p>
<p><strong>Can someone briefly explain:</strong></p>
<ol>
<li>The <strong>gcc -S -O2</strong> output.</li>
<li>Whether the loop is optimized as I expected (one sum instead of many sums).</li>
</ol>
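<p>One way to probe the second question is to compare the loop from the question with a variant whose result is actually used. The sketch below is only an illustration: the function names <code>count_and_discard</code> and <code>count_up</code> are hypothetical and not part of the original program. In the first function nothing ever reads <code>count</code>, as in the original <code>main</code>, so an optimizing compiler is free to drop the whole loop. In the second function the value is returned, so the compiler has to produce it somehow. Whether it does so with a single computation or keeps a loop depends on the GCC version and flags.</p>

```c
/* Hypothetical helper: same loop as in the question, result never used.
 * Nothing observes count or i, so an optimizer may remove the loop. */
void count_and_discard(int n)
{
    int i;
    int count = 0;
    for (i = 0; i < n; i++) {
        count = count + 1;
    }
}

/* Hypothetical helper: the result is returned, so it is observable
 * and the compiler must compute it (possibly without looping). */
int count_up(int n)
{
    int i;
    int count = 0;
    for (i = 0; i < n; i++) {
        count = count + 1;
    }
    return count;
}
```

<p>Saving this as a separate file and running <code>gcc -O2 -S</code> on it, as in the question, lets the two functions be compared side by side in the generated assembly.</p>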