Note that there are some explanatory texts on larger screens.

plurals
  1. POIs this an optimization bug in g++?
    primarykey
    data
    text
    <p>I'm not sure whether I've found a bug in g++ (4.4.1-4ubuntu9), or if I'm doing something wrong. What I believe I'm seeing is a bug introduced by enabling optimization with g++ -O2. I've tried to distill the code down to just the relevant parts.</p> <p>When optimization is enabled, I have an ASSERT which is failing. When optimization is disabled, the same ASSERT does not fail. I think I've tracked it down to the optimization of one function and its callers.</p> <h2>The System</h2> <ul> <li>Language: C++</li> <li>Ubuntu 9.10</li> <li>g++-4.4.real (Ubuntu 4.4.1-4ubuntu9) 4.4.1</li> <li>Linux 2.6.31-22-server x86_64</li> </ul> <p><br></p> <h2>Optimization Enabled</h2> <p>Object compiled with: <code>g++ -DHAVE_CONFIG_H -I. -fPIC -g -O2 -MT file.o -MD -MP -MF .deps/file.Tpo -c -o file.o file.cpp</code></p> <p>And here is the relevant code from <code>objdump -dg file.o</code>.</p> <pre><code>00000000000018b0 &lt;helper_function&gt;: ;; This function takes two parameters: ;; pointer to int: %rdi ;; pointer to int[]: %rsi 18b0: 0f b6 07 movzbl (%rdi),%eax 18b3: 83 f8 12 cmp $0x12,%eax 18b6: 74 60 je 1918 &lt;helper_function+0x68&gt; 18b8: 83 f8 17 cmp $0x17,%eax 18bb: 74 5b je 1918 &lt;helper_function+0x68&gt; ... 1918: c7 06 32 00 00 00 movl $0x32,(%rsi) 191e: 66 90 xchg %ax,%ax 1920: c3 retq 0000000000005290 &lt;buggy_invoker&gt;: ... snip ... 52a0: 48 81 ec c8 01 00 00 sub $0x1c8,%rsp 52a7: 48 8d 84 24 a0 01 00 lea 0x1a0(%rsp),%rax 52ae: 00 52af: 48 c7 84 24 a0 01 00 movq $0x0,0x1a0(%rsp) 52b6: 00 00 00 00 00 52bb: 48 c7 84 24 a8 01 00 movq $0x0,0x1a8(%rsp) 52c2: 00 00 00 00 00 52c7: c7 84 24 b0 01 00 00 movl $0x0,0x1b0(%rsp) 52ce: 00 00 00 00 52d2: 4c 8d 7c 24 20 lea 0x20(%rsp),%r15 52d7: 48 89 c6 mov %rax,%rsi 52da: 48 89 44 24 08 mov %rax,0x8(%rsp) ;; ***** BUG HERE ***** ;; Pointer to int[] loaded into %rsi ;; But where is %rdi populated? 52df: e8 cc c5 ff ff callq 18b0 &lt;helper_function&gt; 0000000000005494 &lt;perfectly_fine_invoker&gt;: 5494: 48 83 ec 20 sub $0x20,%rsp 5498: 0f ae f0 mfence 549b: 48 8d 7c 24 30 lea 0x30(%rsp),%rdi 54a0: 48 89 e6 mov %rsp,%rsi 54a3: 48 c7 04 24 00 00 00 movq $0x0,(%rsp) 54aa: 00 54ab: 48 c7 44 24 08 00 00 movq $0x0,0x8(%rsp) 54b2: 00 00 54b4: c7 44 24 10 00 00 00 movl $0x0,0x10(%rsp) 54bb: 00 ;; Non buggy invocation here: both %rdi and %rsi loaded correctly. 54bc: e8 ef c3 ff ff callq 18b0 &lt;helper_function&gt; </code></pre> <h2>Optimization Disabled</h2> <p>Now compiled with: <code>g++ -DHAVE_CONFIG_H -I. -fPIC -g -O0 -MT file.o -MD -MP -MF .deps/file.Tpo -c -o file.o file.cpp</code></p> <pre><code>0000000000008d27 &lt;helper_function&gt;: ;; Still the same parameters here, but it looks a little different. ... snip ... 8d2b: 48 89 7d e8 mov %rdi,-0x18(%rbp) 8d2f: 48 89 75 e0 mov %rsi,-0x20(%rbp) 8d33: 48 8b 45 e8 mov -0x18(%rbp),%rax 8d37: 0f b6 00 movzbl (%rax),%eax 8d3a: 0f b6 c0 movzbl %al,%eax 8d3d: 89 45 fc mov %eax,-0x4(%rbp) 8d40: 8b 45 fc mov -0x4(%rbp),%eax 8d43: 83 f8 17 cmp $0x17,%eax 8d46: 74 40 je 8d88 &lt;helper_function+0x61&gt; ... 000000000000948a &lt;buggy_invoker&gt;: 948a: 55 push %rbp 948b: 48 89 e5 mov %rsp,%rbp 948e: 41 54 push %r12 9490: 53 push %rbx 9491: 48 81 ec c0 01 00 00 sub $0x1c0,%rsp 9498: 48 89 bd 38 fe ff ff mov %rdi,-0x1c8(%rbp) 949f: 48 89 b5 30 fe ff ff mov %rsi,-0x1d0(%rbp) 94a6: 48 c7 45 c0 00 00 00 movq $0x0,-0x40(%rbp) 94ad: 00 94ae: 48 c7 45 c8 00 00 00 movq $0x0,-0x38(%rbp) 94b5: 00 94b6: c7 45 d0 00 00 00 00 movl $0x0,-0x30(%rbp) 94bd: 48 8d 55 c0 lea -0x40(%rbp),%rdx 94c1: 48 8b 85 38 fe ff ff mov -0x1c8(%rbp),%rax 94c8: 48 89 d6 mov %rdx,%rsi 94cb: 48 89 c7 mov %rax,%rdi ;; ***** NOT BUGGY HERE ***** ;; Now, without optimization, both %rdi and %rsi loaded correctly. 94ce: e8 54 f8 ff ff callq 8d27 &lt;helper_function&gt; 0000000000008eec &lt;different_perfectly_fine_invoker&gt;: 8eec: 55 push %rbp 8eed: 48 89 e5 mov %rsp,%rbp 8ef0: 48 83 ec 30 sub $0x30,%rsp 8ef4: 48 89 7d d8 mov %rdi,-0x28(%rbp) 8ef8: 48 c7 45 e0 00 00 00 movq $0x0,-0x20(%rbp) 8eff: 00 8f00: 48 c7 45 e8 00 00 00 movq $0x0,-0x18(%rbp) 8f07: 00 8f08: c7 45 f0 00 00 00 00 movl $0x0,-0x10(%rbp) 8f0f: 48 8d 55 e0 lea -0x20(%rbp),%rdx 8f13: 48 8b 45 d8 mov -0x28(%rbp),%rax 8f17: 48 89 d6 mov %rdx,%rsi 8f1a: 48 89 c7 mov %rax,%rdi ;; Another example of non-optimized call to that function. 8f1d: e8 05 fe ff ff callq 8d27 &lt;helper_function&gt; </code></pre> <h2>The Original C++ Code</h2> <p>This is a sanitized version of the original C++. I've just changed some names and removed irrelevant code. Forgive my paranoia, I just don't want to expose too much code from unpublished and unreleased work :-).</p> <pre><code>static void helper_function(my_struct_t *e, int *outArr) { unsigned char event_type = e-&gt;header.type; if (event_type == event_A || event_type == event_B) { outArr[0] = action_one; } else if (event_type == event_C) { outArr[0] = action_one; outArr[1] = action_two; } else if (...) { ... } } static void buggy_invoker(my_struct_t *e, predicate_t pred) { // MAX_ACTIONS is #defined to 5 int action_array[MAX_ACTIONS] = {0}; helper_function(e, action_array); ... } static int has_any_actions(my_struct_t *e) { int actions[MAX_ACTIONS] = {0}; helper_function(e, actions); return actions[0] != 0; } // *** ENTRY POINT to this code is this function (note not static). void perfectly_fine_invoker(my_struct_t e, predicate_t pred) { memfence(); if (has_any_actions(&amp;e)) { buggy_invoker(&amp;e, pred); } ... } </code></pre> <p>If you think I've obfuscated or eliminiated too much, let me know. Users of this code call 'perfectly_fine_invoker'. With optimization, g++ optimizes the 'has_any_actions' function away into a direct call to 'helper_function', which you can see in the assembly.</p> <h2>The Question</h2> <p>So, my question is, does it look like a buggy optimization to anyone else?</p> <p>If it would be helpful, I could post a sanitized version of the original C++ code.</p> <p>This is my first posting to Stack Overflow, so please let me know if I can do anything to make the question clearer, or provide any additional information.</p> <h2>The Answer</h2> <p>Edit (several days after the fact):</p> <p>I accepted an answer below to my question -- it was not an optimization bug in g++, I was just looking at the assembly code wrong.</p> <p>However, for whoever may be viewing this question in the future, I've found the answer. I did some reading on undefined behavior in C ( <a href="http://blog.regehr.org/archives/213" rel="nofollow">http://blog.regehr.org/archives/213</a> and <a href="http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html" rel="nofollow">http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html</a> ) and some of the descriptions of the compiler optimizing away functions with undefined behavior seemed eerily familiar.</p> <p>I added some NULL-pointer checks to the function 'helper_function' and lo and behold... bug goes away. I should have had the NULL-pointer checks to begin with, but apparently not having them allowed g++ to do whatever it wanted (in my case, optimize away the call).</p> <p>Hope this information helps someone down the road.</p>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload