
These two statements accomplish nothing, because `DBL_EPSILON` is smaller than 1 ulp for these numbers:

```c
double x = 3000000.3157;
double y = x + DBL_EPSILON;
```

To be sure, I printed the hexadecimal representation of both `x` and `y` and got the following:

```
4146E3602868DB8C
4146E3602868DB8C
```

When I run the example at the bottom of your question through a couple of different versions of G++ (4.4.5 and 4.8.0) with optimization both on (`-O3`) and off (no flags), I get the following output:

```
false false true
false false true
0 0
```

I suspect the behavior you're seeing is for precisely the reason you postulate: your compiler is carrying greater precision for intermediate results, and that's bleeding through to these comparisons.

What version of the compiler are you using, and does other code in the application adjust any of the rounding modes? What compile flags are you using?

---

**EDIT 1**

I was able to reproduce your behavior by recompiling with optimization *off* and in 32-bit mode. In that mode, I see that the compiler leaves the result of `foo` on the floating-point stack:

```
_Z3food:
.LFB1053:
	.cfi_startproc
	pushl	%ebp	#
	.cfi_def_cfa_offset 8
	.cfi_offset 5, -8
	movl	%esp, %ebp	#,
	.cfi_def_cfa_register 5
	subl	$8, %esp	#,
	movl	8(%ebp), %eax	# x, tmp61
	movl	%eax, -8(%ebp)	# tmp61, x
	movl	12(%ebp), %eax	# x, tmp62
	movl	%eax, -4(%ebp)	# tmp62, x
	fldl	-8(%ebp)	# x
	fldl	.LC0	#
	fmulp	%st, %st(1)	#,
	leave
```

That suggests that this is a quirk of the i386 ABI. To test this theory, I looked at the i386 ABI more closely. On [page 38 of this PDF](http://www.sco.com/developers/devspecs/abi386-4.pdf) (aka. "page 3-12" by the internal page numbers), I find what is likely the smoking gun:

> `%st(0)` *Floating-point return values* appear on the top of the floating-point register stack; there is no difference in the representation of single- or double-precision values in floating-point registers. If the function does not return a floating-point value, then this register must be empty. This register must be empty before entry to a function.

It goes on to say a few paragraphs later:

> A floating-point return value appears on the top of the Intel387 register stack. The caller then must remove the value from the Intel387 stack, even if it doesn't use the value. Failure of either side to meet its obligations leads to undefined program behavior. The standard calling sequence does not include any method to detect such failures nor to detect return value type mismatches. Therefore the user must declare all functions properly. **There is no difference in the representation of single-, double- or extended-precision values in floating-point registers.**

Searching further down to pages 3-27 (PDF page 53) and 3-28 (PDF page 54) gives the following confusing twists. The table in Figure 3-30 suggests that the initial rounding mode is "53-bit (double precision)", and that that's the mode at process initialization.

It goes on further to give the following warning on the next page:

> The initial floating-point state should be changed with care. In particular, many floating-point routines may produce undefined behavior if the precision control is set to less than 53 bits. The `_fpstart` routine (see Chapter 6) **changes the precision control to 64 bits** and sets all exceptions to be masked. This is the default state required for conformance to the ANSI C standard and to the IEEE 754 Floating-point standard.

A [couple](http://www.vinc17.org/research/extended.en.html) of [reference](http://www.network-theory.co.uk/docs/gccintro/gccintro_70.html)s on the net indicate Linux does indeed set the x87 to extended precision (at least in the 32-bit ABI).

---

**EDIT 2**

It appears extended precision is indeed the culprit. I added the following code to the test case, [as suggested by this page](http://www.network-theory.co.uk/docs/gccintro/gccintro_70.html):

```c
void set_fpu (unsigned int mode)
{
    asm ("fldcw %0" : : "m" (*&mode));
}

// ...

set_fpu(0x27F);
```

With those lines added, the test case returns the same values I saw with the 64-bit ABI.

So, assuming you're compiling a 32-bit program under Linux, this appears to be the reason you're seeing strange comparison and sorting results.

Can you re-run your sorting and searching code with the FPU set to 53-bit precision as I did above, and see if that resolves the differences you saw between your two lambda expressions?
 
