  1. How to efficiently compare the sign of two floating-point values while handling negative zeros
    <p>Given two floating-point numbers, I'm looking for an <em>efficient</em> way to check if they have the same sign, <em>given that if either of the two values is zero (+0.0 or -0.0), they should be considered to have the same sign</em>.</p>
    <p>For instance:</p>
    <ul>
    <li>SameSign(1.0, 2.0) should return true</li>
    <li>SameSign(-1.0, -2.0) should return true</li>
    <li>SameSign(-1.0, 2.0) should return false</li>
    <li><b>SameSign(0.0, 1.0) should return true</b></li>
    <li><b>SameSign(0.0, -1.0) should return true</b></li>
    <li><b>SameSign(-0.0, 1.0) should return true</b></li>
    <li><b>SameSign(-0.0, -1.0) should return true</b></li>
    </ul>
    <p>A naive but correct implementation of <code>SameSign</code> in C++ would be:</p>
    <pre><code>bool SameSign(float a, float b)
{
    // Either zero (+0.0 or -0.0) matches any sign.
    if (fabs(a) == 0.0f || fabs(b) == 0.0f)
        return true;

    return (a &gt;= 0.0f) == (b &gt;= 0.0f);
}
</code></pre>
    <p>Assuming the IEEE floating-point model, here's a variant of <code>SameSign</code> that compiles to branchless code (at least with Visual C++ 2008):</p>
    <pre><code>bool SameSign(float a, float b)
{
    int ia = binary_cast&lt;int&gt;(a);
    int ib = binary_cast&lt;int&gt;(b);

    int az = (ia &amp; 0x7FFFFFFF) == 0;  // a is +0.0 or -0.0
    int bz = (ib &amp; 0x7FFFFFFF) == 0;  // b is +0.0 or -0.0
    int ab = (ia ^ ib) &gt;= 0;          // sign bits are equal

    return (az | bz | ab) != 0;
}
</code></pre>
    <p>with <code>binary_cast</code> defined as follows:</p>
    <pre><code>template &lt;typename Target, typename Source&gt;
inline Target binary_cast(Source s)
{
    union
    {
        Source  m_source;
        Target  m_target;
    } u;

    u.m_source = s;
    return u.m_target;
}
</code></pre>
    <p>I'm looking for two things:</p>
    <ol>
    <li><p><b>A faster, more efficient implementation of <code>SameSign</code></b>, using bit tricks, FPU tricks or even SSE intrinsics.</p></li>
    <li><p><b>An efficient extension of <code>SameSign</code> to three values</b>.</p></li>
    </ol>
    <p>Edit:</p>
    <p>I've made some performance measurements on the three variants of <code>SameSign</code> (the two variants described in the original question, plus Stephen's). Each function was run 200-400 times, on all consecutive pairs of values in an array of 101 floats filled at random with -1.0, -0.0, +0.0 and +1.0. Each measurement was repeated 2000 times and the minimum time was kept (to weed out all cache effects and system-induced slowdowns). The code was compiled with Visual C++ 2008 SP1 with maximum optimization and SSE2 code generation enabled. The measurements were done on a Core 2 Duo P8600 2.4 GHz.</p>
    <p>Here are the timings, not counting the overhead of fetching input values from the array, calling the function and retrieving the result (which amounts to 6-7 clock ticks):</p>
    <ul>
    <li>Naive variant: 15 ticks</li>
    <li>Bit magic variant: 13 ticks</li>
    <li><b>Stephen's variant: 6 ticks</b></li>
    </ul>
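    <p>To make the required behaviour concrete, here is a minimal, self-contained sketch that checks the two variants from the question against the example cases and against each other. The names <code>SameSignNaive</code>, <code>SameSignBits</code>, <code>CheckExamples</code> and <code>CountMismatches</code> are hypothetical, introduced only for this illustration. It also assumes a <code>std::memcpy</code>-based <code>binary_cast</code> in place of the union, since reading an inactive union member is not guaranteed by the C++ standard (many compilers support it as an extension). NaN inputs are not considered.</p>

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Assumption: memcpy-based replacement for the union-based binary_cast
// shown in the question. Copies the object representation bit for bit.
template <typename Target, typename Source>
inline Target binary_cast(Source s)
{
    static_assert(sizeof(Target) == sizeof(Source), "size mismatch");
    Target t;
    std::memcpy(&t, &s, sizeof(t));
    return t;
}

// Naive variant from the question (renamed for this sketch).
bool SameSignNaive(float a, float b)
{
    if (std::fabs(a) == 0.0f || std::fabs(b) == 0.0f)
        return true;
    return (a >= 0.0f) == (b >= 0.0f);
}

// Bit-level variant from the question (renamed for this sketch).
// Assumes 32-bit IEEE 754 floats.
bool SameSignBits(float a, float b)
{
    std::int32_t ia = binary_cast<std::int32_t>(a);
    std::int32_t ib = binary_cast<std::int32_t>(b);
    int az = (ia & 0x7FFFFFFF) == 0;  // a is +0.0 or -0.0
    int bz = (ib & 0x7FFFFFFF) == 0;  // b is +0.0 or -0.0
    int ab = (ia ^ ib) >= 0;          // sign bits are equal
    return (az | bz | ab) != 0;
}

// Hypothetical helper: asserts the seven examples listed in the question.
void CheckExamples(bool (*f)(float, float))
{
    assert(f(1.0f, 2.0f));
    assert(f(-1.0f, -2.0f));
    assert(!f(-1.0f, 2.0f));
    assert(f(0.0f, 1.0f));
    assert(f(0.0f, -1.0f));
    assert(f(-0.0f, 1.0f));
    assert(f(-0.0f, -1.0f));
}

// Hypothetical helper: counts the pairs on which the two variants disagree.
int CountMismatches()
{
    const float samples[] = {-2.0f, -1.0f, -0.0f, 0.0f, 1.0f, 2.0f};
    int mismatches = 0;
    for (float a : samples)
        for (float b : samples)
            if (SameSignNaive(a, b) != SameSignBits(a, b))
                ++mismatches;
    return mismatches;
}
```

    <p>Calling <code>CheckExamples(SameSignNaive)</code> and <code>CheckExamples(SameSignBits)</code> exercises the example cases, and <code>CountMismatches()</code> compares the two variants over a small grid of values that includes both zeros. A faster candidate could be passed through the same checks before it is timed.</p>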