<p>I noticed a few things. If I build your <code>-O3 -S</code> assembler listing with GCC 4.5.3 using <code>g++ logflt.S -lrt</code>, I can reproduce the behavior. My timings are:</p>
<pre><code>ln=6:984160044
lb=6:950842852
lg=3:64288522
</code></pre>
<p>Then I examined the output with <code>objdump -SC a.out</code>. I prefer this to reading the <code>.S</code> files, since they contain constructs which I do not (yet) understand. The code is not very easy to read, but I found the following:</p>
<p>Before calling <code>log</code> or <code>log2</code>, the argument is converted using</p>
<pre><code>400900: f2 0f 2a c3             cvtsi2sd %ebx,%xmm0
400904: 66 0f 57 c9             xorpd %xmm1,%xmm1
400908: f2 0f 59 05 60 04 00    mulsd 0x460(%rip),%xmm0
40090f: 00
400910: 66 0f 2e c8             ucomisd %xmm0,%xmm1
</code></pre>
<p><code>0x460(%rip)</code> is a RIP-relative address pointing to the hex value <code>0000 00000000 33333333 33332440</code>. This is a 16-byte SSE <code>double</code> pair of which only one double matters (the code uses scalar SSE). That double is <code>10.1</code>, so <code>mulsd</code> performs the multiplication from the C++ line <code>m = n * 10.1;</code>.</p>
<p><code>log10</code> is different:</p>
<pre><code>400a40: f2 0f 2a c3             cvtsi2sd %ebx,%xmm0
400a44: 66 0f 57 c9             xorpd %xmm1,%xmm1
400a48: 66 0f 2e c8             ucomisd %xmm0,%xmm1
</code></pre>
<p><strong>I think for the case of <code>log10</code> you forgot to perform the multiplication!</strong> So you are just calling <code>log10</code> with the same value again and again ... It would not surprise me if the CPU is clever enough to optimize that.</p>
<p>EDIT: I am now very sure this is the problem, because in your other listing (<code>-O0 -S</code>) the multiplication is performed correctly. So <strong>please post your code</strong> and let others prove me wrong!</p>
<p>EDIT2: One way GCC <em>could</em> get rid of this multiplication is by using the following identity:</p>
<pre><code>log(n * 10.1) = log(n) + log(10.1)
</code></pre>
<p>But in that case <code>log(10.1)</code> would have to be computed once, and I do not see this in the code. I also doubt that GCC would do that for <code>log10</code> but not for <code>log</code> and <code>log2</code>.</p>