<p>I think you are seeing a real difference, but it is just the function call overhead. Branch misprediction, memory access and the trig functions cost the same in both cases. Compared to those, the call overhead is not that big a deal, though the function pointer case was definitely a bit quicker when I tried it.</p>

<p>If this is representative of your larger program, it is a good demonstration that this type of micro-optimization is sometimes just a drop in the ocean, and at worst futile. Leaving that aside, for a clearer test the functions should each perform some simpler operation that is different for each function:</p>

<pre><code>void function_not( double *d ) {
    *d = 1.0;
}

void function_and( double *d ) {
    *d = 2.0;
}
</code></pre>

<p>And so on, and similarly for the virtual functions.</p>

<p>(Each function should do something different, so that identical bodies don't get merged and all end up with the same address; that would make the branch prediction work unrealistically well.)</p>

<p>With these changes, the results are a bit different. These are the best of 4 runs in each case. (Not very scientific, but the numbers are broadly similar for larger numbers of runs.) All timings are in cycles, running on my laptop. The code was compiled with VC++ (I only changed the timing), but gcc implements virtual function calls in the same way, so the relative timings should be broadly similar even with a different OS, x86 CPU or compiler.</p>

<p>Function pointers: 2,052,770</p>

<p>Virtuals: 3,598,039</p>

<p>That difference seems a bit excessive! Sure enough, the two bits of code aren't quite the same in terms of their memory access behaviour. The second one should have a table of 4 <code>A *</code>s, used to fill in <code>base</code>, rather than new'ing up a new object for each entry. Both examples will then have similar behaviour (1 cache miss/N entries) when fetching the pointer to jump through. For example:</p>

<pre><code>A *tbl[4] = { new A1, new A2, new A3, new A4 };

for ( long int i = 0; i &lt; 100000; ++i ) {
    array[i] = ( double )( rand() / 1000 );
    base[i] = tbl[ rand() % 4 ];
}
</code></pre>

<p>With this in place, still using the simplified functions:</p>

<p>Virtuals (as suggested here): 2,487,699</p>

<p>So the virtual calls are about 20% slower, best case. Close enough?</p>

<p>So perhaps your colleague was right to at least consider this, but I suspect that in any realistic program the call overhead won't be enough of a bottleneck to be worth jumping through hoops over.</p>
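<p>For anyone who wants to try this themselves, here is a minimal sketch of the fairer comparison described above. It is not the code I timed: the names <code>function_or</code>, <code>function_xor</code>, <code>apply</code>, <code>make_choices</code>, <code>run_function_pointers</code> and <code>run_virtuals</code> are made up for illustration, and it times with <code>std::chrono</code> in microseconds rather than counting cycles. Both variants pick from a table of 4 shared targets using the same random choices, so their memory access patterns should be comparable.</p>

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Each function stores a different constant so identical bodies cannot be
// merged into a single address. function_or/function_xor are hypothetical.
void function_not(double *d) { *d = 1.0; }
void function_and(double *d) { *d = 2.0; }
void function_or(double *d)  { *d = 3.0; }
void function_xor(double *d) { *d = 4.0; }

using Fn = void (*)(double *);

// Virtual equivalent; the method name "apply" is an assumption.
struct A  { virtual ~A() = default; virtual void apply(double *d) = 0; };
struct A1 : A { void apply(double *d) override { *d = 1.0; } };
struct A2 : A { void apply(double *d) override { *d = 2.0; } };
struct A3 : A { void apply(double *d) override { *d = 3.0; } };
struct A4 : A { void apply(double *d) override { *d = 4.0; } };

// One random target index (0..3) per entry, shared by both variants.
std::vector<int> make_choices(std::size_t n, unsigned seed) {
    std::srand(seed);
    std::vector<int> choices(n);
    for (int &c : choices) c = std::rand() % 4;
    return choices;
}

// Returns the elapsed time of the call loop in microseconds.
double run_function_pointers(const std::vector<int> &choices,
                             std::vector<double> &array) {
    assert(array.size() == choices.size());
    static const Fn tbl[4] = { function_not, function_and,
                               function_or, function_xor };
    std::vector<Fn> fns(choices.size());
    for (std::size_t i = 0; i < choices.size(); ++i) fns[i] = tbl[choices[i]];

    auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < fns.size(); ++i) fns[i](&array[i]);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::micro>(stop - start).count();
}

// Same loop, but dispatching through 4 shared objects instead of one
// freshly allocated object per entry.
double run_virtuals(const std::vector<int> &choices,
                    std::vector<double> &array) {
    assert(array.size() == choices.size());
    static A1 a1; static A2 a2; static A3 a3; static A4 a4;
    static A *const tbl[4] = { &a1, &a2, &a3, &a4 };
    std::vector<A *> base(choices.size());
    for (std::size_t i = 0; i < choices.size(); ++i) base[i] = tbl[choices[i]];

    auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < base.size(); ++i) base[i]->apply(&array[i]);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::micro>(stop - start).count();
}
```

<p>Call <code>make_choices</code> once, pass the result to both <code>run_*</code> functions with separate <code>array</code> vectors of the same length, and compare the best of several runs. Wall-clock numbers on a different machine and compiler won't match the cycle counts above, but the relative gap is the interesting part.</p>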