<p>If you're interested in this kind of thing, check out Agner Fog's excellent <a href="http://www.agner.org/optimize/" rel="noreferrer">Software Optimization Manuals</a>. This question is tangentially addressed in the first of the five, <a href="http://www.agner.org/optimize/optimizing_cpp.pdf" rel="noreferrer">Optimizing C++ (pdf)</a> (the others are all about assembly - he's kind of old-school).</p>
<p>If <code>f()</code> is a <code>const</code> function with no side effects, or its return value when called on <code>p</code> is otherwise guaranteed to be unchanged, the call can be pulled out of the loop and evaluated only once (see "Loop Invariant Code Motion", page 70). Note that <code>const</code> alone does not prove this to the compiler, since a <code>const</code> member function can still read state that changes; the compiler has to be able to see that the result is invariant. Most compilers will do this (see "Comparison of Different Compilers", page 74).</p>
<p>If that can't be done, then it might still be possible to devirtualize the call. This can't be done inside a separately callable function, because that function would <em>have</em> to use a virtual lookup for the sake of correctness. But if the function were inlined, and the type of <code>p</code> were known in the calling scope, it could be done. The calling code would have to look something like this:</p>
<pre><code>A* aptr = new A(42); // &lt;- The compiler knows exactly what type aptr points to
acc(aptr, 100);      // &lt;- This would have to be inlined!
</code></pre>
<p>But according to that table (page 74), only the GCC compilers make this optimization.</p>
<p>Finally, the closest optimization (I think) to what you're asking. Could the compiler perform the virtual lookup once, store a function pointer, and then use that function pointer to avoid the virtual lookup inside the loop? I don't see why not.
But I don't know if any compilers do so - it's an obscure enough optimization that it's not even mentioned in Agner Fog's compulsively detailed C++ manual.</p>
<p>For what it's worth, here's what he has to say about function pointers (page 38):</p>
<blockquote>
<p>Calling a function through a function pointer typically takes a few clock cycles more than calling the function directly if the target address can be predicted. The target address is predicted if the value of the function pointer is the same as last time the statement was executed. If the value of the function pointer has changed then the target address is likely to be mispredicted, which causes a long delay. See page 44 about branch prediction. A Pentium M processor may be able to predict the target if the changes of the function pointer follows a simple regular pattern, while Pentium 4 and AMD processors are sure to make a misprediction every time the function pointer has changed.</p>
</blockquote>
<p>And an excerpt about virtual member functions (page 54):</p>
<blockquote>
<p>The time it takes to call a virtual member function is a few clock cycles more than it takes to call a non-virtual member function, provided that the function call statement always calls the same version of the virtual function. If the version changes then you may get a misprediction penalty of 10 - 20 clock cycles. The rules for prediction and misprediction of virtual function calls is the same as for switch statements, as explained on page 45.</p>
<p>The dispatching mechanism can be bypassed when the virtual function is called on an object of known type, but you cannot always rely on the compiler bypassing the dispatch mechanism even when it would be obvious to do so. See page 73.</p>
</blockquote>
<p>You know the function pointer wouldn't change in your example, so you wouldn't get the misprediction penalty, but he never compares function pointer performance to virtual function performance directly.
Both just take "a few" more clock cycles than a regular function call. Maybe it's the same mechanism - if so, that "optimization" would just be adding an extra lookup.</p>
<p>So it's hard to say, really. The best way to get an answer might just be to have your favourite compiler spit out some optimized assembly and dig through it (unpleasant, but conclusive!).</p>
<p>Hope this helps!</p>
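<p>If you do want to dig through the assembly, a small test file along these lines might be a reasonable starting point. It is only a sketch: <code>Base</code>, <code>A</code>, <code>acc_virtual</code>, <code>acc_hoisted</code>, <code>acc_static</code> and <code>check_variants</code> are hypothetical names made up for illustration (the question's real <code>acc</code> and classes may look different), and the hand-written variants only mimic what a compiler might do on its own.</p>

```cpp
#include <cassert>

// Hypothetical hierarchy standing in for the question's classes.
struct Base {
    virtual ~Base() = default;
    virtual int f() const = 0;
};

struct A final : Base {
    explicit A(int v) : value(v) {}
    int f() const override { return value; }  // no side effects, result never changes
    int value;
};

// Baseline: as written, one virtual call per iteration. Whether the
// compiler hoists or devirtualizes it is what the assembly would show.
long acc_virtual(const Base* p, int n) {
    long sum = 0;
    for (int i = 0; i < n; ++i) sum += p->f();
    return sum;
}

// Loop-invariant code motion done by hand: f() is called once.
// Only valid under the assumption that f() is side-effect free and
// returns the same value on every call.
long acc_hoisted(const Base* p, int n) {
    const int v = p->f();  // single virtual dispatch
    long sum = 0;
    for (int i = 0; i < n; ++i) sum += v;
    return sum;
}

// Static type known at the call site: the qualified call T::f()
// bypasses virtual dispatch, so the call can be resolved (and
// inlined) at compile time. T must be a concrete class.
template <class T>
long acc_static(const T* p, int n) {
    long sum = 0;
    for (int i = 0; i < n; ++i) sum += p->T::f();
    return sum;
}

// All three variants should agree: 42 summed 100 times is 4200.
void check_variants() {
    A a(42);
    assert(acc_virtual(&a, 100) == 4200);
    assert(acc_hoisted(&a, 100) == 4200);
    assert(acc_static(&a, 100) == 4200);
}
```

<p>Calling <code>check_variants()</code> from <code>main()</code> and compiling with something like <code>g++ -O2 -S</code> should let you compare the three loops side by side. The "look up once, keep a function pointer" variant isn't shown because standard C++ has no portable way to extract the resolved target of a virtual call as a plain function pointer; calling through a pointer-to-member such as <code>&amp;Base::f</code> still dispatches virtually.</p>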