<p>Firstly, processors have a capability called <b><a href="http://en.wikipedia.org/wiki/Branch_predictor" rel="nofollow noreferrer">branch prediction</a></b>. After a few runs of the loop, the processor will be able to notice that your <code>if</code> statement always goes one way. (It can even notice regular patterns, like <code>true false true false</code>.) It will then <b><a href="http://en.wikipedia.org/wiki/Speculative_execution" rel="nofollow noreferrer">speculatively execute</a></b> that branch, and so long as it is able to predict correctly, the extra cost of the <code>if</code> statement is pretty much eliminated. If you think that the user is more likely to choose <code>true</code> rather than <code>false</code>, you can even <a href="http://kerneltrap.org/node/4705" rel="nofollow noreferrer">tell this to the gcc compiler</a> (a gcc-specific extension, <code>__builtin_expect</code>).</p>
<p>However, you did mention in one of your comments that you have a 'much more complicated sequence of bools'. I think it is possible that the processor doesn't have the memory to pattern-match all those jumps: by the time it comes back to the first <code>if</code> statement, the knowledge of which way that jump went has been displaced from its memory. But we could help it here...</p>
<p>The compiler has the ability to <b>transform loops and if-statements</b> into what it thinks are more efficient forms. E.g. it could possibly transform your code into the form given by schnaader. This is known as <a href="http://en.wikipedia.org/wiki/Loop_unswitching" rel="nofollow noreferrer">loop unswitching</a>. You can help it along by doing <b><a href="http://en.wikipedia.org/wiki/Profile-guided_optimization" rel="nofollow noreferrer">Profile-Guided Optimization (PGO)</a></b>, letting the compiler know where the hotspots are.
(Note: In gcc, <code>-funswitch-loops</code> is only turned on at <code>-O3</code>.)</p>
<p>You should <b>profile</b> your code at the instruction level (<a href="http://software.intel.com/en-us/intel-vtune/" rel="nofollow noreferrer">VTune</a> would be a good tool for this) to see if the if-statements are really the bottleneck. If they really are, and if by looking at the generated assembly you think the compiler has got it wrong despite PGO, you can try hoisting out the if-statement yourself. Perhaps templated code would make it more convenient:</p>
<pre><code>template&lt;bool B&gt; void innerLoop() {
    for (int i = 0; i &lt; 10000; i++) {
        if (B) {
            // some stuff..
        } else {
            // some other stuff..
        }
    }
}

if (user_set_flag) innerLoop&lt;true&gt;();
else innerLoop&lt;false&gt;();
</code></pre>
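<p>To make the hoisting idea concrete, here is a minimal compilable sketch of the same pattern. The loop bodies (adding or subtracting each element) and the names <code>branchyLoop</code> and <code>run</code> are placeholders assumed purely for illustration; they stand in for whatever "some stuff" and "some other stuff" are in your real code. Whether this actually beats the in-loop branch depends on your compiler and workload, so measure before relying on it.</p>

```cpp
#include <cstddef>
#include <vector>

// B is a compile-time constant, so each instantiation only needs one arm
// of the if; an optimizing compiler will normally drop the dead arm.
template <bool B>
long long innerLoop(const std::vector<int>& data) {
    long long acc = 0;
    for (std::size_t i = 0; i < data.size(); ++i) {
        if (B) {
            acc += data[i];   // placeholder for "some stuff"
        } else {
            acc -= data[i];   // placeholder for "some other stuff"
        }
    }
    return acc;
}

// Reference version (hypothetical name): the runtime flag is tested on
// every iteration unless the compiler unswitches the loop itself.
long long branchyLoop(const std::vector<int>& data, bool flag) {
    long long acc = 0;
    for (std::size_t i = 0; i < data.size(); ++i) {
        if (flag) {
            acc += data[i];
        } else {
            acc -= data[i];
        }
    }
    return acc;
}

// Hypothetical dispatcher: the runtime flag is tested once, outside the loop.
long long run(const std::vector<int>& data, bool user_set_flag) {
    return user_set_flag ? innerLoop<true>(data) : innerLoop<false>(data);
}
```

<p>Calling <code>run(data, user_set_flag)</code> gives the same result as <code>branchyLoop(data, user_set_flag)</code>; the only difference is where the flag is tested. With several independent flags the number of instantiations doubles per flag, so this is probably only worth doing for the one or two branches the profiler actually points at.</p>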
 
