<p>The most likely cause of the speed improvement is that:</p>
<ul> <li>inserting a MOV shifts the subsequent instructions to different memory addresses</li> <li>one of those moved instructions was an important conditional branch</li> <li>that branch was being incorrectly predicted due to aliasing in the branch prediction table</li> <li>moving the branch eliminated the alias and allowed the branch to be predicted correctly</li> </ul>
<p>Your Core2 doesn't keep a separate history record for each conditional jump. Instead it keeps a shared history of all conditional jumps. One disadvantage of <a href="http://en.wikipedia.org/wiki/Branch_predictor#Global_branch_prediction" rel="noreferrer">global branch prediction</a> is that the history is diluted by irrelevant information if the different conditional jumps are uncorrelated.</p>
<p>This little <a href="http://www.ece.unm.edu/~jimp/611/slides/chap4_5.html" rel="noreferrer">branch prediction tutorial</a> shows how branch prediction buffers work. The prediction buffer is indexed by the lower portion of the address of the branch instruction. This works well unless two important uncorrelated branches share the same lower bits. In that case, you end up with aliasing, which causes many mispredicted branches (which stall the instruction pipeline and slow your program).</p>
<p>If you want to understand how branch mispredictions affect performance, take a look at this excellent answer: <a href="https://stackoverflow.com/a/11227902/1001643">https://stackoverflow.com/a/11227902/1001643</a></p>
<p>Compilers typically don't have enough information to know which branches will alias and whether those aliases will be significant. However, that information can be determined at runtime with tools such as <a href="http://valgrind.org/docs/manual/cg-manual.html" rel="noreferrer">Cachegrind</a> and <a href="http://software.intel.com/en-us/forums/topic/392268" rel="noreferrer">VTune</a>.</p>
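<p>To make the aliasing effect concrete, here is a toy simulation. It is only a sketch under loud assumptions: it models a small table of 2-bit saturating counters indexed by the low bits of the branch address, which is the textbook scheme from the tutorial and not the actual Core2 predictor. The function name <code>simulate</code>, the addresses, and the table size are all made up for illustration.</p>

```python
def simulate(branches, table_bits, iterations):
    """Count mispredictions in a toy branch prediction buffer.

    branches:   list of (address, taken) pairs executed once per iteration
    table_bits: the table has 2**table_bits entries, indexed by the low
                bits of the branch address (hypothetical size)
    """
    mask = (1 << table_bits) - 1
    # One 2-bit saturating counter per entry: 0-1 predict not taken,
    # 2-3 predict taken. Start every entry at "weakly not taken".
    table = [1] * (1 << table_bits)
    misses = 0
    for _ in range(iterations):
        for address, taken in branches:
            index = address & mask
            predicted_taken = table[index] >= 2
            if predicted_taken != taken:
                misses += 1
            # Train the counter toward the actual outcome.
            if taken:
                table[index] = min(3, table[index] + 1)
            else:
                table[index] = max(0, table[index] - 1)
    return misses


# Two uncorrelated branches whose addresses share the same low 4 bits:
# one is always taken, the other never taken, and they fight over one entry.
aliased = [(0x1000, True), (0x2000, False)]

# The same branches after an inserted instruction shifts the second one
# by a few bytes, so it lands in a different table entry.
shifted = [(0x1000, True), (0x2004, False)]

print(simulate(aliased, table_bits=4, iterations=1000))  # 2000
print(simulate(shifted, table_bits=4, iterations=1000))  # 1
```

<p>In the aliased layout the two branches keep undoing each other's training, so every prediction is wrong. After the shift each branch owns its own counter and only the first warm-up prediction misses. Real predictors are far more elaborate, but the mechanism by which a small change in code layout can change the misprediction rate is the same.</p>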