<p>@Bernard:</p>
<blockquote> <p>I have to admit, most of your example went straight over my head. It does compile, and seems to work, though. Is this safe for SMP systems or SpeedStep?</p> </blockquote>
<p>That's a good question... I think the code's OK. From a practical standpoint, we use it at my company every day, and we run on a pretty wide array of boxes, everything from 2 to 8 cores. Of course, YMMV, etc., but it seems to be a reliable and low-overhead method of timing (low-overhead because it doesn't make a context switch into system space).</p>
<p>Generally, how it works is:</p>
<ul>
<li>declare the block of code to be assembler (and volatile, so the optimizer will leave it alone).</li>
<li>execute the CPUID instruction. In addition to getting some CPU information (which we don't do anything with), it serializes the CPU's instruction stream so that the timings aren't affected by out-of-order execution.</li>
<li>execute the rdtsc (read timestamp counter) instruction. This fetches the number of machine cycles executed since the processor was reset. This is a 64-bit value, so at current CPU speeds (roughly 3 GHz) it will wrap around every 194 years or so. Interestingly, in the original Pentium reference, they note it wraps around every 5800 years or so.</li>
<li>the last couple of lines store the values from the registers into the variables hi and lo, and combine them into the 64-bit return value.</li>
</ul>
<p>Specific notes:</p>
<ul>
<li><p>Out-of-order execution can cause incorrect results, so we execute the "cpuid" instruction, which, in addition to giving you some information about the CPU, also serializes any out-of-order instruction execution.</p></li>
<li><p>Most OSes synchronize the counters on the CPUs when they start, so the answer is good to within a couple of nanoseconds.</p></li>
<li><p>The hibernating comment is probably true, but in practice you probably don't care about timings across hibernation boundaries.</p></li>
<li><p>Regarding SpeedStep: newer Intel CPUs compensate for the speed changes and return an adjusted count. I did a quick scan over some of the boxes on our network and found only one box that didn't have it: a Pentium 3 running some old database server. (These are Linux boxes, so I checked with: grep constant_tsc /proc/cpuinfo)</p></li>
<li><p>I'm not sure about the AMD CPUs. We're primarily an Intel shop, although I know some of our low-level systems gurus did an AMD evaluation.</p></li>
</ul>
<p>Hope this satisfies your curiosity. It's an interesting and (IMHO) under-studied area of programming. You know when Jeff and Joel were talking about whether or not a programmer should know C? I was shouting at them, "hey, forget that high-level C stuff... assembler is what you should learn if you want to know what the computer is doing!"</p>
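<p>For reference, the steps above can be sketched roughly as follows. This is a minimal illustration, not necessarily the exact code from the earlier answer: it assumes GCC or Clang on an x86 or x86-64 target, and the function name <code>read_tsc</code> is made up for this example.</p>

```c
#include <stdint.h>

/* Hypothetical helper: returns the CPU's time-stamp counter.
 * Assumes GCC/Clang extended inline assembly on x86 or x86-64. */
static inline uint64_t read_tsc(void)
{
    uint32_t lo, hi;

    /* volatile: keep the optimizer from removing or reordering the block */
    __asm__ __volatile__(
        "xorl %%eax, %%eax\n\t"  /* select CPUID leaf 0 */
        "cpuid\n\t"              /* serialize; the returned CPU info is ignored */
        "rdtsc\n\t"              /* counter: low half in EAX, high half in EDX */
        : "=a"(lo), "=d"(hi)     /* outputs: EAX -> lo, EDX -> hi */
        :                        /* no inputs */
        : "ebx", "ecx");         /* CPUID also overwrites EBX and ECX */

    /* combine the two 32-bit halves into the 64-bit return value */
    return ((uint64_t)hi << 32) | lo;
}
```

<p>To time something, call <code>read_tsc()</code> before and after the code of interest and subtract the two values. The result is in counter ticks, not seconds, so it still has to be divided by the counter frequency if you want wall-clock time.</p>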
 
