<h1>POSIX Clocks</h1>
<p>I wrote a benchmark for POSIX clock sources:</p>
<ul>
<li>time (s) => 3 cycles</li>
<li>ftime (ms) => 54 cycles</li>
<li>gettimeofday (us) => 42 cycles</li>
<li>clock_gettime (ns) => 9 cycles (CLOCK_MONOTONIC_COARSE)</li>
<li>clock_gettime (ns) => 9 cycles (CLOCK_REALTIME_COARSE)</li>
<li>clock_gettime (ns) => 42 cycles (CLOCK_MONOTONIC)</li>
<li>clock_gettime (ns) => 42 cycles (CLOCK_REALTIME)</li>
<li>clock_gettime (ns) => 173 cycles (CLOCK_MONOTONIC_RAW)</li>
<li>clock_gettime (ns) => 179 cycles (CLOCK_BOOTTIME)</li>
<li>clock_gettime (ns) => 349 cycles (CLOCK_THREAD_CPUTIME_ID)</li>
<li>clock_gettime (ns) => 370 cycles (CLOCK_PROCESS_CPUTIME_ID)</li>
<li>rdtsc (cycles) => 24 cycles</li>
</ul>
<p>These numbers are from an Intel Core i7-4771 CPU @ 3.50GHz on Linux 4.0. Each measurement uses the TSC register for cycle counting, runs the clock call thousands of times, and takes the minimum cost observed.</p>
<p>You'll want to test on the machines you intend to run on, though, as the implementation of these clocks varies with hardware and kernel version. The code can be found <a href="https://github.com/dterei/Scraps/blob/master/intel_tsc/eval_clocks.c">here</a>. It relies on the TSC register for cycle counting, using a helper in the same repo (<a href="https://github.com/dterei/Scraps/blob/master/intel_tsc/tsc.h">tsc.h</a>).</p>
<h1>TSC</h1>
<p>Accessing the TSC (processor time-stamp counter) is the most accurate and cheapest way to time things. Generally, this is what the kernel itself uses. It's also quite straightforward on modern Intel chips, as the TSC is synchronized across cores and unaffected by frequency scaling, so it provides a simple, global time source. You can see an example of using it <a href="https://github.com/dterei/Scraps/blob/master/intel_tsc/tsc.h">here</a> with a walkthrough of the assembly code <a href="https://github.com/dterei/Scraps/blob/master/intel_tsc/using_tsc.c">here</a>.</p>
<p>The main issue with this (other than portability) is that there doesn't seem to be a good way to go from cycles to nanoseconds. The Intel docs, as far as I can find, state that the TSC runs at a fixed frequency, but that this frequency may differ from the processor's stated frequency. Intel doesn't appear to provide a reliable way to determine the TSC frequency. The Linux kernel appears to solve this by counting how many TSC cycles elapse between two readings of a hardware timer (see <a href="http://lxr.free-electrons.com/source/arch/x86/kernel/tsc.c?v=2.6.31#L399">here</a>).</p>
<h1>Memcached</h1>
<p>Memcached goes to the trouble of using the cache method. It may simply be to make performance more predictable across platforms, or to scale better with multiple cores. It may also not be a worthwhile optimization.</p>
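<p>As a rough illustration of the two ideas above (timing a clock call in TSC cycles and taking the minimum over many runs, and estimating the TSC frequency by counting cycles across an interval measured by another timer), here is a minimal sketch. It is not the code from the linked repo: the function names <code>min_cost_cycles</code> and <code>estimate_tsc_hz</code> are made up for this example, it assumes GCC or Clang on x86 Linux, and it reads the TSC with the plain <code>__rdtsc()</code> intrinsic without any serializing instructions, so the numbers include some measurement overhead.</p>

```c
#define _GNU_SOURCE          /* for CLOCK_MONOTONIC_RAW on Linux */
#include <assert.h>
#include <stdint.h>
#include <time.h>
#include <x86intrin.h>       /* __rdtsc(), GCC/Clang on x86 */

/* Hypothetical helper: minimum observed cost, in TSC cycles, of a single
 * clock_gettime(id) call over `runs` attempts. The result includes the
 * overhead of the two TSC reads themselves. */
static uint64_t min_cost_cycles(clockid_t id, int runs)
{
    assert(runs > 0);
    struct timespec ts;
    uint64_t best = UINT64_MAX;
    for (int i = 0; i < runs; i++) {
        uint64_t start = __rdtsc();
        clock_gettime(id, &ts);
        uint64_t end = __rdtsc();
        if (end - start < best)
            best = end - start;   /* keep the cheapest run */
    }
    return best;
}

/* Hypothetical helper: rough TSC frequency in Hz. Busy-waits for at least
 * `wait_ns` nanoseconds as measured by CLOCK_MONOTONIC_RAW and divides the
 * TSC ticks counted by the time that elapsed. */
static double estimate_tsc_hz(int64_t wait_ns)
{
    assert(wait_ns > 0);
    struct timespec t0, t1;
    int64_t elapsed;

    clock_gettime(CLOCK_MONOTONIC_RAW, &t0);
    uint64_t c0 = __rdtsc();
    do {
        clock_gettime(CLOCK_MONOTONIC_RAW, &t1);
        elapsed = (int64_t)(t1.tv_sec - t0.tv_sec) * 1000000000LL
                + (int64_t)(t1.tv_nsec - t0.tv_nsec);
    } while (elapsed < wait_ns);
    uint64_t c1 = __rdtsc();

    return (double)(c1 - c0) * 1e9 / (double)elapsed;
}
```

<p>Calling <code>min_cost_cycles(CLOCK_MONOTONIC, 10000)</code> gives a per-call cost comparable in spirit to the list above, and dividing a cycle count by <code>estimate_tsc_hz(10000000) / 1e9</code> converts it to nanoseconds. A longer wait gives a steadier frequency estimate; the kernel's own calibration is considerably more careful than this.</p>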