<p>What is fastest depends entirely on the target architecture. It looks like you are interested only in the platform you happen to be on, which, judging from your execution times, seems to be 64-bit x86, either Intel (Core2?) or AMD.</p>
<p>That said, floating-point multiplication by the inverse will be the fastest option on many platforms, but it is, as you speculate, usually less accurate than a floating-point divide (two roundings instead of one; whether that matters for your usage is a separate question). In general, you are better off rearranging your algorithm to use fewer divides than jumping through hoops to make division as efficient as possible (the fastest division is the one you don't do). Make sure to benchmark before you spend any time optimizing, as algorithms that bottleneck on division are few and far between.</p>
<p>Also, if you have integer sources and need an integer result, make sure to include the cost of converting between integer and floating-point in your benchmarks.</p>
<p>Since you are interested in timings on a specific machine, you should be aware that Intel publishes this information in their <a href="http://www.intel.com/Assets/PDF/manual/248966.pdf" rel="nofollow noreferrer">Optimization Reference Manual (pdf)</a>. Specifically, you will be interested in the tables of Appendix C section 3.1, "Latency and Throughput with Register Operands".</p>
<p>Be aware that integer divide timings depend strongly on the actual values involved. Based on the information in that guide, it seems that your timing routines still have a fair bit of overhead, as the performance ratios you measure don't match Intel's published figures.</p>
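<p>To make the "two roundings instead of one" point concrete, here is a minimal sketch in Python. It assumes Python floats are IEEE 754 doubles, which holds on common platforms. The helper name <code>count_mismatches</code> is hypothetical and not from the question. It only illustrates the accuracy difference and says nothing about speed.</p>

```python
def count_mismatches(d: float, limit: int) -> int:
    """Count integers x in 1..limit where x / d differs from x * (1 / d).

    Hypothetical helper for illustration only.
    """
    inv_d = 1.0 / d  # first rounding: the reciprocal is rounded once here
    mismatches = 0
    for x in range(1, limit + 1):
        exact_divide = x / d       # a single, correctly rounded operation
        via_inverse = x * inv_d    # second rounding on top of the first
        if exact_divide != via_inverse:
            mismatches += 1
    return mismatches


if __name__ == "__main__":
    # 49 is a well-known case where the reciprocal trick loses the last bit.
    print(49 / 49)
    print(49 * (1.0 / 49))
    print(count_mismatches(49.0, 1000))
```

<p>Whether such last-bit differences matter depends on your application, so treat this as a way to check your own divisors rather than as a general verdict.</p>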
 
