  1. Ed Smith
    1. Amazing, thank you @ScottD! In my timing tests using fully optimized ifort on Linux I measured heaviside_c1 = 23.737394, heaviside_a1 = 16.656475, heaviside_a2 = 16.74144 and, for reference, no function call at all = 8.664681. Note that Intel Fortran requires a change to the interface above: add the line !DEC$ ATTRIBUTES STDCALL :: heaviside_a1 (see [link](http://software.intel.com/sites/products/documentation/hpc/composerxe/en-us/2011Update/fortran/lin/bldaps_for/win/bldaps_fortmasmov.htm)).
    2. Thanks for the detailed response. After looking at my ifort-optimized assembly code, I'm not convinced the functions are implemented in anything like two operations (it is typically 6). I've benchmarked your three cases using ifort and gfortran. ifort: H1=25.166176; H2=24.070339; H3=24.727245; gfortran: H1=28.467665; H2=28.508667; H3=28.370693. It appears there is some scope for improvement. I'm trying to implement the suggested "cmpsd" + "andpd" by generating assembly with the Intel compiler's -S option and replacing the relevant instructions, although my assembly language is pretty rusty and the Intel x86_64 assembly documentation is vast and unclear.
    3. Thank you for the comments. Strangely, sign(0.5,x)+1 is slower with an optimized Intel compiler. Fortran bit-testing operations to identify the sign bit are also slow. Is assuming a sign bit reasonable on modern computers? If it is, I could look into coding this check at the assembly level to maximize efficiency.
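
For anyone following the sign-bit and "cmpsd" + "andpd" ideas in the comments above, here is a rough C sketch of both, next to a plain branching reference. It assumes IEEE 754 binary64 doubles, where the sign is the top bit of the 64-bit pattern. The function names are made up for illustration and are not the heaviside_c1 / heaviside_a1 / heaviside_a2 routines that were benchmarked in the thread, so no timing claims attach to them.

```c
#include <stdint.h>
#include <string.h>

/* Branching reference: H(x) = 1 for x >= 0, otherwise 0. */
double heaviside_ref(double x)
{
    return x >= 0.0 ? 1.0 : 0.0;
}

/* Sign-bit version (hypothetical name). Assumes IEEE 754 binary64,
 * so bit 63 is the sign bit. Note that -0.0 has the sign bit set and
 * therefore maps to 0 here, unlike the comparison-based versions. */
double heaviside_signbit(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);   /* well-defined way to read the bit pattern */
    return (double)(1u - (unsigned)(bits >> 63));
}

/* Mask version (hypothetical name): a portable imitation of the
 * compare-then-AND idea. The comparison yields 0 or 1, which is
 * widened to an all-zeros or all-ones mask and ANDed with the bit
 * pattern of 1.0. */
double heaviside_mask(double x)
{
    const double one = 1.0;
    uint64_t one_bits;
    uint64_t mask;
    double result;

    memcpy(&one_bits, &one, sizeof one_bits);
    mask = (uint64_t)0 - (uint64_t)(x >= 0.0);  /* 0x000...0 or 0xFFF...F */
    one_bits &= mask;
    memcpy(&result, &one_bits, sizeof result);
    return result;
}
```

The two bit-level variants agree with the reference for ordinary positive and negative inputs and differ only at negative zero, so which one is appropriate depends on whether H(-0.0) matters for the calculation. Whether a compiler actually turns the mask version into a compare and an AND is something to confirm in the generated assembly.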