Replacing extraordinarily slow pow() function
    <p>We have a CFD solver, and while running a simulation we found that it ran extraordinarily slowly on some machines but not on others. Profiling with Intel VTune pointed to the following line as the problem (in Fortran):</p>
    <pre><code>RHOV= RHO_INF*((1.0_wp - COEFF*EXP(F0)))**(1.0_wp/(GAMM - 1.0_wp)) </code></pre>
    <p>Drilling in with VTune, we traced the problem to the <code>call pow</code> assembly line, and the stack trace showed that it was ending up in <code>__slowpow()</code>. After some searching, <a href="http://entropymine.com/imageworsener/slowpow/">this page</a> turned up, complaining about the same thing.</p>
    <p>On the machine with libc version 2.12, the simulation took 18 seconds. On the machine with libc version 2.14, the simulation took 0 seconds.</p>
    <p>Based on the information on the aforementioned page, the problem arises when the base passed to <code>pow()</code> is close to 1.0. So we ran another simple test in which we multiplied the base by an arbitrary number before the <code>pow()</code> call and then divided the result by that number raised to the same exponent. This dropped the runtime from 18 seconds to 0 seconds with libc 2.12 as well.</p>
    <p>However, it is impractical to apply this by hand everywhere the code does <code>a**b</code>. How would one go about replacing the <code>pow()</code> function in libc? For instance, I would like the <code>call pow</code> assembly line generated by the Fortran compiler to call a custom <code>pow()</code> function that we write, which does the scaling, calls the libc <code>pow()</code>, and then divides by the scaling. How does one create an intermediate layer that is transparent to the compiler?</p>
    <p><strong>Edit</strong></p>
    <p>To clarify, we're looking for something like (pseudo-code):</p>
    <pre><code>double pow(double a, double b) {
    a *= 5.0;                            /* arbitrary scale factor */
    double tmp = pow_from_libc(a, b);
    return tmp / pow_from_libc(5.0, b);
}
</code></pre>
    <p>Is it possible to load the <code>pow</code> from libc and rename it in our custom function to avoid the naming conflict? If the <code>customPow.o</code> file could rename <code>pow</code> from libc, what happens if libc is still needed for other things? Would that cause a naming conflict between <code>pow</code> in <code>customPow.o</code> and <code>pow</code> in libc?</p>
 
