  1. Tuning Mathematical Parallel Codes
<p>Assuming that I care about performance rather than portability for my multi-threaded iterative linear algebra solver, and that I have the results of profiling my code in hand, how do I go about tuning the code to run optimally on the machine of my choice?</p>
<p>The algorithm involves matrix-vector multiplications, norms, and dot products. (FWIW, I am working on CG and GMRES.)</p>
<p>The matrices I work with are roughly as large as the full RAM (~6 GB). I'll be working on an Intel i3 laptop, and I'll be linking my code against Intel MKL.</p>
<p>Specifically,</p>
<ul>
<li><p><strong>Is there a good resource (PDF/book/paper) for learning manual tuning?</strong> There are numerous things that I learnt by doing, for instance that manual unrolling isn't always optimal, or how compiler flags behave, but I would prefer a centralized resource.</p></li>
<li><p>I need something <strong>to translate profiler information into improved performance.</strong> For instance, my profiler tells me that the stack of one processor is being accessed by another, or that my <code>mulpd</code> ASM instruction is taking too much time. I have no clue what these mean or how I could use this information to improve my code.</p></li>
</ul>
<p><em>My intention is to spend as much time as needed to squeeze out as much compute power as possible. It's more of a learning experience than for actual use or distribution as of now.</em></p>
<p>(I am concerned with manual tuning, not auto-tuning.)</p>
<p>Misc details:</p>
<ul>
<li>This differs from <em>usual</em> performance tuning since the major portions of the code are linked to Intel's proprietary MKL library.</li>
<li>Because of memory bandwidth issues in the O(N^2) matrix-vector multiplications, and because of dependencies, there is a limit to what I could manage on my own through <em>simple</em> observation.</li>
<li>I write in C and Fortran. I have tried both and, as discussed a million times on SO, I found no difference between them if I tweak them appropriately.</li>
</ul>
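<p>To make the memory bandwidth point concrete, here is a rough sketch in C of the arithmetic behind it. It assumes a dense N-by-N matrix in double precision that is streamed from memory once per product. The function names and the bandwidth parameter are hypothetical, chosen only for illustration, and nothing here is measured.</p>

```c
#include <stddef.h>

/* Floating-point operations in a dense n-by-n product y = A*x:
   one multiply and one add per matrix entry. */
double matvec_flops(size_t n) {
    return 2.0 * (double)n * (double)n;
}

/* Minimum bytes moved to or from memory, assuming double precision,
   that A is streamed exactly once, and that x is read once and y is
   written once (n values each). */
double matvec_bytes(size_t n) {
    double dn = (double)n;
    return (double)sizeof(double) * (dn * dn + 2.0 * dn);
}

/* Arithmetic intensity in flops per byte; approaches 0.25 as n grows. */
double matvec_intensity(size_t n) {
    return matvec_flops(n) / matvec_bytes(n);
}

/* Lower bound, in seconds, on one product if memory bandwidth is the
   only limit. bandwidth_gbs is an assumed sustained figure in GB/s
   (hypothetical; it would have to be measured on the target machine). */
double matvec_min_seconds(size_t n, double bandwidth_gbs) {
    return matvec_bytes(n) / (bandwidth_gbs * 1e9);
}
```

<p>With about 0.25 flops per byte, the product moves far more data than it computes on. For a matrix of about 6 GB and an assumed bandwidth of 10 GB/s, one product would need at least roughly 0.6 seconds whatever the instruction-level tuning.</p>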
 
