How to accelerate matrix multiplications in Python?
I am developing a small neural network whose parameters need a lot of optimization, and therefore a lot of processing time. I profiled my script with `cProfile`: about 80% of the processor time goes to NumPy's `dot` function, and the rest to matrix inversion via `numpy.linalg.solve`. My current version of NumPy uses BLAS, or so it seems, since `numpy.core._dotblas.dot` appears as the function taking 80% of the total processing time.

As this is the core of my neural network and I have to run it many times, any minor speed gain could save me a lot of time across the numerous repeated parameter optimizations.

More details: the matrix multiplications are on matrices ranging from 100×100 up to 500×500. I have a 12-core machine and so far use the cores to run different neural network parameter optimizations in parallel, but maybe the matrix multiplication itself could be done in parallel?

Thank you for your time!

Answer:

I spent a few days testing and installing/uninstalling libraries... Here is the result of what I tested. By default, on my version of Ubuntu (12.04) with the repository-installed version of NumPy, the BLAS libraries are the ATLAS libraries. I ran some tests that reflect the improvement specifically on the computations I am interested in, so these results must not be interpreted as the final answer. The computations involve a matrix multiplication (dot product) in a loop of 55,000 iterations, with 500×500 and 1000×1000 matrices. I used an HP Z800 workstation with a Xeon X5675 @ 3.07 GHz and 12 cores. All results (percentages) compare the described condition against the reference, which here is the packaged ATLAS library.

- `scipy.sparse` module: I don't know if I set it up correctly, but at 10% sparsity this module only becomes useful starting from 1500×1500 matrices, with OpenBLAS and MKL. If you have suggestions on how to use it properly, I am interested! (A minimal setup is sketched after this list.)
- With OpenBLAS I get a speed increase of 33% for 500×500 matrices, but 160% for 1000×1000. With OpenBLAS, however, `scipy.sparse` does not perform better; it actually performs worse.
- The big winner here is the MKL libraries: the acceleration goes up to 230% over the original ATLAS libraries with 1000×1000 matrices! For 500×500 matrices the acceleration is more modest (100%) but still very good. Furthermore, when compiled with OpenMP, matrix multiplications can run on all 12 of my processors, and then they are twice as fast as on one processor with the MKL libraries. But that is a waste of processing power; it is much more efficient to use the multiprocessing module to run scripts/matrix multiplications in parallel (see the last sketch below).
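Before benchmarking alternatives, it helps to confirm which BLAS your NumPy build is actually linked against. NumPy ships `numpy.show_config()` for exactly this; a minimal check:

```python
import numpy as np

# Print the BLAS/LAPACK libraries this NumPy build is linked against;
# look for 'atlas', 'openblas', or 'mkl' in the output.
np.show_config()
```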
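For reference, here is a minimal sketch of the kind of timing loop described above. The original benchmark used 55,000 iterations; this sketch is trimmed to 100 so it runs quickly, and uses plain `time.time()` rather than a proper benchmarking harness:

```python
import time
import numpy as np

def bench_dot(n, iterations=100):
    """Time repeated n x n matrix products with np.dot."""
    a = np.random.rand(n, n)
    b = np.random.rand(n, n)
    start = time.time()
    for _ in range(iterations):
        np.dot(a, b)
    return time.time() - start

for n in (500, 1000):
    print("%dx%d: %.2f s for 100 products" % (n, n, bench_dot(n)))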
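On the `scipy.sparse` point, a minimal setup sketch, assuming "10% sparseness" means roughly 10% non-zero entries (i.e. `density=0.10`), using `scipy.sparse.random` and CSR format, which is generally the right choice for matrix products:

```python
import numpy as np
import scipy.sparse as sp

n = 1500
# ~10% non-zero entries, in CSR format for efficient products.
a = sp.random(n, n, density=0.10, format='csr')
b = sp.random(n, n, density=0.10, format='csr')

c = a @ b                   # sparse x sparse product, result stays sparse
v = a @ np.random.rand(n)   # sparse x dense vector, returns a dense ndarray
```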
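And a sketch of the multiprocessing approach recommended at the end: pin BLAS to one thread per worker so the pool's processes do not oversubscribe the 12 cores. The `optimize` function here is a hypothetical stand-in for one parameter optimization run, not the author's actual code:

```python
import os
# Pin BLAS to one thread per process *before* importing numpy, so 12
# single-threaded workers share the 12 cores without oversubscription
# (OpenBLAS honors OMP_NUM_THREADS; MKL also reads MKL_NUM_THREADS).
os.environ["OMP_NUM_THREADS"] = "1"

import numpy as np
from multiprocessing import Pool

def optimize(seed):
    # Hypothetical stand-in for one neural-network parameter
    # optimization; each worker does its own matrix products.
    rng = np.random.RandomState(seed)
    w = rng.rand(500, 500)
    return np.dot(w, w).sum()

if __name__ == "__main__":
    with Pool(processes=12) as pool:
        results = pool.map(optimize, range(12))
    print(results)
```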