<p>Instead of iterating over the coordinates in Python (GaryBishop's answer), you can have numpy do the looping, which gives a substantial speed-up (timings below):</p>
<pre><code>import numpy as np

def sparse_mult(a, b, coords):
    rows, cols = zip(*coords)
    # Deduplicate the indices; r_idx and c_idx map each coordinate back
    # to its position in the reduced product.
    rows, r_idx = np.unique(rows, return_inverse=True)
    cols, c_idx = np.unique(cols, return_inverse=True)
    # Multiply only the rows of a and the columns of b that are needed.
    C = np.dot(a[rows, :], b[:, cols])
    return C[r_idx, c_idx]

&gt;&gt;&gt; A = np.arange(12).reshape(3, 4)
&gt;&gt;&gt; B = np.arange(15).reshape(3, 5)
&gt;&gt;&gt; np.dot(A.T, B)
array([[100, 112, 124, 136, 148],
       [115, 130, 145, 160, 175],
       [130, 148, 166, 184, 202],
       [145, 166, 187, 208, 229]])
&gt;&gt;&gt; sparse_mult(A.T, B, [(0, 0), (1, 2), (2, 4), (3, 3)])
array([100, 145, 202, 208])
</code></pre>
<p><code>sparse_mult</code> returns a flat array holding the values of the product at the coordinates you provide as a list of tuples. I am not very familiar with sparse matrix formats, so I don't know how to define CSC directly from the above data, but the following works:</p>
<pre><code>&gt;&gt;&gt; from scipy import sparse
&gt;&gt;&gt; coords = [(0, 0), (1, 2), (2, 4), (3, 3)]
&gt;&gt;&gt; sparse.coo_matrix((sparse_mult(A.T, B, coords), tuple(zip(*coords)))).tocsc()
&lt;4x5 sparse matrix of type '&lt;type 'numpy.int32'&gt;'
    with 4 stored elements in Compressed Sparse Column format&gt;
</code></pre>
<p>This is a timing of the full product against <code>sparse_mult</code>:</p>
<pre><code>&gt;&gt;&gt; import timeit
&gt;&gt;&gt; a = np.random.rand(2000, 3000)
&gt;&gt;&gt; b = np.random.rand(3000, 5000)
&gt;&gt;&gt; timeit.timeit('np.dot(a,b)[[0, 0, 1999, 1999], [0, 4999, 0, 4999]]',
...               'from __main__ import np, a, b', number=1)
5.848562187263569
&gt;&gt;&gt; timeit.timeit('sparse_mult(a, b, [(0, 0), (0, 4999), (1999, 0), (1999, 4999)])',
...               'from __main__ import np, a, b, sparse_mult', number=1)
0.0018596387374678613
&gt;&gt;&gt; np.dot(a,b)[[0, 0, 1999, 1999], [0, 4999, 0, 4999]]
array([ 758.76351111, 750.32613815, 751.4614542 , 758.8989648 ])
&gt;&gt;&gt; sparse_mult(a, b, [(0, 0), (0, 4999), (1999, 0), (1999, 4999)])
array([ 758.76351111, 750.32613815, 751.4614542 , 758.8989648 ])
</code></pre>
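<p>One caveat with the approach above: <code>sparse_mult</code> computes a block with one entry per (unique row, unique column) combination, so when the coordinates share few rows or columns most of that block is thrown away. A minimal sketch of a variant that computes exactly one dot product per coordinate is shown here. It assumes a non-empty coordinate list, and <code>sparse_mult_pairs</code> is a hypothetical name used only for illustration; it has not been timed against the version above.</p>

```python
import numpy as np

def sparse_mult_pairs(a, b, coords):
    """Return (a @ b)[i, j] for each (i, j) in coords.

    Hypothetical helper: assumes coords is a non-empty list of (row, col) tuples.
    """
    rows, cols = (np.asarray(idx) for idx in zip(*coords))
    # a[rows] has shape (n, k) and b[:, cols].T has shape (n, k);
    # einsum multiplies matching rows elementwise and sums over k,
    # giving one dot product per requested coordinate.
    return np.einsum('nk,nk->n', a[rows], b[:, cols].T)

A = np.arange(12).reshape(3, 4)
B = np.arange(15).reshape(3, 5)
coords = [(0, 0), (1, 2), (2, 4), (3, 3)]
result = sparse_mult_pairs(A.T, B, coords)
print(result)  # [100 145 202 208]
```

<p>Which variant is faster probably depends on how much the coordinates overlap: the block product reuses work when many coordinates share rows or columns, while the per-pair version avoids computing entries nobody asked for.</p>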