<p>Use a <code>scipy.sparse</code> format that is row or column based: <code>csc_matrix</code> and <code>csr_matrix</code>.</p>
<p>These use efficient C implementations under the hood (including for multiplication), and transposition is essentially free (especially if you call <code>transpose(copy=False)</code>), just as with numpy arrays.</p>
<p>EDIT: some timings via <a href="http://ipython.scipy.org/moin/">ipython</a>:</p>
<pre><code>import numpy, scipy.sparse
n = 100000
x = (numpy.random.rand(n) * 2).astype(int).astype(float) # 50% sparse vector
x_csr = scipy.sparse.csr_matrix(x)
x_dok = scipy.sparse.dok_matrix(x.reshape(x_csr.shape))
</code></pre>
<p>Now <code>x_csr</code> and <code>x_dok</code> are 50% sparse:</p>
<pre><code>print repr(x_csr)
&lt;1x100000 sparse matrix of type '&lt;type 'numpy.float64'&gt;'
        with 49757 stored elements in Compressed Sparse Row format&gt;
</code></pre>
<p>And the timings:</p>
<pre><code>timeit numpy.dot(x, x)
10000 loops, best of 3: 123 us per loop

timeit x_dok * x_dok.T
1 loops, best of 3: 1.73 s per loop

timeit x_csr.multiply(x_csr).sum()
1000 loops, best of 3: 1.64 ms per loop

timeit x_csr * x_csr.T
100 loops, best of 3: 3.62 ms per loop
</code></pre>
<p>So it looks like I told a lie. Transposition <strong>is</strong> very cheap, but there is no efficient C implementation of csr * csc (as of scipy 0.9.0). A new csr object is constructed in each call :-(</p>
<p>As a hack (though scipy is relatively stable these days), you can do the dot product directly on the sparse data:</p>
<pre><code>timeit numpy.dot(x_csr.data, x_csr.data)
10000 loops, best of 3: 62.9 us per loop
</code></pre>
<p>Note that this last approach is again a dense numpy multiplication, just over the stored nonzero values only. Since the vector is 50% sparse, it is about twice as fast as <code>dot(x, x)</code>. The trick works here because the vector is multiplied with itself; two different sparse vectors generally store different index sets, so their <code>data</code> arrays would not line up.</p>
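<p>For readers on a current Python 3 / scipy install, here is a minimal sketch that checks the four approaches above against each other. The helper name <code>self_dot</code> and the fixed random seed are my own additions for illustration, not part of the original timings, and the helper assumes the matrix has no duplicate stored entries (true for a matrix built directly from a dense array):</p>

```python
import numpy as np
import scipy.sparse as sp


def self_dot(row_csr):
    """Dot product of a 1xN CSR row vector with itself.

    Hypothetical helper: it only touches the stored values, so it is
    valid for a vector dotted with itself, not for two different vectors.
    Assumes no duplicate stored entries.
    """
    return float(np.dot(row_csr.data, row_csr.data))


rng = np.random.default_rng(0)  # fixed seed, chosen only for repeatability
n = 100_000
x = (rng.random(n) * 2).astype(int).astype(float)  # roughly 50% zeros
x_csr = sp.csr_matrix(x)  # shape (1, n)

dense = float(np.dot(x, x))                             # plain numpy
via_data = self_dot(x_csr)                              # the "hack"
via_multiply = float(x_csr.multiply(x_csr).sum())       # elementwise, then sum
via_product = float((x_csr @ x_csr.T).toarray()[0, 0])  # csr times csc

print(dense, via_data, via_multiply, via_product)
```

<p>All four values should agree; which one is fastest on a given scipy version is something to measure with <code>timeit</code> rather than assume from the numbers above.</p>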
 
