StackOverflow2013

<p>Like many benchmarks, this depends on the particulars of the situation. By default, numpy creates arrays in C-contiguous (row-major) order, so each row is contiguous in memory. In the abstract, operations that scan along a row, stepping across the columns (<code>sum(axis=1)</code>), should therefore be faster than those that scan down a column, stepping across the rows (<code>sum(axis=0)</code>). However, the shape of the array, the performance of the ALU, and the processor's cache have a huge impact on the actual timings.</p>

<p>For instance, on my MacBook Pro, with a small array the two axes take similar times for either an integer or a float type, but the small integer type is significantly slower than the float type:</p>

<pre><code>>>> x = numpy.ones((100, 100), dtype=numpy.uint8)
>>> %timeit x.sum(axis=0)
10000 loops, best of 3: 40.6 us per loop
>>> %timeit x.sum(axis=1)
10000 loops, best of 3: 36.1 us per loop
>>> x = numpy.ones((100, 100), dtype=numpy.float64)
>>> %timeit x.sum(axis=0)
10000 loops, best of 3: 28.8 us per loop
>>> %timeit x.sum(axis=1)
10000 loops, best of 3: 28.8 us per loop
</code></pre>

<p>With larger arrays the absolute differences become larger but, at least on my machine, they are still smaller for the larger datatype:</p>

<pre><code>>>> x = numpy.ones((1000, 1000), dtype=numpy.uint8)
>>> %timeit x.sum(axis=0)
100 loops, best of 3: 2.36 ms per loop
>>> %timeit x.sum(axis=1)
1000 loops, best of 3: 1.9 ms per loop
>>> x = numpy.ones((1000, 1000), dtype=numpy.float64)
>>> %timeit x.sum(axis=0)
100 loops, best of 3: 2.04 ms per loop
>>> %timeit x.sum(axis=1)
1000 loops, best of 3: 1.89 ms per loop
</code></pre>

<p>You can tell numpy to create a Fortran-contiguous (column-major) array using the <code>order='F'</code> keyword argument to <code>numpy.asarray</code>, <code>numpy.ones</code>, <code>numpy.zeros</code>, and the like, or by converting an existing array using <code>numpy.asfortranarray</code>. As expected, this ordering swaps the efficiency of the row and column operations:</p>

<pre><code>In [10]: y = numpy.asfortranarray(x)
In [11]: %timeit y.sum(axis=0)
1000 loops, best of 3: 1.89 ms per loop
In [12]: %timeit y.sum(axis=1)
100 loops, best of 3: 2.01 ms per loop
</code></pre>
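
<p>If you want to repeat the comparison outside IPython, the following is a minimal sketch using the standard <code>timeit</code> module. The helper names <code>layout_of</code> and <code>time_axis_sums</code> are hypothetical, invented for this illustration, and the default shape and repeat counts are arbitrary assumptions; the numbers you get will depend on your machine and numpy version.</p>

```python
import timeit

import numpy as np


def layout_of(arr):
    """Describe the memory layout of an array using its flags (hypothetical helper)."""
    c = arr.flags["C_CONTIGUOUS"]
    f = arr.flags["F_CONTIGUOUS"]
    if c and f:
        return "both"  # e.g. 1-D arrays or arrays with a length-1 axis
    if c:
        return "C"
    if f:
        return "F"
    return "neither"


def time_axis_sums(shape=(1000, 1000), dtype=np.float64, repeats=5, number=20):
    """Time sum(axis=0) and sum(axis=1) for C- and Fortran-ordered copies.

    Returns a dict mapping (layout, axis) to the best per-call time in seconds.
    """
    c_arr = np.ones(shape, dtype=dtype)   # C-contiguous by default
    f_arr = np.asfortranarray(c_arr)      # same values, column-major layout
    results = {}
    for name, arr in (("C", c_arr), ("F", f_arr)):
        for axis in (0, 1):
            timer = timeit.Timer(lambda a=arr, ax=axis: a.sum(axis=ax))
            # Take the minimum over several repeats to reduce noise.
            best = min(timer.repeat(repeat=repeats, number=number)) / number
            results[(name, axis)] = best
    return results


if __name__ == "__main__":
    for (layout, axis), seconds in sorted(time_axis_sums().items()):
        print(f"order={layout} axis={axis}: {seconds * 1e3:.3f} ms per call")
```

<p>Checking <code>layout_of(x)</code> before timing is a cheap way to confirm which ordering you are actually measuring, since slicing or transposing an array can change its layout without copying it.</p>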
