Numpy vs Cython speed

I have an analysis code that does some heavy numerical operations using numpy. Just out of curiosity, I tried to compile it with Cython with minor changes, and then I rewrote the numpy part using explicit loops.

To my surprise, the code based on loops was much faster (8x). I cannot post the complete code, but I put together a very simple unrelated computation that shows similar behavior (albeit the timing difference is not as big):

Version 1 (without Cython):

```python
import numpy as np

def _process(array):
    rows = array.shape[0]
    cols = array.shape[1]
    out = np.zeros((rows, cols))
    for row in range(0, rows):
        out[row, :] = np.sum(array - array[row, :], axis=0)
    return out

def main():
    data = np.load('data.npy')
    out = _process(data)
    np.save('vianumpy.npy', out)
```

Version 2 (built as a Cython module, keeping the numpy call):

```cython
import cython
cimport cython
import numpy as np
cimport numpy as np

DTYPE = np.float64
ctypedef np.float64_t DTYPE_t

@cython.boundscheck(False)
@cython.wraparound(False)
@cython.nonecheck(False)
cdef _process(np.ndarray[DTYPE_t, ndim=2] array):
    cdef unsigned int rows = array.shape[0]
    cdef unsigned int cols = array.shape[1]
    cdef unsigned int row
    cdef np.ndarray[DTYPE_t, ndim=2] out = np.zeros((rows, cols))
    for row in range(0, rows):
        out[row, :] = np.sum(array - array[row, :], axis=0)
    return out

def main():
    cdef np.ndarray[DTYPE_t, ndim=2] data
    cdef np.ndarray[DTYPE_t, ndim=2] out
    data = np.load('data.npy')
    out = _process(data)
    np.save('viacynpy.npy', out)
```

Version 3 (built as a Cython module, with the numpy call replaced by explicit loops):

```cython
import cython
cimport cython
import numpy as np
cimport numpy as np

DTYPE = np.float64
ctypedef np.float64_t DTYPE_t

@cython.boundscheck(False)
@cython.wraparound(False)
@cython.nonecheck(False)
cdef _process(np.ndarray[DTYPE_t, ndim=2] array):
    cdef unsigned int rows = array.shape[0]
    cdef unsigned int cols = array.shape[1]
    cdef unsigned int row
    cdef np.ndarray[DTYPE_t, ndim=2] out = np.zeros((rows, cols))
    for row in range(0, rows):
        for col in range(0, cols):
            for row2 in range(0, rows):
                out[row, col] += array[row2, col] - array[row, col]
    return out

def main():
    cdef np.ndarray[DTYPE_t, ndim=2] data
    cdef np.ndarray[DTYPE_t, ndim=2] out
    data = np.load('data.npy')
    out = _process(data)
    np.save('vialoop.npy', out)
```

With a 10000x10 matrix saved in data.npy, the times are:

```
$ python -m timeit -c "from version1 import main;main()"
10 loops, best of 3: 4.56 sec per loop

$ python -m timeit -c "from version2 import main;main()"
10 loops, best of 3: 4.57 sec per loop

$ python -m timeit -c "from version3 import main;main()"
10 loops, best of 3: 2.96 sec per loop
```

Is this expected, or is there an optimization that I am missing? The fact that versions 1 and 2 give roughly the same time is somewhat expected, but why is version 3 faster?

P.S. This is NOT the calculation that I need to make, just a simple example that shows the same behavior.
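The post does not include a build script for versions 2 and 3. For completeness, a minimal sketch of one way such modules are typically compiled is shown below, using the standard distutils + `cythonize` pattern; the file names `version2.pyx` and `version3.pyx` are assumptions here, not something stated in the original post.

```python
# setup.py -- minimal build sketch (assumed file names, not from the original post).
# Compiles the two Cython sources into importable extension modules.
import numpy as np
from distutils.core import setup
from Cython.Build import cythonize

setup(
    ext_modules=cythonize(["version2.pyx", "version3.pyx"]),
    include_dirs=[np.get_include()],  # NumPy headers required by "cimport numpy"
)
```

Running `python setup.py build_ext --inplace` would make `version2` and `version3` importable next to `version1.py`. The timings above assume a 10000x10 array stored in `data.npy`; something like `np.save('data.npy', np.random.random((10000, 10)))` would produce an input of that shape, although the actual contents of the original file are not given.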