Note that there are some explanatory texts on larger screens.

plurals
  1. POHow can I vectorize this python count sort so it is absolutely as fast as it can be?
    text
    copied!<p>I am trying to write a count sort in python to beat the built-in timsort in certain situations. Right now it beats the built in sorted function, but only for very large arrays (1 million integers in length and longer, I haven't tried over 10 million) and only for a range no larger than 10,000. Additionally, the victory is narrow, with count sort only winning by a significant margin in random lists specifically tailored to it.</p> <p>I have read about astounding performance gains that can be gained from vectorizing python code, but I don't particularly understand how to do it or how it could be used here. I would like to know how I can vectorize this code to speed it up, and any other performance suggestions are welcome.</p> <p>Current fastest version for just python and stdlibs:</p> <pre><code>from itertools import chain, repeat def untimed_countsort(unsorted_list): counts = {} for num in unsorted_list: try: counts[num] += 1 except KeyError: counts[num] = 1 sorted_list = list( chain.from_iterable( repeat(num, counts[num]) for num in xrange(min(counts), max(counts) + 1))) return sorted_list </code></pre> <ul> <li><p>All that counts is raw speed here, so sacrificing even more space for speed gains is completely fair game.</p></li> <li><p>I realize the code is fairly short and clear already, so I don't know how much room there is for improvement in speed.</p></li> <li><p>If anyone has a change to the code to make it shorter, as long as it doesn't make it slower, that would be awesome as well.</p></li> <li><p>Execution time is down almost 80%! Now three times as fast as Timsort on my current tests!</p></li> </ul> <p>The absolute fastest way to do this by a LONG shot is using this one-liner with numpy:</p> <pre><code>def np_sort(unsorted_np_array): return numpy.repeat(numpy.arange(1+unsorted_np_array.max()), numpy.bincount(unsorted_np_array)) </code></pre> <p>This runs about 10-15 times faster than the pure python version, and about 40 times faster than Timsort. It takes a numpy array in and outputs a numpy array.</p>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload