
Most efficient sorting algorithm for many identical keys?
<p>What is the most efficient algorithm for grouping identical items together in an array, given the following:</p> <ol> <li>Almost all items are duplicated several times.</li> <li>The items are not necessarily integers or anything else that's similarly simple. The range of the keys is not even well-defined, let alone small. In fact, the keys can be arbitrary structs. This rules out the simplest forms of counting sort.</li> <li>We care about both asymptotic and non-asymptotic properties, and n may sometimes be small. However, when n is small, performance is still important because this function may be called several million times in a loop on millions of small datasets. This rules out any expensive hash function or using a complex data structure that needs to perform lots of memory allocations.</li> <li>The data may be sorted in arbitrary order as long as all identical items are grouped together.</li> </ol> <p>If this is confusing, here's an example, assuming such a function is named groupIdentical:</p> <pre><code>uint[] foo = [1,2,3,2,1,5,4,5];
uint[] bar = groupIdentical(foo);
// One possible correct value for bar:
// bar == [2,2,1,1,3,4,5,5].
// Another possible correct answer:
// bar == [1,1,2,2,5,5,4,3].
</code></pre> <p>However, as a reminder, we cannot assume that the data is composed of integers.</p> <p>Edit: Thank you for the answers. My main problem with hashing was that hash tables perform memory allocations too frequently. What I ended up doing was writing my own hash table that uses a region allocator I already had around, which solves this problem. It works well.</p>
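The grouping the question asks for can be sketched with a single hash-based pass. This is a minimal illustration only, not the asker's actual solution; it assumes the keys are hashable, and `group_identical` is a hypothetical name mirroring the question's `groupIdentical`:

```python
from collections import defaultdict

def group_identical(items):
    # Bucket equal items together via a hash map; the order in which
    # the groups appear in the output is arbitrary, which the question
    # explicitly permits.
    groups = defaultdict(list)
    for x in items:
        groups[x].append(x)
    result = []
    for bucket in groups.values():
        result.extend(bucket)
    return result
```

For arbitrary structs this still requires a sufficiently cheap hash function, and avoiding per-insert heap allocations (the asker's main complaint) would call for a custom table backed by a region allocator, as the question's edit describes.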