Sorting millions of entries might sound like a lot of work, but it clearly isn't. Sorting 10^6 completely random entries takes about 3 seconds on my computer (just an older EeePC with an Atom CPU (first generation, I think), 1.6 GHz).

And with a good sorting algorithm, sorting is O(n*log(n)) even in the worst case, so it won't really matter whether you have 10^9 entries or more. Most of the time the rank list will already be nearly sorted (from a previous ranking), which brings the runtime closer to O(n).

So, stop worrying about it! The only real problem is that most DBMSs cannot directly access the 1000th entry. A query like `SELECT ... LIMIT 1000, 5` has to fetch at least 1005 entries and skip the first 1000. But the solution here is simple, too. Just store the `rank` as a redundant column of each row, add an index to it, and recompute it every 15 minutes (or every 5 minutes, 30 minutes, 1 hour, or whatever makes sense for your application). With that, all queries by rank become simple secondary-index lookups (about O(log(n))), which are extremely fast and take only a few milliseconds per query (the network is the bottleneck here, not the database).

PS: You commented on another answer that you cannot cache the sorted entries because they are too large for your memory. Assuming you just cache (user_id, rank) tuples of two 64-bit integers, that is 16 bytes per entry, i.e. about 16 MB for 10^6 entries (and only 8 MB with 32-bit integers, which would be more than enough). Are you sure you do not have enough RAM for that?

So, please do not try to optimize something which is clearly not a bottleneck (yet)...
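
As a rough sketch of the redundant-rank approach described above (not part of the original answer): the `players` table, its `id` and `score` columns, and the index name are hypothetical placeholders, and the syntax is PostgreSQL-flavored.

```sql
-- One-time setup: add the redundant rank column and index it.
ALTER TABLE players ADD COLUMN rank INT;
CREATE INDEX idx_players_rank ON players (rank);

-- Recompute periodically (e.g. every 15 minutes via cron or a scheduler).
UPDATE players
SET rank = ranked.new_rank
FROM (
    SELECT id, ROW_NUMBER() OVER (ORDER BY score DESC) AS new_rank
    FROM players
) AS ranked
WHERE players.id = ranked.id;

-- Fetching ranks 1001-1005 is now a plain index range scan
-- instead of a LIMIT/OFFSET query that skips 1000 rows.
SELECT * FROM players WHERE rank BETWEEN 1001 AND 1005 ORDER BY rank;
```

The `UPDATE ... FROM` form is PostgreSQL-specific; MySQL would need an `UPDATE` with a `JOIN` instead, but the idea of a periodically recomputed, indexed rank column stays the same.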
 
