Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    text
    copied!<p>First note that the problem is at least as hard as the <a href="http://en.wikipedia.org/wiki/Element_distinctness_problem" rel="nofollow">element distinctness problem</a>.</p> <p>Thus, the solutions should follow the same approaches:</p> <ol> <li>sort (using <a href="http://en.wikipedia.org/wiki/External_sorting" rel="nofollow">external sort</a>) and iterate while counting occurances for each number and looking for the maximal.</li> <li>Hashing solution: hash the numbers into buckets that fit in memory (note that all occurances of the same number will be hashed to the same bucket), for each bucket - find the most frequent number, and store it. Then go through all candidates from all buckets and chose the best. <br>In here, you can either sort (in memory) each bucket and find the most frequent number or you can create a <a href="http://en.wikipedia.org/wiki/Histogram" rel="nofollow">histogram</a> (using a hash map, with a different hash function) to find the frequency of each item in the bucket. <br>Note that the buckets are written on disk, and loaded into memory one after the other, at each time only a small part of the data is stored on RAM.</li> </ol> <p>Another more scalable approach could be using <a href="http://en.wikipedia.org/wiki/MapReduce" rel="nofollow">map-reduce</a>, with a simple map-reduce step to count number of occurances per number, and then just find maximum of those:</p> <pre><code>map(number): emit(number,'1') reduce(number,list): emit (number, size(list)) </code></pre> <p>all is left is to find the number with the highest value - which can be done in linear scan.</p>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload