Note that there are some explanatory texts on larger screens.

plurals
  1. POLogLog and HyperLogLog algorithms for counting of large cardinalities
    primarykey
    data
    text
    <p>Where can I find a valid implementation of <a href="http://algo.inria.fr/flajolet/Publications/DuFl03.pdf" rel="nofollow noreferrer">LogLog algorithm</a>? Have tried to implement it by myself but my draft implementation yields strange results.</p> <p><a href="http://jsfiddle.net/LCDLL/1/" rel="nofollow noreferrer">Here</a> it is:</p> <pre><code>function LogLog(max_error, max_count) { function log2(x) { return Math.log(x) / Math.LN2; } var m = 1.30 / max_error; var k = Math.ceil(log2(m * m)); m = Math.pow(2, k); var k_comp = 32 - k; var l = log2(log2(max_count / m)); if (isNaN(l)) l = 1; else l = Math.ceil(l); var l_mask = ((1 &lt;&lt; l) - 1) &gt;&gt;&gt; 0; var M = []; for (var i = 0; i &lt; m; ++i) M[i] = 0; function count(hash) { if (hash !== undefined) { var j = hash &gt;&gt;&gt; k_comp; var rank = 0; for (var i = 0; i &lt; k_comp; ++i) { if ((hash &gt;&gt;&gt; i) &amp; 1) { rank = i + 1; break; } } M[j] = Math.max(M[j], rank &amp; l_mask); } else { var c = 0; for (var i = 0; i &lt; m; ++i) c += M[i]; return 0.79402 * m * Math.pow(2, c / m); } } return {count: count}; } function fnv1a(text) { var hash = 2166136261; for (var i = 0; i &lt; text.length; ++i) { hash ^= text.charCodeAt(i); hash += (hash &lt;&lt; 1) + (hash &lt;&lt; 4) + (hash &lt;&lt; 7) + (hash &lt;&lt; 8) + (hash &lt;&lt; 24); } return hash &gt;&gt;&gt; 0; } var words = ['aardvark', 'abyssinian', ... ,'zoology']; // about 2 300 words var log_log = LogLog(0.01, 100000); for (var i = 0; i &lt; words.length; ++i) log_log.count(fnv1a(words[i])); alert(log_log.count()); </code></pre> <p>For unknown reason implementation is very sensitive to <code>max_error</code> parameter, it is the main factor that determines the magnitude of the result. I'm sure, there is some stupid mistake :)</p> <p><strong>UPDATE:</strong> This problem is solved in the <a href="http://algo.inria.fr/flajolet/Publications/FlFuGaMe07.pdf" rel="nofollow noreferrer">newer version</a> of algorithm. I will post its implementation later.</p>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload