<p>Given that you are writing each 8-byte double with 11-12 characters, the overall memory you need should be around 450MB, which means you have roughly 50,000,000 items.</p>
<p>Sorting 50 million values shouldn't take long. What will take long is your <code>for</code> loop, which scans the whole file for every item.</p>
<p>A more efficient way would be to sort the file but keep the duplicate values. Then all you need is one pass over the file, grouping similar values (or equal values, depending on the precision of your histogram) and replacing each group with a value-count pair.</p>
<p>For example, if you have the following file:</p>
<pre><code>1
0.6
-2
0
-1
-0.6
0
0
3
</code></pre>
<p>After sorting you will get:</p>
<pre><code>-2
-1
-0.6
0
0
0
0.6
1
3
</code></pre>
<p>And if you follow this algorithm:</p>
<pre><code>current_bucket = first value in file, floored to histogram_precision
bucket_count = 0
for all values v
    ; write current bucket + additional empty buckets
    while v &gt;= current_bucket + histogram_precision
        output current_bucket bucket_count
        current_bucket += histogram_precision
        bucket_count = 0
    ; add v to current_bucket
    bucket_count += 1
; write the last bucket
output current_bucket bucket_count
</code></pre>
<p>given <code>histogram_precision</code> as 1 for example, you will get:</p>
<pre><code>-2 1
-1 2
0 4
1 1
2 0
3 1
</code></pre>
<p>where each line <code>num count</code> shows the number of values (<code>count</code>) in the range <code>[num, num+histogram_precision)</code>.</p>
<p>You may want to use buckets like <code>[0.5, 1.5)</code> instead of <code>[1, 2)</code>, in which case you should just tweak the first line where the initial bucket is computed, or alternatively change the condition of the <code>while</code> loop to <code>v &gt;= current_bucket + histogram_precision / 2</code>.</p>
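<p>As a rough illustration, the single pass described above could be sketched in Python as follows. This assumes the values are already sorted in ascending order; <code>bucket_counts</code> is a hypothetical name, not something from the question. The sketch tracks an integer bucket index instead of repeatedly adding <code>histogram_precision</code>, which is one way to limit floating-point drift, and it emits the final bucket after the loop ends.</p>

```python
import math
from typing import Iterable, Iterator, Tuple


def bucket_counts(sorted_values: Iterable[float],
                  precision: float) -> Iterator[Tuple[float, int]]:
    """Yield (bucket_start, count) for values sorted in ascending order.

    Each bucket covers the half-open range
    [bucket_start, bucket_start + precision).
    """
    it = iter(sorted_values)
    try:
        first = next(it)
    except StopIteration:
        return  # empty input: no buckets at all

    # Floor the first value to a bucket boundary, stored as an integer index.
    index = math.floor(first / precision)
    count = 1
    for v in it:
        # Emit the current bucket, plus any empty ones, until v fits.
        while v >= (index + 1) * precision:
            yield index * precision, count
            index += 1
            count = 0
        count += 1
    # The last bucket is still pending when the input runs out.
    yield index * precision, count


if __name__ == "__main__":
    values = [1, 0.6, -2, 0, -1, -0.6, 0, 0, 3]
    for bucket, count in bucket_counts(sorted(values), 1):
        print(bucket, count)
```

<p>Because the function only needs an iterator, it could just as well be fed a generator that reads an already sorted file line by line, so the full data set never has to sit in memory at once.</p>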