Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    primarykey
    data
    text
    <p>The question cannot easily be answered for the simple reason that the performance of one algorithm relative to another depends as much the on compiler / processor / data structure combination as on the algorithm itself, as you surely know</p> <p>Therefore your approach to try a couple of them seems good enough. And yes, quicksort should be pretty fast. If you haven't done so, you might want to try insertionsort which often performs better on small data sets. This said, just settle on a sorting algo that does the job fast enough. You will typically not get 10-times faster just be picking the "right" algo.</p> <p>To get substantial speed-ups, the better way frequently is to use more structure. Some ideas that worked for me in the past with large-scale problems:</p> <ul> <li><p>Can you efficiently pre-calculate while creating the voxels and store 28 instead of 27 floats?</p></li> <li><p>Is an approximate solution good enough? If so, just look at the median of, say 9 values, since "in general it can be expected that values are relatively close." Or you can replace it with the average as long as the values are relatively close.</p></li> <li><p>Do you really need the median for all billions of voxels? Maybe you have an easy test whether you need the median, and can then only calculate for the relevant sub-set. </p></li> <li><p>If nothing else helps: look at the asm code that the compiler generates. You might be able write asm code that is substantially faster (e.g. by doing all the calcs using registers).</p></li> </ul> <p><strong>Edit:</strong> For what it's worth, I have attached the (partial) insertionsort code mentioned in the comment below (totally untested). If <code>numbers[]</code> is an array of size <code>N</code>, and you want the smallest <code>P</code> floats sorted at the beginning of the array, call <code>partial_insertionsort&lt;N, P, float&gt;(numbers);</code>. Hence if you call <code>partial_insertionsort&lt;27, 13, float&gt;(numbers);</code>, <code>numbers[13]</code> will contain the median. To gain additional speed, you would have to unfold the while loop, too. As discussed above, to get really fast, you have to use your knowledge about the data (e.g. is the data already partially sorted? Do you know properties of the distribution of the data? I guess, you get the drift).</p> <pre><code>template &lt;long i&gt; class Tag{}; template&lt;long i, long N, long P, typename T&gt; inline void partial_insertionsort_for(T a[], Tag&lt;N&gt;, Tag&lt;i&gt;) { long j = i &lt;= P+1 ? i : P+1; // partial sort T temp = a[i]; a[i] = a[j]; // compiler should optimize this away where possible while(temp &lt; a[j - 1] &amp;&amp; j &gt; 0) { a[j] = a[j - 1]; j--;} a[j] = temp; partial_insertionsort_for&lt;i+1,N,P,T&gt;(a,Tag&lt;N&gt;(),Tag&lt;i+1&gt;());} template&lt;long i, long N, long P, typename T&gt; inline void partial_insertionsort_for(T a[], Tag&lt;N&gt;, Tag&lt;N&gt;){} template &lt;long N, long P, typename T&gt; inline void partial_insertionsort(T a[]) {partial_insertionsort_for&lt;0,N,P,T&gt;(a, Tag&lt;N&gt;(), Tag&lt;0&gt;());} </code></pre>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. VO
      singulars
      1. This table or related slice is empty.
    2. VO
      singulars
      1. This table or related slice is empty.
    3. VO
      singulars
      1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload