Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    primarykey
    data
    text
    <p>If exact accuracy isn't required, I have a fast approximation for you, that uses 260bytes of ram (you could halve that, but don't).</p> <pre><code>int ftbl[33]={0,1,1,2,2,4,5,8,11,16,22,32,45,64,90,128,181,256,362,512,724,1024,1448,2048,2896,4096,5792,8192,11585,16384,23170,32768,46340}; int ftbl2[32]={ 32768,33276,33776,34269,34755,35235,35708,36174,36635,37090,37540,37984,38423,38858,39287,39712,40132,40548,40960,41367,41771,42170,42566,42959,43347,43733,44115,44493,44869,45241,45611,45977}; int fisqrt(int val) { int cnt=0; int t=val; while (t) {cnt++;t&gt;&gt;=1;} if (6&gt;=cnt) t=(val&lt;&lt;(6-cnt)); else t=(val&gt;&gt;(cnt-6)); return (ftbl[cnt]*ftbl2[t&amp;31])&gt;&gt;15; } </code></pre> <p>Here's the code to generate the tables:</p> <pre><code>ftbl[0]=0; for (int i=0;i&lt;32;i++) ftbl[i+1]=sqrt(pow(2.0,i)); printf("int ftbl[33]={0"); for (int i=0;i&lt;32;i++) printf(",%d",ftbl[i+1]); printf("};\n"); for (int i=0;i&lt;32;i++) ftbl2[i]=sqrt(1.0+i/32.0)*32768; printf("int ftbl2[32]={"); for (int i=0;i&lt;32;i++) printf("%c%d",(i)?',':' ',ftbl2[i]); printf("};\n"); </code></pre> <p>Over the range 1->2^20, the maximum error is 11, and over the range 1->2^30, it's about 256. You could use larger tables and minimise this. It's worth mentioning that the error will always be negative - i.e. when it's wrong, the value will be LESS than the correct value.</p> <p>You might do well to follow this with a refining stage.</p> <p>The idea is simple enough: (ab)^0.5 = a^0.b * b^0.5.</p> <p>So, we take the input X = A*B where A=2^N and 1&lt;=B&lt;2 Then we have a lookuptable for sqrt(2^N), and a lookuptable for sqrt(1&lt;=B&lt;2). We store the lookuptable for sqrt(2^N) as integer, which might be a mistake (testing shows no ill effects), and we store the lookuptable for sqrt(1&lt;=B&lt;2) at 15bits of fixed-point.</p> <p>We know that 1&lt;=sqrt(2^N)&lt;65536, so that's 16bit, and we know that we can really only multiply 16bitx15bit on an ARM, without fear of reprisal, so that's what we do.</p> <p>In terms of implementation, the while(t) {cnt++;t>>=1;} is effectively a count-leading-bits instruction (CLB), so if your version of the chipset has that, you're winning! Also, the shift instruction would be easy to implement with a bidirectional shifter, if you have one? There's a Lg[N] algorithm for counting the highest set bit <a href="http://graphics.stanford.edu/~seander/bithacks.html#IntegerLog" rel="noreferrer">here.</a></p> <p>In terms of magic numbers, for changing table sizes, THE magic number for ftbl2 is 32, though note that <em>6</em> (Lg[32]+1) is used for the shifting.</p>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. VO
      singulars
      1. This table or related slice is empty.
    2. VO
      singulars
      1. This table or related slice is empty.
    3. VO
      singulars
      1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload