Note that there are some explanatory texts on larger screens.

plurals
  1. POHow to make my Haskell program faster? Comparison with C
    primarykey
    data
    text
    <p>I'm working on an implementation of one of the SHA3 candidates, JH. I'm at the point where the algorithm pass all KATs (Known Answer Tests) provided by NIST, and have also made it an instance of the Crypto-API. Thus I have began looking into its performance. But I'm quite new to Haskell and don't really know what to look for when profiling. </p> <p>At the moment my code is consistently slower then the reference implementation written in C, by a factor of 10 for all input lengths (C code found here: <a href="http://www3.ntu.edu.sg/home/wuhj/research/jh/jh_bitslice_ref64.h" rel="nofollow noreferrer">http://www3.ntu.edu.sg/home/wuhj/research/jh/jh_bitslice_ref64.h</a>). </p> <p>My Haskell code is found here: <a href="https://github.com/hakoja/SHA3/blob/master/Data/Digest/JHInternal.hs" rel="nofollow noreferrer">https://github.com/hakoja/SHA3/blob/master/Data/Digest/JHInternal.hs</a>. </p> <p>Now I don't expect you to wade through all my code, rather I would just want some tips on a couple of functions. I have run some performance tests and this is (part of) the performance file generated by GHC: </p> <pre><code>Tue Oct 25 19:01 2011 Time and Allocation Profiling Report (Final) main +RTS -sstderr -p -hc -RTS jh e False total time = 6.56 secs (328 ticks @ 20 ms) total alloc = 4,086,951,472 bytes (excludes profiling overheads) COST CENTRE MODULE %time %alloc roundFunction Data.Digest.JHInternal 28.4 37.4 word128Shift Data.BigWord.Word128 14.9 19.7 blockMap Data.Digest.JHInternal 11.9 12.9 getBytes Data.Serialize.Get 6.7 2.4 unGet Data.Serialize.Get 5.5 1.3 sbox Data.Digest.JHInternal 4.0 7.4 getWord64be Data.Serialize.Get 3.7 1.6 e8 Data.Digest.JHInternal 3.7 0.0 swap4 Data.Digest.JHInternal 3.0 0.7 swap16 Data.Digest.JHInternal 3.0 0.7 swap8 Data.Digest.JHInternal 1.8 0.7 swap32 Data.Digest.JHInternal 1.8 0.7 parseBlock Data.Digest.JHInternal 1.8 1.2 swap2 Data.Digest.JHInternal 1.5 0.7 swap1 Data.Digest.JHInternal 1.5 0.7 linearTransform Data.Digest.JHInternal 1.5 8.6 shiftl_w64 Data.Serialize.Get 1.2 1.1 Detailed breakdown omitted ... </code></pre> <p>Now quickly about the JH algorithm: </p> <p>It's a hash algorithm which consists of a compression function F8, which is repeated as long as there exists input blocks (of length 512 bits). This is just how the SHA-functions operate. The F8 function consists of the E8 function which applies a round function 42 times. The round function itself consists of three parts: a sbox, a linear transformation and a permutation (called swap in my code).</p> <p>Thus it's reasonable that most of the time is spent in the round function. Still I would like to know how those parts could be improved. For instance: the blockMap function is just a utility function, mapping a function over the elements in a 4-tuple. So why is it performing so badly? Any suggestions would be welcome, and not just on single functions, i.e. are there structural changes you would have done in order to improve the performance?</p> <p>I have tried looking at the Core output, but unfortunately that's way over my head. </p> <p>I attach some of the heap profiles at the end as well in case that could be of interest.</p> <p><strong>EDIT :</strong></p> <p>I forgot to mention my setup and build. I run it on a x86_64 Arch Linux machine, GHC 7.0.3-2 (I think), with compile options:</p> <blockquote> <p>ghc --make -O2 -funbox-strict-fields</p> </blockquote> <p>Unfortunately <a href="http://hackage.haskell.org/trac/ghc/ticket/5043" rel="nofollow noreferrer">there seems to be a bug</a> on the Linux plattform when compiling via C or LLVM, giving me the error:</p> <blockquote> <p>Error: .size expression for XXXX does not evaluate to a constant</p> </blockquote> <p>so I have not been able to see the effect of that.</p> <p><img src="https://i.stack.imgur.com/LiZwL.png" alt="enter image description here"></p> <p><img src="https://i.stack.imgur.com/BJE7W.png" alt="enter image description here"></p>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload