<p>Ok, so I thought I would chime in with an update on what I have done and the results obtained so far. Changes made:</p>
<ul>
<li>Switched from Array to UnboxedArray (made Word128 an instance type)</li>
<li>Used UnboxedArray + fold in e8 instead of lists and (prelude) fold</li>
<li>Used unsafeIndex instead of !</li>
<li>Changed the type of Block1024 to a real datatype (similar to Block512), and unpacked its arguments</li>
<li>Updated GHC to version 7.2.1 on Arch Linux, thus fixing the problem with compiling via C or LLVM</li>
<li>Switched mod to rem in some places, but NOT in <strong>roundFunction</strong>. When I do it there, compilation suddenly takes an awfully long time, and the run time becomes 10 times slower! Does anyone know why that may be? It only happens with GHC-7.2.1, not GHC-7.0.3</li>
</ul>
<p>I compile with the following options:</p>
<blockquote>
<p>ghc-7.2.1 --make -O2 -funbox-strict-fields main.hs ./Tests/testframe.hs -fvia-C -optc-O2</p>
</blockquote>
<p>And the results? A roughly 50% reduction in run time. On an input of ~107 MB, the code now takes 3 minutes, compared to the previous 6-7 minutes. The C version takes 42 seconds.</p>
<p>Things I tried, but which didn't result in better performance:</p>
<ul>
<li><p>Unrolled the e8 function like this:</p>
<blockquote>
<pre><code>e8 !h = go h 0
  where
    go !x !n
      | n == 42   = x
      | otherwise = go h' (n + 1)
      where !h' = roundFunction x n
</code></pre>
</blockquote></li>
<li><p>Tried breaking up the swapN functions to use the underlying Word64s directly:</p>
<blockquote>
<pre><code>swap1 (W xh xl) =
      shiftL (W (xh .&amp;. 0x5555555555555555) (xl .&amp;. 0x5555555555555555)) 1
  .|. shiftR (W (xh .&amp;. 0xaaaaaaaaaaaaaaaa) (xl .&amp;. 0xaaaaaaaaaaaaaaaa)) 1
</code></pre>
</blockquote></li>
<li><p>Tried using the LLVM backend</p></li>
</ul>
<p>All of these attempts gave worse performance than what I have currently. I don't know if that's because I'm doing it wrong (especially the unrolling of e8), or because they are simply worse options.</p>
<p>Still, I have some new questions after these tweaks.</p>
<ol>
<li><p>Suddenly I am seeing a peculiar bump in memory usage. Take a look at the following heap profiles: <img src="https://i.stack.imgur.com/mvQ7b.png" alt="enter image description here"> <img src="https://i.stack.imgur.com/zvyN5.png" alt="enter image description here"></p>
<p>Why has this happened? Is it because of the UnboxedArray? And what does SYSTEM mean?</p></li>
<li><p>When I compile via C I get the following warning:</p>
<blockquote>
<p>Warning: The -fvia-C flag does nothing; it will be removed in a future GHC release</p>
</blockquote>
<p>Is this true? Why, then, do I see better performance when using it than when not?</p></li>
</ol>
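<p>For reference, the swap1 attempt above can also be written so that each Word64 half is handled on its own, without going through the 128-bit shifts. This is only a sketch: the type W below is a hypothetical stand-in for my Word128 (two strict, unpacked Word64 fields), and swapBits1 is a helper name made up for the example. It works per half because the masked values never carry a bit across the half boundary when shifted by one.</p>

```haskell
module Main where

import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Word (Word64)

-- Hypothetical stand-in for Word128: high and low halves,
-- both strict and unpacked (an assumption, not the real definition).
data W = W {-# UNPACK #-} !Word64 {-# UNPACK #-} !Word64
  deriving (Eq, Show)

-- Swap every pair of adjacent bits inside one 64-bit half.
-- The 0x55.. mask has its top bit clear and the 0xaa.. mask has its
-- lowest bit clear, so neither shift drops or carries a set bit.
swapBits1 :: Word64 -> Word64
swapBits1 x =
  ((x .&. 0x5555555555555555) `shiftL` 1) .|.
  ((x .&. 0xaaaaaaaaaaaaaaaa) `shiftR` 1)

-- Apply the bit swap to each half independently.
swap1 :: W -> W
swap1 (W xh xl) = W (swapBits1 xh) (swapBits1 xl)

main :: IO ()
main = do
  -- Bit 0 moves to bit 1 and vice versa.
  print (swap1 (W 1 2))
  -- Swapping twice gives back the original value.
  print (swap1 (swap1 (W 0xdeadbeef 42)) == W 0xdeadbeef 42)
```

<p>Whether this is any faster than the shiftL/shiftR version on the whole W value is something I have not measured; it only shows the shape of the per-half variant.</p>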
 
