IO over big files in Haskell: performance issue
<p>I'm trying to process big files in Haskell. I'd like to read an input file byte by byte and generate the output byte by byte. Of course, the IO needs to be buffered in blocks of reasonable size (a few KB). I can't get this to work and would appreciate some help.</p>
<pre><code>import System
import qualified Data.ByteString.Lazy as BL
import Data.Word
import Data.List

main :: IO ()
main =
    do
        args &lt;- System.getArgs
        let filename = head args
        byteString &lt;- BL.readFile filename
        let wordsList = BL.unpack byteString
        let foldFun acc word = doSomeStuff word : acc
        let wordsListCopy = foldl' foldFun [] wordsList
        let byteStringCopy = BL.pack (reverse wordsListCopy)
        BL.writeFile (filename ++ ".cpy") byteStringCopy
    where
        doSomeStuff = id
</code></pre>
<p>I name this file <code>TestCopy.hs</code>, then do the following:</p>
<pre class="lang-none prettyprint-override"><code>$ ls -l *MB
-rwxrwxrwx 1 root root 10000000 2011-03-24 13:11 10MB
-rwxrwxrwx 1 root root  5000000 2011-03-24 13:31 5MB
$ ghc --make -O TestCopy.hs
[1 of 1] Compiling Main ( TestCopy.hs, TestCopy.o )
Linking TestCopy ...
$ time ./TestCopy 5MB
real 0m5.631s
user 0m1.972s
sys 0m2.488s
$ diff 5MB 5MB.cpy
$ time ./TestCopy 10MB
real 3m6.671s
user 0m3.404s
sys 1m21.649s
$ diff 10MB 10MB.cpy
$ time ./TestCopy 10MB +RTS -K500M -RTS
real 2m50.261s
user 0m3.808s
sys 1m13.849s
$ diff 10MB 10MB.cpy
$
</code></pre>
<p>My problem: the 10 MB file takes far longer than twice the time of the 5 MB file (about 3 minutes versus about 5.6 seconds). I'd like the running time to be linear in the size of the input file. What am I doing wrong, and how can I achieve this? I don't mind using lazy bytestrings or anything else as long as it works, but it has to be a standard GHC library.</p>
<p>Clarification: this is for a university project, and I'm not trying to copy files. The <code>doSomeStuff</code> function will perform compression/decompression actions that I have to customize.</p>
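One possible direction, sketched here under assumptions rather than as a definitive fix: the program above unpacks the whole file into a `[Word8]`, folds it strictly with `foldl'`, and then reverses it, so the complete list probably has to sit in memory at once. That may explain the high `sys` times. The sketch below instead applies the transformation directly to the lazy `ByteString` with `BL.map`, which works chunk by chunk. It assumes `doSomeStuff` maps one input byte to exactly one output byte; the helper name `transform` is hypothetical and not part of the original program.

```haskell
module Main (main) where

import qualified Data.ByteString.Lazy as BL
import Data.Word (Word8)
import System.Environment (getArgs)

-- Placeholder for the per-byte transformation (identity, as in the question).
-- Assumption: one input byte yields exactly one output byte.
doSomeStuff :: Word8 -> Word8
doSomeStuff = id

-- Hypothetical helper: transform a lazy ByteString chunk by chunk,
-- without materialising an intermediate [Word8].
transform :: BL.ByteString -> BL.ByteString
transform = BL.map doSomeStuff

main :: IO ()
main = do
    args <- getArgs
    case args of
        [filename] -> do
            -- readFile is lazy: chunks are read on demand as writeFile consumes them.
            input <- BL.readFile filename
            BL.writeFile (filename ++ ".cpy") (transform input)
        _ -> putStrLn "usage: TestCopy FILE"
```

Because both `BL.readFile` and `BL.map` are lazy, only a few chunks should need to be live at any time. A real compressor that carries state or changes the output length would not fit `BL.map`; in that case a lazily consumed pipeline such as `BL.pack . go . BL.unpack`, with a hypothetical recursive `go` that produces its output incrementally, may be a better starting point.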