Note that there are some explanatory texts on larger screens.

plurals
  1. POLazy tree with a space leak
    primarykey
    data
    text
    <p>I'm writing a program trying to implement a toy XML processor. Right now the program is supposed to read a stream of events (think SAX) describing the structure of a document and to build lazily the corresponding tree.</p> <p>The events are defined by the following datatype:</p> <pre><code>data Event = Open String | Close </code></pre> <p>A possible input would then be:</p> <pre><code>[Open "a", Open "b", Close, Open "c", Close, Close] </code></pre> <p>that would correspond to the tree:</p> <pre><code> a / \ b c </code></pre> <p>I would like to generate the tree in a lazy way, so that it does not need to be present in memory in full form at any time. My current implementation, however, seems to have a space leak causing all the nodes to be retained even when they are no longer needed. Here is the code:</p> <pre><code>data Event = Open String | Close data Tree a = Tree a (Trees a) type Trees a = [Tree a] data Node = Node String trees [] = [] trees (Open x : es) = let (children, rest) = splitStream es in (Tree (Node x) (trees children)) : (trees rest) splitStream es = scan 1 es scan depth (s@(Open {}) : ss) = let (b, a) = scan (depth+1) ss in (s:b, a) scan depth (s@Close : ss) = case depth of 1 -&gt; ([], ss) x -&gt; let (b, a) = scan (depth-1) ss in (s:b, a) getChildren = concatMap loop where loop (Tree _ cs) = cs main = print . length . getChildren . trees $ [Open "a"] ++ (concat . replicate 1000000 $ [Open "b", Close]) ++ [Close] </code></pre> <p>The function <code>trees</code> converts the list of events into a list of <code>Tree Node</code>. <code>getChildren</code> collects all the children nodes (labeled "b") of the root ("a"). These are then counted and the resulting number is printed.</p> <p>The compiled program, built with GHC 7.0.4 (-O2), keeps increasing its memory usage up to the point when it prints the node count. I was expecting, on the other hand, an almost constant memory usage.</p> <p>Looking at the "-hd" heap profile, it is clear that most of the memory is taken by the list constructor (<code>:</code>). It seems like one of the lists produced by <code>scan</code> or by <code>trees</code> is retained in full. I don't understand why, however, as <code>length . getChildren</code> should get rid of child nodes as soon as they are traversed.</p> <p>Is there a way to fix such space leak?</p>
    singulars
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload