Note that there are some explanatory texts on larger screens.

plurals
  1. POPerformance of Floyd-Warshall in Haskell – Fixing a space leak
    text
    copied!<p>I wanted to write an efficient implementation of the Floyd-Warshall all pairs shortest path algorithm in Haskell using <code>Vector</code>s to hopefully get good performance.</p> <p>The implementation is quite straight-forward, but instead of using a 3-dimensional |V|×|V|×|V| matrix, a 2-dimensional vector is used, since we only ever read the previous <code>k</code> value.</p> <p>Thus, the algorithm is really just a series of steps where a 2D vector is passed in, and a new 2D vector is generated. The final 2D vector contains the shortest paths between all nodes (i,j).</p> <p>My intuition told me that it would be important to make sure that the previous 2D vector was evaluated before each step, so I used <code>BangPatterns</code> on the <code>prev</code> argument to the <code>fw</code> function and the strict <code>foldl'</code>:</p> <pre><code>{-# Language BangPatterns #-} import Control.DeepSeq import Control.Monad (forM_) import Data.List (foldl') import qualified Data.Map.Strict as M import Data.Vector (Vector, (!), (//)) import qualified Data.Vector as V import qualified Data.Vector.Mutable as V hiding (length, replicate, take) type Graph = Vector (M.Map Int Double) type TwoDVector = Vector (Vector Double) infinity :: Double infinity = 1/0 -- calculate shortest path between all pairs in the given graph, if there are -- negative cycles, return Nothing allPairsShortestPaths :: Graph -&gt; Int -&gt; Maybe TwoDVector allPairsShortestPaths g v = let initial = fw g v V.empty 0 results = foldl' (fw g v) initial [1..v] in if negCycle results then Nothing else Just results where -- check for negative elements along the diagonal negCycle a = any not $ map (\i -&gt; a ! i ! i &gt;= 0) [0..(V.length a-1)] -- one step of the Floyd-Warshall algorithm fw :: Graph -&gt; Int -&gt; TwoDVector -&gt; Int -&gt; TwoDVector fw g v !prev k = V.create $ do -- ← bang curr &lt;- V.new v forM_ [0..(v-1)] $ \i -&gt; V.write curr i $ V.create $ do ivec &lt;- V.new v forM_ [0..(v-1)] $ \j -&gt; do let d = distance g prev i j k V.write ivec j d return ivec return curr distance :: Graph -&gt; TwoDVector -&gt; Int -&gt; Int -&gt; Int -&gt; Double distance g _ i j 0 -- base case; 0 if same vertex, edge weight if neighbours | i == j = 0.0 | otherwise = M.findWithDefault infinity j (g ! i) distance _ a i j k = let c1 = a ! i ! j c2 = (a ! i ! (k-1))+(a ! (k-1) ! j) in min c1 c2 </code></pre> <p>However, when running this program with a 1000-node graph with 47978 edges, things does not look good at all. The memory usage is very high and the program takes way too long to run. The program was compiled with <code>ghc -O2</code>.</p> <p>I rebuilt the program for profiling, and limited the number of iterations to 50:</p> <pre><code> results = foldl' (fw g v) initial [1..50] </code></pre> <p>I then ran the program with <code>+RTS -p -hc</code> and <code>+RTS -p -hd</code>:</p> <p><img src="https://i.stack.imgur.com/JftTx.png" alt=""></p> <p><img src="https://i.stack.imgur.com/OLrzo.png" alt=""></p> <p>This is... interesting, but I guess it's showing that it's accumulating tonnes of thunks. Not good.</p> <p>Ok, so after a few shots in the dark, I added a <code>deepseq</code> in <code>fw</code> to make sure <code>prev</code> <em>really</em> is evaluted:</p> <pre><code>let d = prev `deepseq` distance g prev i j k </code></pre> <p>Now things look better, and I can actually run the program to completion with constant memory usage. It's obvious that the bang on the <code>prev</code> argument was not enough.</p> <p>For comparison with the previous graphs, here is the memory usage for 50 iterations after adding the <code>deepseq</code>:</p> <p><img src="https://i.stack.imgur.com/9xr5T.png" alt=""></p> <p><img src="https://i.stack.imgur.com/0SMVD.png" alt=""></p> <p>Ok, so things are better, but I still have some questions:</p> <ol> <li>Is this the correct solution for this space leak? I am wrong in feeling that inserting a <code>deepseq</code> is a bit ugly?</li> <li>Is my usage of <code>Vector</code>s here idiomatic/correct? I'm building a completely new vector for every iteration and hoping that the garbage collector will delete the old <code>Vector</code>s.</li> <li>Is there any other things I could do to make this run faster with this approach?</li> </ol> <p>For references, here is <code>graph.txt</code>: <a href="http://sebsauvage.net/paste/?45147f7caf8c5f29#7tiCiPovPHWRm1XNvrSb/zNl3ujF3xB3yehrxhEdVWw=" rel="nofollow noreferrer">http://sebsauvage.net/paste/?45147f7caf8c5f29#7tiCiPovPHWRm1XNvrSb/zNl3ujF3xB3yehrxhEdVWw=</a></p> <p>Here is <code>main</code>:</p> <pre><code>main = do ls &lt;- fmap lines $ readFile "graph.txt" let numVerts = head . map read . words . head $ ls let edges = map (map read . words) (tail ls) let g = V.create $ do g' &lt;- V.new numVerts forM_ [0..(numVerts-1)] (\idx -&gt; V.write g' idx M.empty) forM_ edges $ \[f,t,w] -&gt; do -- subtract one from vertex IDs so we can index directly curr &lt;- V.read g' (f-1) V.write g' (f-1) $ M.insert (t-1) (fromIntegral w) curr return g' let a = allPairsShortestPaths g numVerts case a of Nothing -&gt; putStrLn "Negative cycle detected." Just a' -&gt; do putStrLn $ "The shortest, shortest path has length " ++ show ((V.minimum . V.map V.minimum) a') </code></pre>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload