<p>In practice, getting the parallel combinators to scale well can be difficult. Others have mentioned making your code more strict to ensure you are actually doing the work in parallel, which is definitely important.</p>
<p>Two things that can really kill performance are heavy memory traversal and garbage collection. Even if you are not producing a lot of garbage, frequent memory traversals put more pressure on the CPU cache, and eventually your memory bus becomes the bottleneck. Your <code>isStopWord</code> function performs a lot of string comparisons and has to traverse a rather long linked list to do so. You can save a lot of work by using the <code>Set</code> type from the <code>containers</code> package or, even better, the <code>HashSet</code> type from the <code>unordered-containers</code> package (since repeated string comparisons can be expensive, especially if the strings share common prefixes).</p>
<pre><code>import Data.HashSet (HashSet)
import qualified Data.HashSet as S

...

finnishStop :: [Text]
finnishStop =
  ["minä", "sinä", "hän", "kuitenkin", "jälkeen", "mukaanlukien", "koska",
   "mutta", "jos", "kuitenkin", "kun", "kunnes", "sanoo", "sanoi", "sanoa",
   "miksi", "vielä", "sinun"]

englishStop :: [Text]
englishStop =
  ["a","able","about","across","after","all","almost","also","am","among",
   "an","and","any","are","as","at","be","because","been","but","by","can",
   "cannot","could","dear","did","do","does","either","else","ever","every",
   "for","from","get","got","had","has","have","he","her","hers","him","his",
   "how","however","i","if","in","into","is","it","its","just","least","let",
   "like","likely","may","me","might","most","must","my","neither","no","nor",
   "not","of","off","often","on","only","or","other","our","own","rather",
   "said","say","says","she","should","since","so","some","than","that","the",
   "their","them","then","there","these","they","this","tis","to","too",
   "twas","us","wants","was","we","were","what","when","where","which",
   "while","who","whom","why","will","with","would","yet","you","your"]

-- Build the set once; membership is then a hash lookup, not a list scan.
stopWord :: HashSet Text
stopWord = S.fromList (finnishStop ++ englishStop)

isStopWord :: Text -&gt; Bool
isStopWord x = x `S.member` stopWord
</code></pre>
<p>With your <code>isStopWord</code> function replaced by this version, the program performs much better and scales much better (though definitely not linearly with the number of cores). You could also consider using <code>HashMap</code> (from the same package) rather than <code>Map</code> for the same reasons, but I did not get a noticeable change from doing so.</p>
<p>Another option is to increase the default heap size to take some of the pressure off the GC and to give it more room to move things around. Giving the compiled code a default heap size of 1GB (<code>-H1G</code> flag), I get a GC balance of about 50% on 4 cores, whereas I only get ~25% without it (it also runs ~30% faster).</p>
<p>With these two alterations, the average runtime on four cores (on my machine) drops from ~10.5s to ~3.5s. Arguably, there is room for improvement based on the GC statistics (the program still spends only 58% of its time doing productive work), but doing significantly better might require a much more drastic change to your algorithm.</p>
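<p>To make the <code>HashSet</code>/<code>HashMap</code> suggestion concrete, here is a minimal, sequential sketch of a stop-word-filtered word count. It is not your original code: <code>countWords</code>, the four-word stop list, and the sample sentence are hypothetical stand-ins chosen for illustration, and it assumes the <code>text</code> and <code>unordered-containers</code> packages are available.</p>

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.List (foldl')
import Data.HashMap.Strict (HashMap)
import qualified Data.HashMap.Strict as M
import Data.HashSet (HashSet)
import qualified Data.HashSet as S
import Data.Text (Text)
import qualified Data.Text as T

-- Tiny stand-in stop list (assumption); in practice this would be the
-- combined Finnish and English lists from the answer above.
stopWord :: HashSet Text
stopWord = S.fromList ["the", "a", "and", "of"]

-- Membership test via hashing instead of scanning a linked list.
isStopWord :: Text -> Bool
isStopWord x = x `S.member` stopWord

-- Hypothetical helper: count the non-stop words in a piece of text.
-- foldl' forces the accumulator at each step, and the Strict HashMap API
-- forces each count, so no chain of (+) thunks builds up.
countWords :: Text -> HashMap Text Int
countWords =
  foldl' step M.empty . filter (not . isStopWord) . T.words . T.toLower
  where
    step acc w = M.insertWith (+) w 1 acc

main :: IO ()
main = print (M.toList (countWords "The cat and the hat of a cat"))
```

<p>The order of the pairs printed by <code>M.toList</code> depends on the hashes, so it should not be relied upon. Keeping the counts strict also matters for the parallel version: a lazily built map can leave the real work to whichever thread finally forces it.</p>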
 
