<p>I'm not fully up to speed with the 2.9 parallel collections, and I'm not sure <code>concurrent.ops</code> is all that well maintained, but it seems to me that your task is perfectly well suited to futures in 2.8:</p>
<pre><code>// Setup--you want to use longs, so you can't use range
val x = 4000000000L  // Note that this doesn't fit in a signed integer
def f(l: Long) = l + 8e9/(3+l)
def longRange(a: Long, b: Long) = new Iterator[Long] {
  private[this] var i = a
  def hasNext = i&lt;b
  def next = { val j = i; i += 1; j }
}

val cpus = 4
val ranges = (1 to cpus).map(i =&gt; longRange(((i-1)*x)/cpus, (i*x)/cpus))
val maxes = ranges.map(r =&gt; scala.actors.Futures.future(r.map(f).max))
println("Total max is " + maxes.map(_()).max)
</code></pre>
<p>Here you split the work up by hand and ask for a computation of a max over each portion of the range, which is delivered on demand by the iterator. These are computed in the future, that is, <code>Futures.future</code> returns a promise that it will deliver the return value eventually. You collect on the promise by calling <code>myFuture.apply()</code>, which blocks until the value is available; in this case that is the <code>_()</code> inside the <code>println</code>. To get the total max, you have to take the max of maxes, and this of course can't return until all the work put off to the future is actually completed.</p>
<p>You can try comparing the runtime of the four-threaded and single-threaded versions if you want to verify that it's working.</p>
<p>(Note that the answer for the function I've provided should be 4.000000001e9.)</p>
<p>Note also that if you really want things to run quickly, you should probably write your own range tests:</p>
<pre><code>def maxAppliedRange(a: Long, b: Long, f: Long=&gt;Double) = {
  var m = f(a)
  var i = a
  while (i &lt; b) {
    val x = f(i)
    if (m &lt; x) m = x
    i += 1
  }
  m
}
val maxes = (1 to cpus).map(i =&gt;
  scala.actors.Futures.future( maxAppliedRange((i-1)*x/cpus,i*x/cpus,f) )
)
println("Total max is " + maxes.map(_()).max)
</code></pre>
<p>This gives way better performance because there is no boxing/unboxing, so the garbage collector isn't stressed, and running in parallel therefore gives much better results. This runs ~40x faster for me than the method above, and note that this will also be true with parallel collections. So be careful! Just using more cores isn't necessarily the way to speed up your computations, especially when engaging in a garbage-heavy task.</p>
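<p>For readers on a newer Scala: <code>scala.actors</code> has been deprecated since Scala 2.10, so the hand-split approach above can be sketched with <code>scala.concurrent.Future</code> instead. This is only a minimal sketch under assumptions, not the original author's code: the object name <code>ParallelMax</code>, the helpers <code>parallelMax</code> and <code>timed</code>, and the reduced range size in <code>main</code> are all made up for illustration. It reuses the primitive <code>while</code> loop and adds a crude wall-clock timer for comparing the one-chunk and four-chunk runs.</p>

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration

// Hypothetical wrapper object; the names here are illustrative, not from the answer.
object ParallelMax {
  // Same function as in the answer above.
  def f(l: Long): Double = l + 8e9 / (3 + l)

  // Primitive while loop over [a, b), so no boxing occurs. Assumes a < b.
  def maxAppliedRange(a: Long, b: Long, f: Long => Double): Double = {
    var m = f(a)
    var i = a + 1
    while (i < b) {
      val x = f(i)
      if (m < x) m = x
      i += 1
    }
    m
  }

  // Split [0, n) into `chunks` contiguous ranges and evaluate each in its own Future.
  // Assumes n >= chunks so that no range is empty.
  def parallelMax(n: Long, chunks: Int): Double = {
    val futures = (1 to chunks).map { i =>
      Future(maxAppliedRange((i - 1) * n / chunks, i * n / chunks, f))
    }
    // Blocks until every chunk has finished, then takes the max of the maxes.
    Await.result(Future.sequence(futures), Duration.Inf).reduce(_ max _)
  }

  // Hypothetical helper: result plus elapsed wall-clock seconds for one evaluation.
  def timed[A](body: => A): (A, Double) = {
    val t0 = System.nanoTime()
    val result = body
    (result, (System.nanoTime() - t0) / 1e9)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a smaller range than the answer's 4000000000L, so a trial run is quick.
    val n = 400000000L
    for (chunks <- Seq(1, 4)) {
      val (m, secs) = timed(parallelMax(n, chunks))
      println(s"chunks=$chunks max=$m seconds=$secs")
    }
  }
}
```

<p>A single timed run like this is rough, since JIT warm-up and other load on the machine affect it, so repeat the measurement a few times before drawing conclusions.</p>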
    1. Thank you for your thoughtful reply. I tried both examples and found that they both worked, but the second one was a lot faster. I have a few questions though. When I set the cpus variable to 1 my program takes 165 seconds but when I set it to 2 it takes 98 seconds. Why doesn't it take 82 seconds? I expected more of an improvement. Also, my next step is to take my code and supercharge it with GridGain and cloud computing to make it even more powerful. Do you have any advice for me for doing that?
    2. @Jim - There is overhead associated with setting up the problem and you can't get the answer until the slower of the two threads finishes (and one will be slower if for no other reason than that CPU may be used to do more operating system tasks). I think it would be a very good idea to learn more about parallel computing before wandering onto the cloud. There are numerous issues to think about. What if a machine goes down mid-computation? How do you balance work on slow and fast machines? How do you collect and distribute the work? How do you balance communication with computation?
    3. I did some googling and I see there is a lot to learn about parallel computing. I have a lot of reading to do first. Separately, do you have any suggestions for where I can learn more about using Scala concurrently? I have been to the Scala website and went through their documentation but I want to learn more.
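
The timings quoted in the comments (165 seconds on one CPU, 98 seconds on two) can be put in numbers: a speedup of about 1.68x, or roughly 84% parallel efficiency. The sketch below computes both. It also estimates the serial fraction implied by Amdahl's law, which is an assumed model that the thread does not mention; under that assumption the figure comes out near 0.19. The object and function names are hypothetical.

```scala
// Hypothetical helper object for interpreting the timings from the comments.
object SpeedupEstimate {
  // Observed speedup: single-worker time divided by multi-worker time.
  def speedup(t1: Double, tn: Double): Double = t1 / tn

  // Parallel efficiency: speedup per worker (1.0 would be perfect scaling).
  def efficiency(t1: Double, tn: Double, workers: Int): Double =
    speedup(t1, tn) / workers

  // Serial fraction s implied by Amdahl's law, tn = t1 * (s + (1 - s) / workers).
  // Assumption: Amdahl's model applies at all; requires workers > 1.
  def impliedSerialFraction(t1: Double, tn: Double, workers: Int): Double = {
    val ratio = tn / t1
    (ratio - 1.0 / workers) / (1.0 - 1.0 / workers)
  }

  def main(args: Array[String]): Unit = {
    val (t1, t2) = (165.0, 98.0) // seconds, as reported in the first comment
    println(f"speedup         = ${speedup(t1, t2)}%.2f")
    println(f"efficiency      = ${efficiency(t1, t2, 2)}%.2f")
    println(f"serial fraction = ${impliedSerialFraction(t1, t2, 2)}%.2f")
  }
}
```

The serial-fraction figure lumps together setup overhead, uneven thread finish times, and everything else that does not scale, so treat it as a rough summary rather than a measurement of any one cause.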