Note that there are some explanatory texts on larger screens.

plurals
  1. POparallel computation for an Iterator of elements in Java
    primarykey
    data
    text
    <p>I've had the same need a few times now and wanted to get other thoughts on the right way to structure a solution. The need is to perform some operation on many elements on many threads without needing to have all elements in memory at once, just the ones under computation. As in, <a href="http://google-collections.googlecode.com/svn/trunk/javadoc/com/google/common/collect/Iterables.html#partition(java.lang.Iterable,%20int)" rel="nofollow noreferrer">Iterables.partition</a> is insufficient because it brings all elements into memory up front.</p> <p>Expressing it in code, I want to write a BulkCalc2 that does the same thing as BulkCalc1, just in parallel. Below is sample code that illustrates my best attempt. I'm not satisfied because it's big and ugly, but it does seem to accomplish my goals of keeping threads highly utilized until the work is done, <a href="http://guava-libraries.googlecode.com/svn/trunk/javadoc/com/google/common/base/Throwables.html#propagate(java.lang.Throwable)" rel="nofollow noreferrer">propagating</a> any exceptions during computation, and not having more than <em>numThreads</em> instances of BigThing necessarily in memory at once.</p> <p>I'll accept the answer which meets the stated goals in the most concise way, whether it's a way to improve my BulkCalc2 or a completely different solution.</p> <pre><code>interface BigThing { int getId(); String getString(); } class Calc { // somewhat expensive computation double calc(BigThing bigThing) { Random r = new Random(bigThing.getString().hashCode()); double d = 0; for (int i = 0; i &lt; 100000; i++) { d += r.nextDouble(); } return d; } } class BulkCalc1 { final Calc calc; public BulkCalc1(Calc calc) { this.calc = calc; } public TreeMap&lt;Integer, Double&gt; calc(Iterator&lt;BigThing&gt; in) { TreeMap&lt;Integer, Double&gt; results = Maps.newTreeMap(); while (in.hasNext()) { BigThing o = in.next(); results.put(o.getId(), calc.calc(o)); } return results; } } class SafeIterator&lt;T&gt; { final Iterator&lt;T&gt; in; SafeIterator(Iterator&lt;T&gt; in) { this.in = in; } synchronized T nextOrNull() { if (in.hasNext()) { return in.next(); } return null; } } class BulkCalc2 { final Calc calc; final int numThreads; public BulkCalc2(Calc calc, int numThreads) { this.calc = calc; this.numThreads = numThreads; } public TreeMap&lt;Integer, Double&gt; calc(Iterator&lt;BigThing&gt; in) { ExecutorService e = Executors.newFixedThreadPool(numThreads); List&lt;Future&lt;?&gt;&gt; futures = Lists.newLinkedList(); final Map&lt;Integer, Double&gt; results = new MapMaker().concurrencyLevel(numThreads).makeMap(); final SafeIterator&lt;BigThing&gt; it = new SafeIterator&lt;BigThing&gt;(in); for (int i = 0; i &lt; numThreads; i++) { futures.add(e.submit(new Runnable() { @Override public void run() { while (true) { BigThing o = it.nextOrNull(); if (o == null) { return; } results.put(o.getId(), calc.calc(o)); } } })); } e.shutdown(); for (Future&lt;?&gt; future : futures) { try { future.get(); } catch (InterruptedException ex) { // swallowing is OK } catch (ExecutionException ex) { throw Throwables.propagate(ex.getCause()); } } return new TreeMap&lt;Integer, Double&gt;(results); } } </code></pre>
    singulars
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload