Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    primarykey
    data
    text
    <p>OK, it's time for test results of proposed approaches. Here what approaches I have tested (name of each approach is also class name in my sources):</p> <ol> <li><code>NaiveRemoveManyPerformer</code> - <code>ArrayList</code> with iterator and remove - first and naive implementation given in my question.</li> <li><code>BetterNaiveRemoveManyPerformer</code> - <code>ArrayList</code> with backward iteration and removal from end to front.</li> <li><code>LinkedRemoveManyPerformer</code> - naive iterator and remove but working on <code>LinkedList</code>. Disadventage: works only for <code>LinkedList</code>.</li> <li><code>CreateNewRemoveManyPerformer</code> - <code>ArrayList</code> is made as a copy (only retained elements are added), iterator is used to traverse input <code>ArrayList</code>.</li> <li><code>SmartCreateNewRemoveManyPerformer</code> - better <code>CreateNewRemoveManyPerformer</code> - initial size (capacity) of result <code>ArrayList</code> is set to final list size. Disadvantage: you must know final size of list when starting.</li> <li><code>FasterSmartCreateNewRemoveManyPerformer</code> - even better (?) <code>SmartCreateNewRemoveManyPerformer</code> - use item index (<code>items.get(idx)</code>) instead of iterator.</li> <li><code>MagicRemoveManyPerformer</code> - works in place (no list copy) for <code>ArrayList</code> and compacts holes (removed items) from beginning with items from end of the list. Disadventage: this approach changes order of items in list.</li> <li><code>ForwardInPlaceRemoveManyPerformer</code> - works in place for <code>ArrayList</code> - move retaining items to fill holes, finally subList is returned (no final removal or clear).</li> <li><code>GuavaArrayListRemoveManyPerformer</code> - Google Guava <code>Iterables.removeIf</code> used for <code>ArrayList</code> - almost the same as <code>ForwardInPlaceRemoveManyPerformer</code> but does final removal of items at the end of list.</li> </ol> <p>Full source code is given at the end of this answer.</p> <p>Tests where performed with different list sizes (from 10,000 items to 10,000,000 items) and different remove factors (specifying how many items must be removed from list).</p> <p>As I posted here in comments for other answers - I have thought that copying items from <code>ArrayList</code> to second <code>ArrayList</code> will be faster than iterating <code>LinkedList</code> and just removing items. Sun's Java Documentation says that constant factor of <code>ArrayList</code> is low compared to that for the <code>LinkedList</code> implementation, but surprisingly this is not the case in my problem.</p> <p>In practice <code>LinkedList</code> with simple iteration and removal has best performance in most cases (this approach is implemented in <code>LinkedRemoveManyPerformer</code>). Usually only <code>MagicRemoveManyPerformer</code> performance is comparable to <code>LinkedRemoveManyPerformer</code>, other approaches are significantly slower. Google Guava <code>GuavaArrayListRemoveManyPerformer</code> is slower than hand made similar code (because my code does not remove unnecessary items at end of list).</p> <p>Example results for removing 500,000 items from 1,000,000 source items:</p> <ol> <li><code>NaiveRemoveManyPerformer</code>: test not performed - I'm not that patient, but it performs worse than <code>BetterNaiveRemoveManyPerformer</code>.</li> <li><code>BetterNaiveRemoveManyPerformer</code>: 226080 milli(s)</li> <li><code>LinkedRemoveManyPerformer</code>: 69 milli(s)</li> <li><code>CreateNewRemoveManyPerformer</code>: 246 milli(s)</li> <li><code>SmartCreateNewRemoveManyPerformer</code>: 112 milli(s)</li> <li><code>FasterSmartCreateNewRemoveManyPerformer</code>: 202 milli(s)</li> <li><code>MagicRemoveManyPerformer</code>: 74 milli(s)</li> <li><code>ForwardInPlaceRemoveManyPerformer</code>: 69 milli(s)</li> <li><code>GuavaArrayListRemoveManyPerformer</code>: 118 milli(s)</li> </ol> <p>Example results for removing 1 item from 1,000,000 source items (first item is removed):</p> <ol> <li>BetterNaiveRemoveManyPerformer: 34 milli(s)</li> <li>LinkedRemoveManyPerformer: 41 milli(s)</li> <li>CreateNewRemoveManyPerformer: 253 milli(s)</li> <li>SmartCreateNewRemoveManyPerformer: 108 milli(s)</li> <li>FasterSmartCreateNewRemoveManyPerformer: 71 milli(s)</li> <li>MagicRemoveManyPerformer: 43 milli(s)</li> <li>ForwardInPlaceRemoveManyPerformer: 73 milli(s)</li> <li>GuavaArrayListRemoveManyPerformer: 78 milli(s)</li> </ol> <p>Example results for removing 333,334 items from 1,000,000 source items:</p> <ol> <li>BetterNaiveRemoveManyPerformer: 253206 milli(s)</li> <li>LinkedRemoveManyPerformer: 69 milli(s)</li> <li>CreateNewRemoveManyPerformer: 245 milli(s)</li> <li>SmartCreateNewRemoveManyPerformer: 111 milli(s)</li> <li>FasterSmartCreateNewRemoveManyPerformer: 203 milli(s)</li> <li>MagicRemoveManyPerformer: 69 milli(s)</li> <li>ForwardInPlaceRemoveManyPerformer: 72 milli(s)</li> <li>GuavaArrayListRemoveManyPerformer: 102 milli(s)</li> </ol> <p>Example results for removing 1,000,000 (all) items from 1,000,000 source items (all items are removed but with one-by-one processing, if you know a priori that all items are to be removed, list should be simply cleared):</p> <ol> <li>BetterNaiveRemoveManyPerformer: 58 milli(s)</li> <li>LinkedRemoveManyPerformer: 88 milli(s)</li> <li>CreateNewRemoveManyPerformer: 95 milli(s)</li> <li>SmartCreateNewRemoveManyPerformer: 91 milli(s)</li> <li>FasterSmartCreateNewRemoveManyPerformer: 48 milli(s)</li> <li>MagicRemoveManyPerformer: 61 milli(s)</li> <li>ForwardInPlaceRemoveManyPerformer: 49 milli(s)</li> <li>GuavaArrayListRemoveManyPerformer: 133 milli(s)</li> </ol> <p>My final conclusions: use hybrid approach - if dealing with LinkedList - simple iteration and removal is best, if dealing with ArrayList - it depends if item order is important - use ForwardInPlaceRemoveManyPerformer then, if item order may be changed - best choice is MagicRemoveManyPerformer. If remove factor is known a priori (you know how many items will be removed vs retained) then some more conditionals may be put to select approach performing even better in particular situation. But known remove factor is not usual case... Google Guava <code>Iterables.removeIf</code> is such a hybrid solution but with slightly different assumption (original list must be changed, new one cannot be created and item order always matters) - these are most common assumptions so <code>removeIf</code> is best choice in most real-life cases.</p> <p>Notice also that all good approaches (naive is not good!) are good enough - any one of them shold do just fine in real application, but naive approach must be avoided.</p> <p>At last - my source code for testing.</p> <pre><code>package WildWezyrListRemovalTesting; import com.google.common.base.Predicate; import com.google.common.collect.Iterables; import java.util.ArrayList; import java.util.Iterator; import java.util.LinkedList; import java.util.List; public class RemoveManyFromList { public static abstract class BaseRemoveManyPerformer { protected String performerName() { return getClass().getSimpleName(); } protected void info(String msg) { System.out.println(performerName() + ": " + msg); } protected void populateList(List&lt;Integer&gt; items, int itemCnt) { for (int i = 0; i &lt; itemCnt; i++) { items.add(i); } } protected boolean mustRemoveItem(Integer itemVal, int itemIdx, int removeFactor) { if (removeFactor == 0) { return false; } return itemIdx % removeFactor == 0; } protected abstract List&lt;Integer&gt; removeItems(List&lt;Integer&gt; items, int removeFactor); protected abstract List&lt;Integer&gt; createInitialList(); public void testMe(int itemCnt, int removeFactor) { List&lt;Integer&gt; items = createInitialList(); populateList(items, itemCnt); long startMillis = System.currentTimeMillis(); items = removeItems(items, removeFactor); long endMillis = System.currentTimeMillis(); int chksum = 0; for (Integer item : items) { chksum += item; } info("removing took " + (endMillis - startMillis) + " milli(s), itemCnt=" + itemCnt + ", removed items: " + (itemCnt - items.size()) + ", remaining items: " + items.size() + ", checksum: " + chksum); } } private List&lt;BaseRemoveManyPerformer&gt; rmps = new ArrayList&lt;BaseRemoveManyPerformer&gt;(); public void addPerformer(BaseRemoveManyPerformer rmp) { rmps.add(rmp); } private Runtime runtime = Runtime.getRuntime(); private void runGc() { for (int i = 0; i &lt; 5; i++) { runtime.gc(); } } public void testAll(int itemCnt, int removeFactor) { runGc(); for (BaseRemoveManyPerformer rmp : rmps) { rmp.testMe(itemCnt, removeFactor); } runGc(); System.out.println("\n--------------------------\n"); } public static class NaiveRemoveManyPerformer extends BaseRemoveManyPerformer { @Override public List&lt;Integer&gt; removeItems(List&lt;Integer&gt; items, int removeFactor) { if (items.size() &gt; 300000 &amp;&amp; items instanceof ArrayList) { info("this removeItems is too slow, returning without processing"); return items; } int i = 0; Iterator&lt;Integer&gt; iter = items.iterator(); while (iter.hasNext()) { Integer item = iter.next(); if (mustRemoveItem(item, i, removeFactor)) { iter.remove(); } i++; } return items; } @Override public List&lt;Integer&gt; createInitialList() { return new ArrayList&lt;Integer&gt;(); } } public static class BetterNaiveRemoveManyPerformer extends NaiveRemoveManyPerformer { @Override public List&lt;Integer&gt; removeItems(List&lt;Integer&gt; items, int removeFactor) { // if (items.size() &gt; 300000 &amp;&amp; items instanceof ArrayList) { // info("this removeItems is too slow, returning without processing"); // return items; // } for (int i = items.size(); --i &gt;= 0;) { Integer item = items.get(i); if (mustRemoveItem(item, i, removeFactor)) { items.remove(i); } } return items; } } public static class LinkedRemoveManyPerformer extends NaiveRemoveManyPerformer { @Override public List&lt;Integer&gt; createInitialList() { return new LinkedList&lt;Integer&gt;(); } } public static class CreateNewRemoveManyPerformer extends NaiveRemoveManyPerformer { @Override public List&lt;Integer&gt; removeItems(List&lt;Integer&gt; items, int removeFactor) { List&lt;Integer&gt; res = createResultList(items, removeFactor); int i = 0; for (Integer item : items) { if (mustRemoveItem(item, i, removeFactor)) { // no-op } else { res.add(item); } i++; } return res; } protected List&lt;Integer&gt; createResultList(List&lt;Integer&gt; items, int removeFactor) { return new ArrayList&lt;Integer&gt;(); } } public static class SmartCreateNewRemoveManyPerformer extends CreateNewRemoveManyPerformer { @Override protected List&lt;Integer&gt; createResultList(List&lt;Integer&gt; items, int removeFactor) { int newCapacity = removeFactor == 0 ? items.size() : (int) (items.size() * (removeFactor - 1L) / removeFactor + 1); //System.out.println("newCapacity=" + newCapacity); return new ArrayList&lt;Integer&gt;(newCapacity); } } public static class FasterSmartCreateNewRemoveManyPerformer extends SmartCreateNewRemoveManyPerformer { @Override public List&lt;Integer&gt; removeItems(List&lt;Integer&gt; items, int removeFactor) { List&lt;Integer&gt; res = createResultList(items, removeFactor); for (int i = 0; i &lt; items.size(); i++) { Integer item = items.get(i); if (mustRemoveItem(item, i, removeFactor)) { // no-op } else { res.add(item); } } return res; } } public static class ForwardInPlaceRemoveManyPerformer extends NaiveRemoveManyPerformer { @Override public List&lt;Integer&gt; removeItems(List&lt;Integer&gt; items, int removeFactor) { int j = 0; // destination idx for (int i = 0; i &lt; items.size(); i++) { Integer item = items.get(i); if (mustRemoveItem(item, i, removeFactor)) { // no-op } else { if (j &lt; i) { items.set(j, item); } j++; } } return items.subList(0, j); } } public static class MagicRemoveManyPerformer extends NaiveRemoveManyPerformer { @Override public List&lt;Integer&gt; removeItems(List&lt;Integer&gt; items, int removeFactor) { for (int i = 0; i &lt; items.size(); i++) { if (mustRemoveItem(items.get(i), i, removeFactor)) { Integer retainedItem = removeSomeFromEnd(items, removeFactor, i); if (retainedItem == null) { items.remove(i); break; } items.set(i, retainedItem); } } return items; } private Integer removeSomeFromEnd(List&lt;Integer&gt; items, int removeFactor, int lowerBound) { for (int i = items.size(); --i &gt; lowerBound;) { Integer item = items.get(i); items.remove(i); if (!mustRemoveItem(item, i, removeFactor)) { return item; } } return null; } } public static class GuavaArrayListRemoveManyPerformer extends BaseRemoveManyPerformer { @Override protected List&lt;Integer&gt; removeItems(List&lt;Integer&gt; items, final int removeFactor) { Iterables.removeIf(items, new Predicate&lt;Integer&gt;() { public boolean apply(Integer input) { return mustRemoveItem(input, input, removeFactor); } }); return items; } @Override protected List&lt;Integer&gt; createInitialList() { return new ArrayList&lt;Integer&gt;(); } } public void testForOneItemCnt(int itemCnt) { testAll(itemCnt, 0); testAll(itemCnt, itemCnt); testAll(itemCnt, itemCnt - 1); testAll(itemCnt, 3); testAll(itemCnt, 2); testAll(itemCnt, 1); } public static void main(String[] args) { RemoveManyFromList t = new RemoveManyFromList(); t.addPerformer(new NaiveRemoveManyPerformer()); t.addPerformer(new BetterNaiveRemoveManyPerformer()); t.addPerformer(new LinkedRemoveManyPerformer()); t.addPerformer(new CreateNewRemoveManyPerformer()); t.addPerformer(new SmartCreateNewRemoveManyPerformer()); t.addPerformer(new FasterSmartCreateNewRemoveManyPerformer()); t.addPerformer(new MagicRemoveManyPerformer()); t.addPerformer(new ForwardInPlaceRemoveManyPerformer()); t.addPerformer(new GuavaArrayListRemoveManyPerformer()); t.testForOneItemCnt(1000); t.testForOneItemCnt(10000); t.testForOneItemCnt(100000); t.testForOneItemCnt(200000); t.testForOneItemCnt(300000); t.testForOneItemCnt(500000); t.testForOneItemCnt(1000000); t.testForOneItemCnt(10000000); } } </code></pre>
    singulars
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. VO
      singulars
      1. This table or related slice is empty.
    2. VO
      singulars
      1. This table or related slice is empty.
    3. VO
      singulars
      1. This table or related slice is empty.
    1. COIt's great that you went to so much trouble to empirically test these approaches, so it hurts to have to inform you that these measurements are inconclusive. Getting a meaningful microbenchmark of Java code is extremely, extremely difficult. For every one thing you can do right, there are a hundred things you can do wrong, that will skew the results enormously. As just a sample, you need to run the test repeatedly for a good ~10 seconds or so *before* you start measuring. Each measurement should measure "many" repetitions. Continue measuring until the result stabilizes....
      singulars
    2. CO... Measure only *one* thing per VM invocation (this is important). Run each measurement many times -- you'll be surprised how inconsistent the results can be. This only scratches the surface; if you do all these things and follow dozens more pieces of advice, your benchmark results will most likely, like mine, still be of dubious significance. Such is the state of affairs with microbenchmarks in Java.
      singulars
    3. CO@Kevin: that is great that you too are aware of theoretical considerations of microbenchmark difficulties in Java ;-). but what is your point in my particular situation? what is wrong with my code or with my conslusions? please give better code or better conclusions. you already know that my conclusions are wrong? have you seen warm up stage in my code (it is quite hidden but it is there). please give more details, code fixes or just your better conclusions for my tests...
      singulars
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload