Note that there are some explanatory texts on larger screens.

plurals
  1. PORemove from a HashSet failing after iterating over it
    primarykey
    data
    text
    <p>I'm writing an agglomerative clustering algorithm in java and having trouble with a remove operation. It seems to always fail when the number of clusters reaches half the initial number.</p> <p>In the sample code below, <code>clusters</code> is a <code>Collection&lt;Collection&lt;Integer&gt;&gt;</code>.</p> <pre><code> while(clusters.size() &gt; K){ // determine smallest distance between clusters Collection&lt;Integer&gt; minclust1 = null; Collection&lt;Integer&gt; minclust2 = null; double mindist = Double.POSITIVE_INFINITY; for(Collection&lt;Integer&gt; cluster1 : clusters){ for(Collection&lt;Integer&gt; cluster2 : clusters){ if( cluster1 != cluster2 &amp;&amp; getDistance(cluster1, cluster2) &lt; mindist){ minclust1 = cluster1; minclust2 = cluster2; mindist = getDistance(cluster1, cluster2); } } } // merge the two clusters minclust1.addAll(minclust2); clusters.remove(minclust2); } </code></pre> <p>After a few runs through the loop, <code>clusters.remove(minclust2)</code> eventually returns false, but I don't understand why.</p> <p>I tested this code by first creating 10 clusters, each with one integer from 1 to 10. Distances are random numbers between 0 and 1. Here's the output after adding a few println statements. After the number of clusters, I print out the actual clusters, the merge operation, and the result of clusters.remove(minclust2).</p> <pre><code>Clustering: 10 clusters [[3], [1], [10], [5], [9], [7], [2], [4], [6], [8]] [5] &lt;- [6] true Clustering: 9 clusters [[3], [1], [10], [5, 6], [9], [7], [2], [4], [8]] [7] &lt;- [8] true Clustering: 8 clusters [[3], [1], [10], [5, 6], [9], [7, 8], [2], [4]] [10] &lt;- [9] true Clustering: 7 clusters [[3], [1], [10, 9], [5, 6], [7, 8], [2], [4]] [5, 6] &lt;- [4] true Clustering: 6 clusters [[3], [1], [10, 9], [5, 6, 4], [7, 8], [2]] [3] &lt;- [2] true Clustering: 5 clusters [[3, 2], [1], [10, 9], [5, 6, 4], [7, 8]] [10, 9] &lt;- [5, 6, 4] false Clustering: 5 clusters [[3, 2], [1], [10, 9, 5, 6, 4], [5, 6, 4], [7, 8]] [10, 9, 5, 6, 4] &lt;- [5, 6, 4] false Clustering: 5 clusters [[3, 2], [1], [10, 9, 5, 6, 4, 5, 6, 4], [5, 6, 4], [7, 8]] [10, 9, 5, 6, 4, 5, 6, 4] &lt;- [5, 6, 4] false </code></pre> <p>The the [10, 9, 5, 6, 4, 5, 6, 4, ...] set just grows infinitely from there. </p> <p><strong>Edit: to clarify, I'm using a <code>HashSet&lt;Integer&gt;</code> for each cluster in clusters (a <code>HashSet&lt;HashSet&lt;Integer&gt;&gt;)</code>.</strong></p>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload