<p>As my original answer was generally derided, I've had another go.</p>

<pre class="lang-cs prettyprint-override"><code>using System;
using System.Collections.Generic;
using System.Diagnostics;

Int32 maxUniques = 1;
Int32 collectionSize = 100000000;
Random rand = new Random();

while (maxUniques &lt;= collectionSize)
{
    // Fill the list with random values drawn from [0, maxUniques).
    List&lt;Int32&gt; bigCollection = new List&lt;Int32&gt;();
    bigCollection.Capacity = collectionSize;
    for (Int32 count = 0; count &lt; collectionSize; count++)
        bigCollection.Add(rand.Next(maxUniques));

    // Conditional add: call Contains() first, Add() only when the value is absent.
    HashSet&lt;Int32&gt; uniqueSources = new HashSet&lt;Int32&gt;();
    Stopwatch watch = new Stopwatch();
    watch.Start();
    foreach (Int32 num in bigCollection)
    {
        if (!uniqueSources.Contains(num))
            uniqueSources.Add(num);
    }
    Console.WriteLine(String.Format("With {0,10:N0} unique values in a set of {1,10:N0} values, the time taken for conditional add: {2,6:N0} ms", uniqueSources.Count, collectionSize, watch.ElapsedMilliseconds));

    // Simple add: call Add() for every value and let the set reject duplicates.
    uniqueSources = new HashSet&lt;Int32&gt;();
    watch.Restart();
    foreach (Int32 num in bigCollection)
    {
        uniqueSources.Add(num);
    }
    Console.WriteLine(String.Format("With {0,10:N0} unique values in a set of {1,10:N0} values, the time taken for simple add: {2,6:N0} ms", uniqueSources.Count, collectionSize, watch.ElapsedMilliseconds));

    Console.WriteLine();
    maxUniques *= 10;
}
</code></pre>

<p>This gave the following output:</p>

<blockquote>
<p>With 1 unique values in a set of 100,000,000 values, the time taken for conditional add: 2,004 ms<br>
With 1 unique values in a set of 100,000,000 values, the time taken for simple add: 2,540 ms</p>

<p>With 10 unique values in a set of 100,000,000 values, the time taken for conditional add: 2,066 ms<br>
With 10 unique values in a set of 100,000,000 values, the time taken for simple add: 2,391 ms</p>

<p>With 100 unique values in a set of 100,000,000 values, the time taken for conditional add: 2,057 ms<br>
With 100 unique values in a set of 100,000,000 values, the time taken for simple add: 2,410 ms</p>

<p>With 1,000 unique values in a set of 100,000,000 values, the time taken for conditional add: 2,011 ms<br>
With 1,000 unique values in a set of 100,000,000 values, the time taken for simple add: 2,459 ms</p>

<p>With 10,000 unique values in a set of 100,000,000 values, the time taken for conditional add: 2,219 ms<br>
With 10,000 unique values in a set of 100,000,000 values, the time taken for simple add: 2,414 ms</p>

<p>With 100,000 unique values in a set of 100,000,000 values, the time taken for conditional add: 3,024 ms<br>
With 100,000 unique values in a set of 100,000,000 values, the time taken for simple add: 3,124 ms</p>

<p>With 1,000,000 unique values in a set of 100,000,000 values, the time taken for conditional add: 8,937 ms<br>
With 1,000,000 unique values in a set of 100,000,000 values, the time taken for simple add: 9,310 ms</p>

<p>With 9,999,536 unique values in a set of 100,000,000 values, the time taken for conditional add: 11,798 ms<br>
With 9,999,536 unique values in a set of 100,000,000 values, the time taken for simple add: 11,660 ms</p>

<p>With 63,199,938 unique values in a set of 100,000,000 values, the time taken for conditional add: 20,847 ms<br>
With 63,199,938 unique values in a set of 100,000,000 values, the time taken for simple add: 20,213 ms</p>
</blockquote>

<p>This is curious to me.</p>

<p>When up to 1% of the values are unique (so at most 1% of the calls actually add anything), it is faster to call Contains() first than to call Add() every time. At 10% and 63% unique values, it was marginally faster to just call Add().</p>

<p>To put it another way:<br>
100 million Contains() calls are faster than 99 million redundant Add() calls<br>
100 million Contains() calls are slower than 90 million redundant Add() calls</p>

<p>I adjusted the code to try 1 million to 10 million unique values in 1 million increments and found that the crossover point is somewhere around 7-10%, although the results weren't conclusive.</p>

<p>So if you expect fewer than about 7% of the values to be added, it's faster to call Contains() first. Above that, just call Add().</p>
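<p>For reference, here is a minimal sketch (separate from the benchmark; the class name and sample values are made up for illustration) showing that the two loops build the same set. HashSet&lt;T&gt;.Add() returns false and leaves the set unchanged when the value is already present, so the choice between the two approaches only affects speed, not the result.</p>

<pre class="lang-cs prettyprint-override"><code>using System;
using System.Collections.Generic;

class AddVersusContainsDemo // hypothetical name, illustration only
{
    static void Main()
    {
        int[] values = { 3, 1, 3, 2, 1 }; // made-up sample data

        // Conditional add: look up first, insert only when absent.
        var conditional = new HashSet&lt;int&gt;();
        foreach (int v in values)
        {
            if (!conditional.Contains(v))
                conditional.Add(v);
        }

        // Simple add: Add() returns false for a duplicate and changes nothing.
        var simple = new HashSet&lt;int&gt;();
        int rejected = 0;
        foreach (int v in values)
        {
            if (!simple.Add(v))
                rejected++;
        }

        Console.WriteLine(conditional.SetEquals(simple)); // True
        Console.WriteLine(rejected);                      // 2
    }
}
</code></pre>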
 
