Why are Where and Select outperforming just Select?
<p>I have a class, like this:</p>
<pre><code>public class MyClass
{
    public int Value { get; set; }
    public bool IsValid { get; set; }
}
</code></pre>
<p><sub>In actual fact it's much larger, but this recreates the problem (weirdness).</sub></p>
<p>I want to get the sum of the <code>Value</code>, where the instance is valid. So far, I've found two solutions to this.</p>
<h2>The first one is this:</h2>
<pre><code>int result = myCollection.Where(mc =&gt; mc.IsValid).Select(mc =&gt; mc.Value).Sum(); </code></pre>
<h2>The second one, however, is this:</h2>
<pre><code>int result = myCollection.Select(mc =&gt; mc.IsValid ? mc.Value : 0).Sum(); </code></pre>
<p>I want the more efficient of the two. I, at first, thought that the second one would be more efficient. Then the theoretical part of me started going "Well, one is O(n + m + m), the other one is O(n + n). The first one should perform better with more invalids, while the second one should perform better with fewer". I thought that they would perform equally. EDIT: And then @Martin pointed out that the Where and the Select were combined, so it should actually be O(m + n). However, if you look below, it seems like this is not related.</p>
<hr>
<h1><a href="https://gist.github.com/anonymous/68fc3b49478ee2848a27" rel="noreferrer">So I put it to the test.</a></h1>
<p><sub>(It's 100+ lines, so I thought it was better to post it as a Gist.)</sub><br> The results were... interesting.</p>
<h2><sub>With 0% tie tolerance:</sub></h2>
<p>The scales are in favour of <code>Select</code> and <code>Where</code>, by about 30 points.</p>
<p><code> How much do you want to be the disambiguation percentage?<br> 0<br> Starting benchmarking.<br> Ties: 0<br> Where + Select: 65<br> Select: 36<br> </code></p>
<h2><sub>With 2% tie tolerance:</sub></h2>
<p>It's the same, except that for some, the two were within 2% of each other. I'd say that's a minimum margin of error.
<code>Select</code> and <code>Where</code> now have just a ~20-point lead.</p>
<p><code> How much do you want to be the disambiguation percentage?<br> 2<br> Starting benchmarking.<br> Ties: 6<br> Where + Select: 58<br> Select: 37<br> </code></p>
<h2><sub>With 5% tie tolerance:</sub></h2>
<p>This is what I'd say to be my maximum margin of error. It makes it a bit better for the <code>Select</code>, but not much.</p>
<p><code> How much do you want to be the disambiguation percentage?<br> 5<br> Starting benchmarking.<br> Ties: 17<br> Where + Select: 53<br> Select: 31<br> </code></p>
<h2><sub>With 10% tie tolerance:</sub></h2>
<p>This is way out of my margin of error, but I'm still interested in the result, because it gives the <code>Select</code> and <code>Where</code> the twenty-point lead it's had for a while now.</p>
<p><code> How much do you want to be the disambiguation percentage?<br> 10<br> Starting benchmarking.<br> Ties: 36<br> Where + Select: 44<br> Select: 21<br> </code></p>
<h2><sub>With 25% tie tolerance:</sub></h2>
<p>This is way, <strong>way</strong> out of my margin of error, but I'm still interested in the result, because the <code>Select</code> and <code>Where</code> <strong>still</strong> (nearly) keep their 20-point lead. It seems like it's outclassing it in a distinct few, and that's what's giving it the lead.</p>
<p><code> How much do you want to be the disambiguation percentage?<br> 25<br> Starting benchmarking.<br> Ties: 85<br> Where + Select: 16<br> Select: 0<br> </code></p>
<hr>
<p>Now, I'm guessing that the 20-point lead came from the middle, where they're both bound to get <strong>around</strong> the same performance. I could try and log it, but it would be a whole load of information to take in. A graph would be better, I guess.
</p> <p>So that's what I did.</p> <p><img src="https://i.stack.imgur.com/zQhQS.png" alt="Select vs Select and Where."></p> <p>It shows that the <code>Select</code> line keeps steady (expected) and that the <code>Select + Where</code> line climbs up (expected). However, what puzzles me is why it doesn't meet with the <code>Select</code> at 50 or earlier: in fact I was expecting earlier than 50, as an extra enumerator had to be created for the <code>Select</code> and <code>Where</code>. I mean, this shows the 20-point lead, but it doesn't explain why. This, I guess, is the main point of my question.</p> <h1>Why does it behave like this? Should I trust it? If not, should I use the other one or this one?</h1> <hr> <p>As @KingKong mentioned in the comments, you can also use <code>Sum</code>'s overload that takes a lambda. So my two options are now changed to this:</p> <h2>First:</h2> <pre><code>int result = myCollection.Where(mc =&gt; mc.IsValid).Sum(mc =&gt; mc.Value); </code></pre> <h2>Second:</h2> <pre><code>int result = myCollection.Sum(mc =&gt; mc.IsValid ? 
mc.Value : 0); </code></pre> <p>I'm going to make it a bit shorter, but:</p> <p><code> How much do you want to be the disambiguation percentage?<br> 0<br> Starting benchmarking.<br> Ties: 0<br> Where: 60<br> Sum: 41<br> How much do you want to be the disambiguation percentage?<br> 2<br> Starting benchmarking.<br> Ties: 8<br> Where: 55<br> Sum: 38<br> How much do you want to be the disambiguation percentage?<br> 5<br> Starting benchmarking.<br> Ties: 21<br> Where: 49<br> Sum: 31<br> How much do you want to be the disambiguation percentage?<br> 10<br> Starting benchmarking.<br> Ties: 39<br> Where: 41<br> Sum: 21<br> How much do you want to be the disambiguation percentage?<br> 25<br> Starting benchmarking.<br> Ties: 85<br> Where: 16<br> Sum: 0<br> </code></p> <p>The twenty-point lead is still there, meaning it doesn't have to do with the <code>Where</code> and <code>Select</code> combination pointed out by @Marcin in the comments.</p> <p><sub>Thanks for reading through my wall of text! Also, if you're interested, <a href="https://gist.github.com/anonymous/0adf47e3c6592f592a2c" rel="noreferrer">here's</a> the modified version that logs the CSV that Excel takes in.</sub></p>
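<p>For readers who would rather not open the Gist, the kind of measurement described above can be sketched roughly as follows. This is <strong>not</strong> the code from the Gist: the <code>TimeBest</code> helper, the collection size, the fixed random seed, the 25% steps and the choice to keep the fastest of ten runs are all assumptions made for illustration. It only shows one way to compare the two queries at different proportions of valid items.</p>
<pre><code>using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public class MyClass
{
    public int Value { get; set; }
    public bool IsValid { get; set; }
}

public static class Program
{
    // Hypothetical helper (not from the Gist): runs the query several times
    // and returns the fastest run, in Stopwatch ticks.
    private static long TimeBest(Func&lt;int&gt; query, int runs)
    {
        long best = long.MaxValue;
        for (int i = 0; i &lt; runs; i++)
        {
            Stopwatch sw = Stopwatch.StartNew();
            query();
            sw.Stop();
            best = Math.Min(best, sw.ElapsedTicks);
        }
        return best;
    }

    public static void Main()
    {
        // Assumed workload: one million items, fixed seed for repeatability.
        var random = new Random(42);

        for (int validPercent = 0; validPercent &lt;= 100; validPercent += 25)
        {
            List&lt;MyClass&gt; items = Enumerable.Range(0, 1000000)
                .Select(i =&gt; new MyClass
                {
                    Value = i % 100, // small values, so the int sum cannot overflow
                    IsValid = random.Next(100) &lt; validPercent
                })
                .ToList();

            // Both queries must agree before their timings are worth comparing.
            int viaWhere = items.Where(mc =&gt; mc.IsValid).Sum(mc =&gt; mc.Value);
            int viaSum = items.Sum(mc =&gt; mc.IsValid ? mc.Value : 0);
            if (viaWhere != viaSum)
            {
                throw new InvalidOperationException("The two queries disagree.");
            }

            long whereTicks = TimeBest(() =&gt; items.Where(mc =&gt; mc.IsValid).Sum(mc =&gt; mc.Value), 10);
            long sumTicks = TimeBest(() =&gt; items.Sum(mc =&gt; mc.IsValid ? mc.Value : 0), 10);

            Console.WriteLine("{0}% valid: Where + Sum {1} ticks, Sum {2} ticks",
                validPercent, whereTicks, sumTicks);
        }
    }
}
</code></pre>
<p>Timings from a sketch like this depend heavily on the runtime, build configuration and hardware, so treat any single run as indicative at best.</p>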
 
