
<p>Assuming list 1 has length N and list 2 has length M:</p> <p>I would filter first, since this is a cheap operation, O(N+M), and then do the intersection, a relatively expensive operation based on the current implementation. The cost of the Intersect call is complicated and is fundamentally driven by the behaviour of the hash function:</p> <ul> <li>If the hash function is poor then the performance can degrade to O(N*M) (as every string in one list is checked against every string in the other).</li> <li>If the hash function is good then each check is simply a lookup in a hash set, i.e. O(1), giving a cost of M lookups plus a cost of N to construct the set, so also O(N+M) in time but with an additional cost of O(N) in space. <ul> <li>The construction of the backing set will be the killer in performance terms.</li> <li>If you knew that both lists were, say, sorted already then you could write your own Intersect check with constant space overhead and O(N+M) worst-case running time, not to mention excellent memory locality.</li> </ul></li> </ul> <p>This leaves you with:</p> <pre><code>int count = Enumerable.Intersect( list1.Where(word =&gt; word.Length &gt; 3), list2.Where(word =&gt; word.Length &gt; 3)).Count(); </code></pre> <p>Incidentally the Enumerable.Intersect method's performance behaviour may change considerably depending on the order of the arguments.</p> <p>In most cases making the <em>smaller</em> of the two the first argument will produce faster, more memory-efficient code, since the first argument is used to construct a backing temporary (hash based) set. 
This is of course coding to a (hidden) implementation detail, so it should only be considered if performance analysis shows it to be an issue, and highlighted as a micro-optimization if so.</p> <p>The second filter on list2 is fundamentally unnecessary for correctness (since the Intersect will remove such entries anyway).</p> <p>It is quite possible that the following is faster:</p> <pre><code>int count = Enumerable.Intersect( list1.Where(word =&gt; word.Length &gt; 3), list2).Count(); </code></pre> <p>However, filtering by length is very cheap for long strings compared to calculating their hash codes. The better option will only be found through benchmarking with inputs appropriate to your usage.</p>
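<p>For reference, the sorted-lists approach mentioned above can be sketched as follows. This is a minimal sketch, not from the original answer; the method name is hypothetical, and it assumes both lists are already sorted ordinally and contain no duplicates:</p> <pre><code>// Counts common elements of two sorted, duplicate-free lists by
// walking both with two indices: constant extra space, O(N+M) time.
static int SortedIntersectCount(IList&lt;string&gt; a, IList&lt;string&gt; b)
{
    int i = 0, j = 0, count = 0;
    while (i &lt; a.Count &amp;&amp; j &lt; b.Count)
    {
        int cmp = string.CompareOrdinal(a[i], b[j]);
        if (cmp &lt; 0) i++;          // a[i] cannot be in b; advance a
        else if (cmp &gt; 0) j++;     // b[j] cannot be in a; advance b
        else { count++; i++; j++; } // match; advance both
    }
    return count;
}
</code></pre> <p>Because it allocates no backing set, memory access stays sequential over both lists, which is the locality benefit the answer alludes to.</p>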