Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    primarykey
    data
    text
    <p>There are several problems with your test.</p> <p>First, you are not testing set intersection, but "create a couple of arrays, fill them with random numbers, and then perform set intersection". You should only time the portion of the code you're actually interested in. Even if you're going to want to do those things, they should not be benchmarked here. Measure one thing at a time, to reduce uncertainty. If you want your C++ implementation to perform better, you first need to know which part of it is slower than expected. Which means you have to separate setup code from intersection test.</p> <p>Second, you should run the test a large number of times to take possible caching effects and other uncertainties into account. (And probably output one total time for, say, 1000 runs, rather than an individual time for each. That way you reduce the uncertainty from the timer which might have limited resolution and report inaccurate results when used in the 0-20ms range.</p> <p>Further, as far as I can read from the docs, the input to set_intersection should be sorted, which set2 won't be. An there seems to be no reason to use <code>unordered_map</code>, when <code>unordered_set</code> would be a far better match for what you're doing.</p> <p>About the setup code being needed, note that you probably <em>don't</em> need to populate vectors in order to run the intersection. Both your own implementation and <code>set_intersection</code> work on iterators already, so you can simply pass them a pair of iterators to the data structures your inputs are in already.</p> <p>A few more specific comments on your code:</p> <ul> <li>Use <code>++iterator</code> instead of <code>iterator++</code></li> <li>rather than calling vector.end() at each loop iteration, call it once and cache the result</li> <li>experiment with using sorted vectors vs std::set vs <code>unordered_set</code> (not <code>unordered_map</code>)</li> </ul> <p><strong>Edit:</strong></p> <p>I haven't tried your C# version, so I can't compare the numbers properly, but here's my modified test. Each is run 1000 times, on a Core 2 Quad 2.5GHz with 4GB RAM:</p> <pre><code>std::set_intersection on std::set: 2606ms std::set_intersection on tr1::unordered_set: 1014ms std::set_intersection on sorted vectors: 171ms std::set_intersection on unsorted vectors: 10140ms </code></pre> <p>The last one is a bit unfair, because it has to both copy and sort the vectors. Ideally, only the sort should be part of the benchmark. I tried creating a version that used an array of 1000 unsorted vectors (so I woudln't have to copy the unsorted data in each iteration), but the performance was about the same, or a bit worse, because this would cause constant cache misses, so I reverted back to this version</p> <p>And my code:</p> <pre><code>#define _SECURE_SCL 0 #include &lt;ctime&gt; #include &lt;vector&gt; #include &lt;set&gt; #include &lt;iostream&gt; #include &lt;algorithm&gt; #include &lt;unordered_set&gt; #include &lt;windows.h&gt; template &lt;typename T, typename OutIter&gt; void stl_intersect(const T&amp; set1, const T&amp; set2, OutIter out){ std::set_intersection(set1.begin(), set1.end(), set2.begin(), set2.end(), out); } template &lt;typename T, typename OutIter&gt; void sort_stl_intersect(T&amp; set1, T&amp; set2, OutIter out){ std::sort(set1.begin(), set1.end()); std::sort(set2.begin(), set2.end()); std::set_intersection(set1.begin(), set1.end(), set2.begin(), set2.end(), out); } template &lt;typename T&gt; void init_sorted_vec(T first, T last){ for ( T cur = first; cur != last; ++cur) { int i = cur - first; int value = 1000000000 + i; *cur = value; } } template &lt;typename T&gt; void init_unsorted_vec(T first, T last){ for ( T cur = first; cur != last; ++cur) { int i = rand() % 200000 + 1; i *= 10; int value = 1000000000 + i; *cur = value; } } struct resize_and_shuffle { resize_and_shuffle(int size) : size(size) {} void operator()(std::vector&lt;int&gt;&amp; vec){ vec.resize(size); } int size; }; int main() { srand ( time(NULL) ); std::vector&lt;int&gt; out(100000); std::vector&lt;int&gt; sortedvec1(100000); std::vector&lt;int&gt; sortedvec2(1000); init_sorted_vec(sortedvec1.begin(), sortedvec1.end()); init_unsorted_vec(sortedvec2.begin(), sortedvec2.end()); std::sort(sortedvec2.begin(), sortedvec2.end()); std::vector&lt;int&gt; unsortedvec1(sortedvec1.begin(), sortedvec1.end()); std::vector&lt;int&gt; unsortedvec2(sortedvec2.begin(), sortedvec2.end()); std::random_shuffle(unsortedvec1.begin(), unsortedvec1.end()); std::random_shuffle(unsortedvec2.begin(), unsortedvec2.end()); std::vector&lt;int&gt; vecs1[1000]; std::vector&lt;int&gt; vecs2[1000]; std::fill(vecs1, vecs1 + 1000, unsortedvec1); std::fill(vecs2, vecs2 + 1000, unsortedvec2); std::set&lt;int&gt; set1(sortedvec1.begin(), sortedvec1.end()); std::set&lt;int&gt; set2(sortedvec2.begin(), sortedvec2.end()); std::tr1::unordered_set&lt;int&gt; uset1(sortedvec1.begin(), sortedvec1.end()); std::tr1::unordered_set&lt;int&gt; uset2(sortedvec2.begin(), sortedvec2.end()); DWORD start, stop; DWORD delta[4]; start = GetTickCount(); for (int i = 0; i &lt; 1000; ++i){ stl_intersect(set1, set2, out.begin()); } stop = GetTickCount(); delta[0] = stop - start; start = GetTickCount(); for (int i = 0; i &lt; 1000; ++i){ stl_intersect(uset1, uset2, out.begin()); } stop = GetTickCount(); delta[1] = stop - start; start = GetTickCount(); for (int i = 0; i &lt; 1000; ++i){ stl_intersect(sortedvec1, sortedvec2, out.begin()); } stop = GetTickCount(); delta[2] = stop - start; start = GetTickCount(); for (int i = 0; i &lt; 1000; ++i){ sort_stl_intersect(vecs1[i], vecs1[i], out.begin()); } stop = GetTickCount(); delta[3] = stop - start; std::cout &lt;&lt; "std::set_intersection on std::set: " &lt;&lt; delta[0] &lt;&lt; "ms\n"; std::cout &lt;&lt; "std::set_intersection on tr1::unordered_set: " &lt;&lt; delta[1] &lt;&lt; "ms\n"; std::cout &lt;&lt; "std::set_intersection on sorted vectors: " &lt;&lt; delta[2] &lt;&lt; "ms\n"; std::cout &lt;&lt; "std::set_intersection on unsorted vectors: " &lt;&lt; delta[3] &lt;&lt; "ms\n"; return 0; } </code></pre> <p>There's no reason why C++ should always be faster than C#. C# has a few key advantages that require a lot of care to compete with in C++. The primary one I can think of is that dynamic allocations are ridiculously cheap in .NET-land. Every time a C++ vector, set or unordered_set (or any other container) has to resize or expand, it is a very costly <code>malloc</code> operation. In .NET, a heap allocation is little more than adding an offset to a pointer.</p> <p>So if you want the C++ version to compete, you'll probably have to solve that, allowing your containers to resize without having to perform actual heap allocations, probably by using custom allocators for the containers (perhaps boost::pool might be a good bet, or you can try rolling your own)</p> <p>Another issue is that <code>set_difference</code> only works on sorted input, and in order to reproduce tests results that involve a sort, we have to make a fresh copy of the unsorted data in each iteration, which is costly (although again, using custom allocators will help a lot). I don't know what form your input takes, but it is possible that you can sort your input directly, without copying it, and then run <code>set_difference</code> directly on that. (That would be easy to do if your input is an array or a STL container at least.)</p> <p>One of the key advantages of the STL is that it is so flexible, it can work on pretty much any input sequence. In C#, you pretty much have to copy the input to a List or Dictionary or something, but in C++, you might be able to get away with running <code>std::sort</code> and <code>set_intersection</code> on the raw input.</p> <p>Finally, of course, try running the code through a profiler and see exactly where the time is being spent. You might also want to try running the code through GCC instead. It's my impression that STL performance in MSVC is sometimes a bit quirky. It might be worth testing under another compiler just to see if you get similar timings there.</p> <p>Finally, you might find these blog posts relevant for performance of C++ vs C#: <a href="http://blogs.msdn.com/ricom/archive/2005/05/10/416151.aspx" rel="noreferrer">http://blogs.msdn.com/ricom/archive/2005/05/10/416151.aspx</a></p> <p>The morale of those is essentially that yes, you can get better performance in C++, but it is a surprising amount of work.</p>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. VO
      singulars
      1. This table or related slice is empty.
    2. VO
      singulars
      1. This table or related slice is empty.
    3. VO
      singulars
      1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload