<blockquote>
<p>I was wondering whether lambdas together with move semantics or any other new feature can do as good as ETs. Any thoughts?</p>
</blockquote>
<p><strong>Quick Answer</strong></p>
<p>Move semantics are not a panacea on their own: techniques such as expression templates (ETs) are still needed in C++11 to eliminate overheads such as moving data around. So, to answer your question quickly before diving into the details: move semantics and the other new features don't completely replace ETs, as the rest of this answer illustrates.</p>
<p><strong>Detailed Answer</strong></p>
<p>ETs typically return proxy objects to defer evaluation until later, so C++11 language features offer no immediately apparent benefit until the code that triggers the computation runs. That said, one would not want to write ET code that generates run-time code while the expression tree is being built with the proxies. Nicely, C++11's move semantics and perfect forwarding can help avoid such overheads should they otherwise occur. (This would not have been possible in C++03.)</p>
<p>Essentially, when writing ETs one wants to exploit the language features so that optimal code is generated once the member function(s) of the involved proxy objects are invoked. In C++11 this includes using perfect forwarding, preferring moves over copies, etc., if such is actually still needed over and above what the compiler can already do. The name of the game is to minimize the run-time code generated and/or maximize the run-time speed and/or minimize the run-time overhead.</p>
<p>I wanted to actually try some ETs with C++11 features to see if I could elide ALL intermediate temporary instances with an <code>a = b + c + d;</code> expression. (This was just a fun break from my normal activities, so I did not write, or compare against, ET code using only C++03. 
Also, I did not worry about polishing every aspect of the code that appears below.)</p>
<p>To start with, I did not use lambdas (I preferred to use explicit types and functions), so I won't argue for or against lambdas with respect to your question. My guess is that they would be similar to using functors and would perform no better than the non-ET code below (i.e., moves would be required), at least until compilers can automatically optimize lambdas using their own internal ETs. The code I wrote, however, exploits move semantics and perfect forwarding. Here's what I did, starting with the results and then presenting the code.</p>
<p>I created a <code>math_vector&lt;N&gt;</code> class, used here with <code>N==3</code>, that holds a private instance of <code>std::array&lt;long double, N&gt;</code>. Its members are a default constructor, copy and move constructors and assignments, an initializer list constructor, a destructor, a <code>swap()</code> member, <code>operator []</code> to access elements of the vector, and <code>operator +=</code>. 
Used without any expression templates, this code:</p>
<pre class="lang-cpp prettyprint-override"><code>{
  cout &lt;&lt; "CASE 1:\n";
  math_vector&lt;3&gt; a{1.0, 1.1, 1.2};
  math_vector&lt;3&gt; b{2.0, 2.1, 2.2};
  math_vector&lt;3&gt; c{3.0, 3.1, 3.2};
  math_vector&lt;3&gt; d{4.0, 4.1, 4.2};
  math_vector&lt;3&gt; result = a + b + c + d;
  cout &lt;&lt; '[' &lt;&lt; &amp;result &lt;&lt; "]: " &lt;&lt; result &lt;&lt; "\n";
}
</code></pre>
<p>outputs (when compiled with <code>clang++</code> 3.1 or <code>g++</code> 4.8 with <code>-std=c++11 -O3</code>):</p>
<pre class="lang-none prettyprint-override"><code>CASE 1:
0x7fff8d6edf50: math_vector(initlist)
0x7fff8d6edef0: math_vector(initlist)
0x7fff8d6ede90: math_vector(initlist)
0x7fff8d6ede30: math_vector(initlist)
0x7fff8d6edd70: math_vector(copy: 0x7fff8d6edf50)
0x7fff8d6edda0: math_vector(move: 0x7fff8d6edd70)
0x7fff8d6eddd0: math_vector(move: 0x7fff8d6edda0)
0x7fff8d6edda0: ~math_vector()
0x7fff8d6edd70: ~math_vector()
[0x7fff8d6eddd0]: (10,10.4,10.8)
0x7fff8d6eddd0: ~math_vector()
0x7fff8d6ede30: ~math_vector()
0x7fff8d6ede90: ~math_vector()
0x7fff8d6edef0: ~math_vector()
0x7fff8d6edf50: ~math_vector()
</code></pre>
<p>i.e., the four instances explicitly constructed from initializer lists (the <code>initlist</code> items), plus three objects made by copying and moving: two temporaries and the <code>result</code> variable itself (<code>0x7fff8d6eddd0</code>), which is move-constructed from the second temporary.</p>
<p>To focus only on temporaries and moving, I created a second case in which <code>result</code> is the only named variable and all the others are rvalues:</p>
<pre class="lang-cpp prettyprint-override"><code>{
  cout &lt;&lt; "CASE 2:\n";
  math_vector&lt;3&gt; result =
    math_vector&lt;3&gt;{1.0, 1.1, 1.2} +
    math_vector&lt;3&gt;{2.0, 2.1, 2.2} +
    math_vector&lt;3&gt;{3.0, 3.1, 3.2} +
    math_vector&lt;3&gt;{4.0, 4.1, 4.2}
  ;
  cout &lt;&lt; '[' &lt;&lt; &amp;result &lt;&lt; "]: " &lt;&lt; result &lt;&lt; "\n";
}
</code></pre>
<p>which outputs this (again when ETs are NOT 
used):</p>
<pre class="lang-none prettyprint-override"><code>CASE 2:
0x7fff8d6edcb0: math_vector(initlist)
0x7fff8d6edc50: math_vector(initlist)
0x7fff8d6edce0: math_vector(move: 0x7fff8d6edcb0)
0x7fff8d6edbf0: math_vector(initlist)
0x7fff8d6edd10: math_vector(move: 0x7fff8d6edce0)
0x7fff8d6edb90: math_vector(initlist)
0x7fff8d6edd40: math_vector(move: 0x7fff8d6edd10)
0x7fff8d6edb90: ~math_vector()
0x7fff8d6edd10: ~math_vector()
0x7fff8d6edbf0: ~math_vector()
0x7fff8d6edce0: ~math_vector()
0x7fff8d6edc50: ~math_vector()
0x7fff8d6edcb0: ~math_vector()
[0x7fff8d6edd40]: (10,10.4,10.8)
0x7fff8d6edd40: ~math_vector()
</code></pre>
<p>which is better: the only extra objects created are move-constructed ones.</p>
<p>But I wanted better: zero extra temporaries, with the generated code as good as if I had hard-coded the sum, subject to the one normal coding caveat that all explicitly instantiated objects would still be created (i.e., the four <code>initlist</code> constructors and <code>result</code>). To accomplish this I then added expression template code as follows:</p>
<ol>
<li>a proxy <code>math_vector_expr&lt;LeftExpr,BinaryOp,RightExpr&gt;</code> class was created to hold an expression not computed yet,</li>
<li>a proxy <code>plus_op</code> class was created to hold the addition operation,</li>
<li>a constructor was added to <code>math_vector</code> to accept a <code>math_vector_expr</code> object, and,</li>
<li>"starter" member functions were added to trigger the creation of the expression template.</li>
</ol>
<p>The results using ETs are wonderful: no extra temporaries in either case! 
The previous two cases above now output:</p>
<pre class="lang-none prettyprint-override"><code>CASE 1:
0x7fffe7180c60: math_vector(initlist)
0x7fffe7180c90: math_vector(initlist)
0x7fffe7180cc0: math_vector(initlist)
0x7fffe7180cf0: math_vector(initlist)
0x7fffe7180d20: math_vector(expr: 0x7fffe7180d90)
[0x7fffe7180d20]: (10,10.4,10.8)
0x7fffe7180d20: ~math_vector()
0x7fffe7180cf0: ~math_vector()
0x7fffe7180cc0: ~math_vector()
0x7fffe7180c90: ~math_vector()
0x7fffe7180c60: ~math_vector()

CASE 2:
0x7fffe7180dd0: math_vector(initlist)
0x7fffe7180e20: math_vector(initlist)
0x7fffe7180e70: math_vector(initlist)
0x7fffe7180eb0: math_vector(initlist)
0x7fffe7180d20: math_vector(expr: 0x7fffe7180dc0)
0x7fffe7180eb0: ~math_vector()
0x7fffe7180e70: ~math_vector()
0x7fffe7180e20: ~math_vector()
0x7fffe7180dd0: ~math_vector()
[0x7fffe7180d20]: (10,10.4,10.8)
0x7fffe7180d20: ~math_vector()
</code></pre>
<p>i.e., exactly 5 constructor calls and 5 destructor calls in each case. In fact, if you ask the compiler to generate the assembler code between the 4 <code>initlist</code> constructor calls and the outputting of <code>result</code>, you get this beautiful string of assembler code:</p>
<pre><code>fldt 128(%rsp)
leaq 128(%rsp), %rdi
leaq 80(%rsp), %rbp
fldt 176(%rsp)
faddp %st, %st(1)
fldt 224(%rsp)
faddp %st, %st(1)
fldt 272(%rsp)
faddp %st, %st(1)
fstpt 80(%rsp)
fldt 144(%rsp)
fldt 192(%rsp)
faddp %st, %st(1)
fldt 240(%rsp)
faddp %st, %st(1)
fldt 288(%rsp)
faddp %st, %st(1)
fstpt 96(%rsp)
fldt 160(%rsp)
fldt 208(%rsp)
faddp %st, %st(1)
fldt 256(%rsp)
faddp %st, %st(1)
fldt 304(%rsp)
faddp %st, %st(1)
fstpt 112(%rsp)
</code></pre>
<p>with <code>g++</code>; <code>clang++</code> outputs similar (even smaller) code. No function calls, etc., just a bunch of adds, which is EXACTLY what one wants!</p>
<p>The C++11 code to achieve this follows. 
Define <code>DONT_USE_EXPR_TEMPL</code> to disable ETs, or leave it undefined to use them.</p>
<pre class="lang-cpp prettyprint-override"><code>#include &lt;array&gt;
#include &lt;algorithm&gt;
#include &lt;cstddef&gt;
#include &lt;initializer_list&gt;
#include &lt;type_traits&gt;
#include &lt;iostream&gt;
#include &lt;utility&gt;

//#define DONT_USE_EXPR_TEMPL

//===========================================================================

template &lt;std::size_t N&gt; class math_vector;

// Proxy holding a not-yet-computed binary expression.
template &lt;
  typename LeftExpr,
  typename BinaryOp,
  typename RightExpr
&gt;
class math_vector_expr
{
  public:
    math_vector_expr() = delete;

    math_vector_expr(LeftExpr l, RightExpr r) :
      l_(std::forward&lt;LeftExpr&gt;(l)),
      r_(std::forward&lt;RightExpr&gt;(r))
    {
    }

    // Prohibit copying...
    math_vector_expr(math_vector_expr const&amp;) = delete;
    math_vector_expr&amp; operator =(math_vector_expr const&amp;) = delete;

    // Allow moves...
    math_vector_expr(math_vector_expr&amp;&amp;) = default;
    math_vector_expr&amp; operator =(math_vector_expr&amp;&amp;) = default;

    // Extends the expression tree; nothing is computed here.
    template &lt;typename RE&gt;
    auto operator +(RE&amp;&amp; re) const -&gt;
      math_vector_expr&lt;
        math_vector_expr&lt;LeftExpr,BinaryOp,RightExpr&gt; const&amp;,
        BinaryOp,
        decltype(std::forward&lt;RE&gt;(re))
      &gt;
    {
      return
        math_vector_expr&lt;
          math_vector_expr&lt;LeftExpr,BinaryOp,RightExpr&gt; const&amp;,
          BinaryOp,
          decltype(std::forward&lt;RE&gt;(re))
        &gt;(*this, std::forward&lt;RE&gt;(re))
      ;
    }

    auto le() -&gt;
      typename std::add_lvalue_reference&lt;LeftExpr&gt;::type
      { return l_; }

    auto le() const -&gt;
      typename std::add_lvalue_reference&lt;
        typename std::add_const&lt;LeftExpr&gt;::type
      &gt;::type
      { return l_; }

    auto re() -&gt;
      typename std::add_lvalue_reference&lt;RightExpr&gt;::type
      { return r_; }

    auto re() const -&gt;
      typename std::add_lvalue_reference&lt;
        typename std::add_const&lt;RightExpr&gt;::type
      &gt;::type
      { return r_; }

    // Evaluates one element of the expression on demand.
    auto operator [](std::size_t index) const -&gt;
      decltype(
        BinaryOp::apply(this-&gt;le()[index], this-&gt;re()[index])
      )
    {
      return BinaryOp::apply(le()[index], re()[index]);
    }

  private:
    LeftExpr l_;
    RightExpr r_;
};

//===========================================================================

template &lt;typename T&gt;
struct plus_op
{
  static T apply(T const&amp; a, T const&amp; b)
  {
    return a + b;
  }

  static T apply(T&amp;&amp; a, T const&amp; b)
  {
    a += b;
    return std::move(a);
  }

  static T apply(T const&amp; a, T&amp;&amp; b)
  {
    b += a;
    return std::move(b);
  }

  static T apply(T&amp;&amp; a, T&amp;&amp; b)
  {
    a += b;
    return std::move(a);
  }
};

//===========================================================================

template &lt;std::size_t N&gt;
class math_vector
{
  using impl_type = std::array&lt;long double, N&gt;;

  public:
    math_vector()
    {
      using namespace std;
      // Fill with a value-initialized element (not a whole array).
      fill(begin(v_), end(v_), typename impl_type::value_type{});
      std::cout &lt;&lt; this &lt;&lt; ": math_vector()" &lt;&lt; endl;
    }

    math_vector(math_vector const&amp; mv) noexcept
    {
      using namespace std;
      copy(begin(mv.v_), end(mv.v_), begin(v_));
      std::cout &lt;&lt; this &lt;&lt; ": math_vector(copy: " &lt;&lt; &amp;mv &lt;&lt; ")" &lt;&lt; endl;
    }

    math_vector(math_vector&amp;&amp; mv) noexcept
    {
      using namespace std;
      move(begin(mv.v_), end(mv.v_), begin(v_));
      std::cout &lt;&lt; this &lt;&lt; ": math_vector(move: " &lt;&lt; &amp;mv &lt;&lt; ")" &lt;&lt; endl;
    }

    math_vector(std::initializer_list&lt;typename impl_type::value_type&gt; l)
    {
      using namespace std;
      copy(begin(l), end(l), begin(v_));
      std::cout &lt;&lt; this &lt;&lt; ": math_vector(initlist)" &lt;&lt; endl;
    }

    math_vector&amp; operator =(math_vector const&amp; mv) noexcept
    {
      using namespace std;
      copy(begin(mv.v_), end(mv.v_), begin(v_));
      std::cout &lt;&lt; this &lt;&lt; ": math_vector op =(copy: " &lt;&lt; &amp;mv &lt;&lt; ")" &lt;&lt; endl;
      return *this;
    }

    math_vector&amp; operator =(math_vector&amp;&amp; mv) noexcept
    {
      using namespace std;
      move(begin(mv.v_), end(mv.v_), begin(v_));
      std::cout &lt;&lt; this &lt;&lt; ": math_vector op =(move: " &lt;&lt; &amp;mv &lt;&lt; ")" &lt;&lt; endl;
      return *this;
    }

    ~math_vector()
    {
      using namespace std;
      std::cout &lt;&lt; this &lt;&lt; ": ~math_vector()" &lt;&lt; endl;
    }

    void swap(math_vector&amp; mv)
    {
      // Qualified call: an unqualified swap() here would find this
      // one-argument member before std::swap.
      for (std::size_t i = 0; i&lt;N; ++i)
        std::swap(v_[i], mv[i]);
    }

    auto operator [](std::size_t index) const
      -&gt; typename impl_type::value_type const&amp;
    {
      return v_[index];
    }

    auto operator [](std::size_t index)
      -&gt; typename impl_type::value_type&amp;
    {
      return v_[index];
    }

    math_vector&amp; operator +=(math_vector const&amp; b)
    {
      for (std::size_t i = 0; i&lt;N; ++i)
        v_[i] += b[i];
      return *this;
    }

  #ifndef DONT_USE_EXPR_TEMPL

    // Evaluates the whole expression tree in a single loop.
    template &lt;typename LE, typename Op, typename RE&gt;
    math_vector(math_vector_expr&lt;LE,Op,RE&gt;&amp;&amp; mve)
    {
      for (std::size_t i = 0; i &lt; N; ++i)
        v_[i] = mve[i];
      std::cout &lt;&lt; this &lt;&lt; ": math_vector(expr: " &lt;&lt; &amp;mve &lt;&lt; ")" &lt;&lt; std::endl;
    }

    template &lt;typename RightExpr&gt;
    math_vector&amp; operator =(RightExpr&amp;&amp; re)
    {
      for (std::size_t i = 0; i&lt;N; ++i)
        v_[i] = re[i];
      return *this;
    }

    template &lt;typename RightExpr&gt;
    math_vector&amp; operator +=(RightExpr&amp;&amp; re)
    {
      for (std::size_t i = 0; i&lt;N; ++i)
        v_[i] += re[i];
      return *this;
    }

    // "Starter" that begins an expression template.
    template &lt;typename RightExpr&gt;
    auto operator +(RightExpr&amp;&amp; re) const -&gt;
      math_vector_expr&lt;
        math_vector const&amp;,
        plus_op&lt;typename impl_type::value_type&gt;,
        decltype(std::forward&lt;RightExpr&gt;(re))
      &gt;
    {
      return
        math_vector_expr&lt;
          math_vector const&amp;,
          plus_op&lt;typename impl_type::value_type&gt;,
          decltype(std::forward&lt;RightExpr&gt;(re))
        &gt;(
          *this,
          std::forward&lt;RightExpr&gt;(re)
        )
      ;
    }

  #endif // #ifndef DONT_USE_EXPR_TEMPL

  private:
    impl_type v_;
};

//===========================================================================

template &lt;std::size_t N&gt;
inline void swap(math_vector&lt;N&gt;&amp; a, math_vector&lt;N&gt;&amp; b)
{
  a.swap(b);
}

//===========================================================================

#ifdef DONT_USE_EXPR_TEMPL

template &lt;std::size_t N&gt;
inline math_vector&lt;N&gt; operator +(
  math_vector&lt;N&gt; const&amp; a,
  math_vector&lt;N&gt; const&amp; b
)
{
  math_vector&lt;N&gt; retval(a);
  retval += b;
  return retval;
}

template &lt;std::size_t N&gt;
inline math_vector&lt;N&gt; operator +(
  math_vector&lt;N&gt;&amp;&amp; a,
  math_vector&lt;N&gt; const&amp; b
)
{
  a += b;
  return std::move(a);
}

template &lt;std::size_t N&gt;
inline math_vector&lt;N&gt; operator +(
  math_vector&lt;N&gt; const&amp; a,
  math_vector&lt;N&gt;&amp;&amp; b
)
{
  b += a;
  return std::move(b);
}

template &lt;std::size_t N&gt;
inline math_vector&lt;N&gt; operator +(
  math_vector&lt;N&gt;&amp;&amp; a,
  math_vector&lt;N&gt;&amp;&amp; b
)
{
  a += std::move(b);
  return std::move(a);
}

#endif // #ifdef DONT_USE_EXPR_TEMPL

//===========================================================================

template &lt;std::size_t N&gt;
std::ostream&amp; operator &lt;&lt;(std::ostream&amp; os, math_vector&lt;N&gt; const&amp; mv)
{
  os &lt;&lt; '(';
  for (std::size_t i = 0; i &lt; N; ++i)
    os &lt;&lt; mv[i] &lt;&lt; ((i+1 != N) ? ',' : ')');
  return os;
}

//===========================================================================

int main()
{
  using namespace std;

  try
  {
    {
      cout &lt;&lt; "CASE 1:\n";
      math_vector&lt;3&gt; a{1.0, 1.1, 1.2};
      math_vector&lt;3&gt; b{2.0, 2.1, 2.2};
      math_vector&lt;3&gt; c{3.0, 3.1, 3.2};
      math_vector&lt;3&gt; d{4.0, 4.1, 4.2};
      math_vector&lt;3&gt; result = a + b + c + d;
      cout &lt;&lt; '[' &lt;&lt; &amp;result &lt;&lt; "]: " &lt;&lt; result &lt;&lt; "\n";
    }
    cout &lt;&lt; endl;
    {
      cout &lt;&lt; "CASE 2:\n";
      math_vector&lt;3&gt; result =
        math_vector&lt;3&gt;{1.0, 1.1, 1.2} +
        math_vector&lt;3&gt;{2.0, 2.1, 2.2} +
        math_vector&lt;3&gt;{3.0, 3.1, 3.2} +
        math_vector&lt;3&gt;{4.0, 4.1, 4.2}
      ;
      cout &lt;&lt; '[' &lt;&lt; &amp;result &lt;&lt; "]: " &lt;&lt; result &lt;&lt; "\n";
    }
  }
  catch (...)
  {
    return 1;
  }
}

//===========================================================================
</code></pre>
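<p>For readers who want the core idea without the instrumentation, here is a much smaller sketch of the same technique. It is not the code above: <code>vec3</code>, <code>sum_expr</code> and the <code>constructed</code> counter are hypothetical names chosen for illustration, the element type is <code>double</code>, and operands are held by plain const references instead of being perfectly forwarded, so an expression must be consumed within the full expression that builds it.</p>

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Proxy node (hypothetical name): remembers its operands by reference
// and only adds them when an element is requested.
template <typename L, typename R>
struct sum_expr {
    const L& lhs;
    const R& rhs;
    double operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
};

// Minimal fixed-size vector (hypothetical, not the answer's math_vector).
struct vec3 {
    std::array<double, 3> v{};
    static int constructed;  // counts every vec3 constructed, copies included

    vec3(double x, double y, double z) : v{{x, y, z}} { ++constructed; }
    vec3(const vec3& o) : v(o.v) { ++constructed; }

    // The only place where the sum is actually computed: one loop,
    // no intermediate vec3 objects. Explicit, so a sum_expr never
    // silently converts to a vec3 in the middle of an expression.
    template <typename L, typename R>
    explicit vec3(const sum_expr<L, R>& e) {
        ++constructed;
        for (std::size_t i = 0; i < 3; ++i) v[i] = e[i];
    }

    double operator[](std::size_t i) const {
        assert(i < 3);
        return v[i];
    }
};

int vec3::constructed = 0;

// "Starter": vec3 + vec3 builds a proxy instead of a new vec3.
inline sum_expr<vec3, vec3> operator+(const vec3& a, const vec3& b) {
    return {a, b};
}

// Growing the tree: (expression) + vec3 wraps the existing proxy.
template <typename L, typename R>
sum_expr<sum_expr<L, R>, vec3> operator+(const sum_expr<L, R>& a, const vec3& b) {
    return {a, b};
}
```

<p>With three hand-constructed operands <code>a</code>, <code>b</code> and <code>c</code>, writing <code>vec3 r(a + b + c);</code> should leave <code>vec3::constructed</code> at 4: the three operands plus <code>r</code>. The price of this simplicity is that storing the expression (for example with <code>auto e = a + b + c;</code>) would leave dangling references to the inner proxy, so this sketch is only safe when the expression is evaluated immediately.</p>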
 
