The cost of passing by shared_ptr
<p>I use std::tr1::shared_ptr extensively throughout my application. This includes passing objects in as function arguments. Consider the following:</p>
<pre><code>class Dataset {...};

void f( shared_ptr&lt; Dataset const &gt; pds ) {...}
void g( shared_ptr&lt; Dataset const &gt; pds ) {...}
...
</code></pre>
<p>While passing a dataset object around via shared_ptr guarantees its existence inside f and g, the functions may be called millions of times, which causes a lot of shared_ptr objects to be created and destroyed. Here's a snippet of the flat gprof profile from a recent run:</p>
<pre>
Each sample counts as 0.01 seconds.
  %   cumulative   self                 self     total
 time   seconds   seconds      calls   s/call   s/call  name
  9.74    295.39    35.12 2451177304     0.00     0.00  std::tr1::__shared_count::__shared_count(std::tr1::__shared_count const&)
  8.03    324.34    28.95 2451252116     0.00     0.00  std::tr1::__shared_count::~__shared_count()
</pre>
<p>So, ~17% of the runtime was spent on reference counting with shared_ptr objects. Is this normal?</p>
<p>A large portion of my application is single-threaded and I was thinking about rewriting some of the functions as</p>
<pre><code>void f( const Dataset&amp; ds ) {...}
</code></pre>
<p>and replacing the calls</p>
<pre><code>shared_ptr&lt; Dataset &gt; pds( new Dataset(...) );
f( pds );
</code></pre>
<p>with</p>
<pre><code>f( *pds );
</code></pre>
<p>in places where I know for sure the object will not get destroyed while the flow of the program is inside f(). But before I run off to change a bunch of function signatures and calls, I wanted to know what the typical performance hit of passing by shared_ptr is. It seems like shared_ptr should not be passed by value to functions that get called very often.</p>
<p>Any input would be appreciated. Thanks for reading.</p>
<p>-Artem</p>
<p><strong>Update:</strong> After changing a handful of functions to accept <code>const Dataset&amp;</code>, the new profile looks like this:</p>
<pre>
Each sample counts as 0.01 seconds.
  %   cumulative   self               self     total
 time   seconds   seconds    calls   s/call   s/call  name
  0.15    241.62     0.37 24981902     0.00     0.00  std::tr1::__shared_count::~__shared_count()
  0.12    241.91     0.30 28342376     0.00     0.00  std::tr1::__shared_count::__shared_count(std::tr1::__shared_count const&)
</pre>
<p>I'm a little puzzled by the number of destructor calls being smaller than the number of copy constructor calls, but overall I'm very pleased with the decrease in the associated run-time. Thanks to all for their advice.</p>
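<p>The difference between the calling conventions discussed above can be observed through <code>use_count()</code>. The following is a minimal sketch, not the application's actual code: it assumes C++11 <code>std::shared_ptr</code> from <code>&lt;memory&gt;</code> rather than <code>std::tr1::shared_ptr</code>, and the <code>Dataset</code> member and all function names are hypothetical stand-ins.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical stand-in for the Dataset class in the question.
struct Dataset {
    explicit Dataset(std::size_t n) : size(n) {}
    std::size_t size;
};

// By value: the parameter is its own shared_ptr, so the reference count
// is incremented on entry and decremented again on exit.
long count_seen_by_value(std::shared_ptr<const Dataset> pds) {
    return pds.use_count();
}

// By const reference to the shared_ptr: no copy is made, so the count
// is untouched (as long as the argument already has exactly this type).
long count_seen_by_const_ref(const std::shared_ptr<const Dataset>& pds) {
    return pds.use_count();
}

// By const reference to the object: the callee never sees a shared_ptr.
// The caller must keep the object alive for the duration of the call.
std::size_t read_by_object_ref(const Dataset& ds) {
    return ds.size;
}

// Usage example: one owner, three ways of handing the object to a callee.
void demo() {
    std::shared_ptr<const Dataset> pds = std::make_shared<Dataset>(42);
    assert(pds.use_count() == 1);
    assert(count_seen_by_value(pds) == 2);      // temporary second owner
    assert(count_seen_by_const_ref(pds) == 1);  // no extra owner
    assert(read_by_object_ref(*pds) == 42);     // no shared_ptr involved
    assert(pds.use_count() == 1);               // back to a single owner
}
```

<p>One caveat with the const-reference variant: if the caller holds a <code>shared_ptr&lt;Dataset&gt;</code> while the parameter is <code>const shared_ptr&lt;const Dataset&gt;&amp;</code>, the conversion creates a temporary <code>shared_ptr</code>, so the count is still incremented and decremented on each call. Passing <code>const Dataset&amp;</code> avoids that regardless of the pointer type the caller holds.</p>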