Does the WKWYL optimization in make_shared<>() introduce a penalty for some multithreading applications?
    <p>A few days ago I happened to watch <a href="http://channel9.msdn.com/Events/GoingNative/GoingNative-2012/STL11-Magic-Secrets">this very interesting presentation</a> by Stephan T. Lavavej, which mentions the "<em>We Know Where You Live</em>" optimization (sorry for using the acronym in the question title; SO warned me the question might be closed otherwise), and <a href="http://www.youtube.com/watch?v=L7zSU9HI-6I&amp;feature=gv">this beautiful one</a> by Herb Sutter on machine architecture.</p>
    <p>Briefly, the "<em>We Know Where You Live</em>" optimization consists of placing the reference counters in the same memory block as the object which <code>make_shared</code> is creating, resulting in a single memory allocation rather than two and making <code>shared_ptr</code> more compact.</p>
    <p>After putting together what I learnt from the two presentations above, however, I started to wonder whether the WKWYL optimization could <strong>degrade performance</strong> when a <code>shared_ptr</code> is accessed by multiple threads <em>running on different cores</em>.</p>
    <p>If the reference counters are <em>close</em> to the actual object in memory, they are more likely to be fetched into the <strong>same cache line</strong> as the object itself. This in turn, if I understood the lesson correctly, makes it more likely that threads will slow down while competing for the same cache line even when they do not need to.</p>
    <p>Suppose <strong>one</strong> of the threads needs to <strong>update the reference counter</strong> several times (e.g. when copying the <code>shared_ptr</code> around), while <strong>the other ones just need to access the pointed-to object</strong>: isn't this going to slow down the execution of <strong>all</strong> threads by making them compete for the same cache line?</p>
    <p><strong>If the refcount lived somewhere else in memory, I would say contention would be less likely to arise</strong>.</p>
    <p>Does this make a good argument against using <code>make_shared()</code> in such cases (as long as it implements the WKWYL optimization, of course)? Or is there a fallacy in my reasoning?</p>
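    <p>To make the "one allocation rather than two" part concrete, here is a minimal sketch that counts calls to the global <code>operator new</code> for both ways of creating a <code>shared_ptr</code>. The names <code>Payload</code>, <code>g_allocations</code>, <code>count_allocations</code>, <code>allocations_with_make_shared</code> and <code>allocations_with_new</code> are my own, made up for illustration. The standard only recommends the single allocation, so the counts are what I would expect from the mainstream standard libraries, not a guarantee. The sketch says nothing about cache-line contention; it only shows that the counters and the object end up in one block.</p>

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>

// Global allocation counter, bumped by the replaced operator new below.
static std::atomic<std::size_t> g_allocations{0};

// Replace the global allocation functions so every heap allocation is counted.
void* operator new(std::size_t size) {
    g_allocations.fetch_add(1, std::memory_order_relaxed);
    if (void* p = std::malloc(size ? size : 1)) return p;
    throw std::bad_alloc{};
}
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Hypothetical payload type, small enough to share a cache line with the counters.
struct Payload { int values[4]; };

// Runs `make` and returns how many allocations happened while it ran.
template <typename F>
std::size_t count_allocations(F&& make) {
    const std::size_t before = g_allocations.load();
    auto sp = make();
    const std::size_t after = g_allocations.load();
    assert(sp);  // the pointer is still alive here; it is released on return
    return after - before;
}

// Object and control block (reference counters) created together.
std::size_t allocations_with_make_shared() {
    return count_allocations([] { return std::make_shared<Payload>(); });
}

// Object allocated first, control block allocated separately by shared_ptr.
std::size_t allocations_with_new() {
    return count_allocations([] { return std::shared_ptr<Payload>(new Payload()); });
}
```

    <p>Calling the two functions from <code>main</code> and printing the results should show a lower count for <code>make_shared</code> on an implementation that applies the optimization. With the separate-allocation form the counters live wherever the allocator puts the second block, which is the layout my question compares against.</p>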