    <p>I finally got a chance to benchmark the code with a completely unloaded system: <img src="https://i.stack.imgur.com/vtf8z.png" alt="enter image description here"></p>
    <p>For the dynamic schedule I used <code>schedule(dynamic,1000000)</code>. For the static schedule I used the default (iterations divided evenly between the cores). For thread binding I used <code>export GOMP_CPU_AFFINITY="0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47"</code>.</p>
    <p>The main reason for the highly nonlinear scaling of this code is that what AMD calls "cores" aren't actually independent cores. This was part (1) of redrum's answer. It is clearly visible in the plot above as the sudden plateau of the speedup at 24 threads, and it is especially obvious with dynamic scheduling. It's also obvious from the thread binding that I chose: it turns out that what I wrote above would be a terrible choice for binding, because you end up with two threads in each "module".</p>
    <p>The second biggest slowdown comes from static scheduling with a large number of threads. Inevitably there is an imbalance between the slowest and fastest threads, which introduces large fluctuations in the run time when the iterations are divided into large chunks by the default static scheduling. This part of the answer came both from Hristo's comments and Salt's answer.</p>
    <p>I don't know why the effects of "Turbo Boost" aren't more pronounced (part 2 of Redrum's answer). Also, I'm not 100% certain where the last bit of the scaling is lost (presumably in overhead): we get a 22x speedup instead of the 24x expected from linear scaling in the number of <em>modules</em>. But otherwise the question is pretty well answered.</p>
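For reference, here is a minimal sketch of what a loop using the dynamic schedule described above might look like. It assumes the benchmarked code is C compiled with GCC's OpenMP support, which the answer does not confirm. The function name `sum_mod7` and its workload are hypothetical stand-ins for the real loop body, which is not shown here.

```c
/* Hypothetical stand-in for the benchmarked loop: sums i % 7 for i in [0, n).
 * The real workload is not shown in the answer; only the schedule clause
 * below is taken from it. */
long long sum_mod7(long long n) {
    long long total = 0;

    /* Threads grab chunks of 1,000,000 iterations on demand, so a thread
     * that finishes early picks up more work instead of idling.
     * Without -fopenmp the pragma is ignored and the loop runs serially,
     * producing the same result. */
    #pragma omp parallel for schedule(dynamic, 1000000) reduction(+:total)
    for (long long i = 0; i < n; i++) {
        total += i % 7;
    }
    return total;
}
```

Removing the <code>schedule</code> clause gives the default static behaviour used for the other curve. As for binding, if consecutive core IDs share a module on this machine (which is what the plateau at 24 threads suggests, but is an assumption here), a stride-2 list such as <code>GOMP_CPU_AFFINITY="0-46:2"</code> would place the first 24 threads on separate modules.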
 
