<pre><code>static double s_x;</code></pre>
<p>It is much harder to demonstrate the effect when you use a double. The CPU uses dedicated instructions to load and store a double, FLD and FSTP respectively. It is much easier with <em>long</em>, since there is no single instruction that loads or stores a 64-bit integer in 32-bit mode. To observe tearing with a double, the variable's address must be misaligned so that it straddles a CPU cache line boundary.</p>
<p>That will never happen with the declaration you used: the JIT compiler ensures that the double is aligned properly, stored at an address that's a multiple of 8. You could store it in a field of a class, since the GC allocator only aligns to 4 in 32-bit mode. But that's a crap shoot.</p>
<p>The best way to do it is to intentionally misalign the double by using a pointer. Put <em>unsafe</em> in front of the Program class and make it look similar to this:</p>
<pre><code>// Requires: using System.Runtime.InteropServices; (for Marshal)
static double* s_x;

static void Main(string[] args) {
    var mem = Marshal.AllocCoTaskMem(100);
    // Offset 28 was found by experimentation, it may differ on your machine
    s_x = (double*)((long)(mem) + 28);
    TestTearingDouble();
}

// Thread A (writer):
*s_x = ((i &amp; 1) == 0) ? 0.0 : double.MaxValue;

// Thread B (reader):
double x = *s_x;
</code></pre>
<p>This still won't guarantee a good misalignment (hehe), since there's no way to control exactly where AllocCoTaskMem() will place the allocation relative to the start of a CPU cache line. And it depends on the cache associativity in your CPU core (mine is a Core i5). You'll have to tinker with the offset; I got the value 28 by experimentation. The value should be divisible by 4 but not by 8 to truly simulate the GC heap behavior. Keep adding 8 to the value until the double straddles the cache line and triggers the assert.</p>
<p>To make it less artificial, you'll have to write a program that stores the double in a field of a class and gets the garbage collector to move it around in memory so it ends up misaligned. It's kinda hard to come up with a sample program that <em>ensures</em> this happens.</p>
<p>Also note how your program can demonstrate a problem called <em>false sharing</em>. Comment out the Start() method call for thread B and note how much faster thread A runs. You are seeing the cost of the CPU keeping the cache line consistent between the CPU cores. Sharing is intended here, since the threads access the same variable. Real false sharing happens when threads access different variables that are stored in the same cache line. This is also why alignment matters: you can only observe the tearing for a double when part of it is in one cache line and part of it is in another.</p>
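<p>The "keep adding 8" search for an offset can be reasoned about with a little address arithmetic. What follows is only a rough sketch, not something from the original program: the class and method names are made up for illustration, and it assumes a 64-byte cache line, which is typical for x86 processors but not guaranteed. It checks whether an 8-byte value at a given address crosses a line boundary, then walks the offsets that are divisible by 4 but not by 8.</p>
<pre><code>using System;

static class CacheLineSketch   // hypothetical helper, not part of the original code
{
    // Assumption: 64-byte cache lines. Adjust for your hardware.
    const long CacheLineSize = 64;

    // True when the 8 bytes starting at 'address' span two cache lines.
    public static bool StraddlesCacheLine(long address)
    {
        long offsetInLine = address % CacheLineSize;
        return offsetInLine &gt; CacheLineSize - sizeof(double);
    }

    // Tries offsets 4, 12, 20, ... (divisible by 4 but not by 8) that keep
    // the double inside an allocation of 'size' bytes. Returns the first
    // offset that makes the double straddle a line, or -1 if none does.
    public static int FindStraddlingOffset(long baseAddress, int size)
    {
        for (int offset = 4; offset + sizeof(double) &lt;= size; offset += 8)
        {
            if (StraddlesCacheLine(baseAddress + offset)) return offset;
        }
        return -1;
    }
}
</code></pre>
<p>Under these assumptions, calling FindStraddlingOffset((long)mem, 100) would replace the manual tinkering. If the allocation is not at least 8-byte aligned, none of these offsets can straddle a line and the sketch returns -1. Whether the assert then actually fires still depends on the processor.</p>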
 
