<p><del>This is also not a full answer, but I have a few ideas.</del></p>
<p><strong>I believe I have found as good an explanation as we will find without somebody from the .NET JIT team answering.</strong></p>
<p><strong>UPDATE</strong></p>
<p>I looked a little deeper, and I believe I have found the source of the issue. It appears to be caused by a combination of a bug in the JIT type-initialization logic, and a change in the C# compiler that relies on the assumption that the JIT works as intended. I think the JIT bug existed in .NET 4.0, but was uncovered by the change in the compiler for .NET 4.5.</p>
<p>I do not think that <code>beforefieldinit</code> is the only issue here. I think it's simpler than that.</p>
<p>The type <code>System.String</code> in mscorlib.dll from .NET 4.0 contains a static constructor:</p>
<pre><code>.method private hidebysig specialname rtspecialname static
        void  .cctor() cil managed
{
  // Code size       11 (0xb)
  .maxstack  8
  IL_0000:  ldstr      ""
  IL_0005:  stsfld     string System.String::Empty
  IL_000a:  ret
} // end of method String::.cctor
</code></pre>
<p>In the .NET 4.5 version of mscorlib.dll, <code>String.cctor</code> (the static constructor) is conspicuously absent:</p>
<blockquote> <p>..... No static constructor :( .....</p> </blockquote>
<p>In both versions the <code>String</code> type is adorned with <code>beforefieldinit</code>:</p>
<pre><code>.class public auto ansi serializable sealed beforefieldinit System.String
</code></pre>
<p>I tried to create a type that would compile to IL similarly (so that it has static fields but no static constructor <code>.cctor</code>), but I could not do it. 
All of these types have a <code>.cctor</code> method in IL:</p>
<pre><code>public class MyString1
{
    public static MyString1 Empty = new MyString1();
}

public class MyString2
{
    public static MyString2 Empty = new MyString2();

    static MyString2() {}
}

public class MyString3
{
    public static MyString3 Empty;

    static MyString3() { Empty = new MyString3(); }
}
</code></pre>
<p>My guess is that two things changed between .NET 4.0 and 4.5:</p>
<p>First: The EE was changed so that it would automatically initialize <code>String.Empty</code> from unmanaged code. This change was probably made for .NET 4.0.</p>
<p>Second: The compiler changed so that it did not emit a static constructor for string, knowing that <code>String.Empty</code> would be assigned from the unmanaged side. This change appears to have been made for .NET 4.5.</p>
<p>It appears that the EE <em>does not</em> assign <code>String.Empty</code> soon enough along some optimization paths. The change made to the compiler (or whatever changed to make <code>String.cctor</code> disappear) expected the EE to make this assignment before any user code executes, but it appears that the EE does not make this assignment before <code>String.Empty</code> is used in methods of generic classes reified with a reference type.</p>
<p>Lastly, I believe that the bug is indicative of a deeper problem in the JIT type-initialization logic. It appears the change in the compiler is a special case for <code>System.String</code>, but I doubt that the JIT has made a special case here for <code>System.String</code>.</p>
<p><strong>Original</strong></p>
<p>First of all, WOW. The BCL people have gotten very creative with some performance optimizations. 
<em>Many</em> of the <code>String</code> methods are now performed using a Thread static cached <code>StringBuilder</code> object.</p>
<p>I followed that lead for a while, but <code>StringBuilder</code> isn't used on the <code>Trim</code> code path, so I decided it couldn't be a Thread static problem.</p>
<p>I think I found a strange manifestation of the same bug though.</p>
<p>This code fails with an access violation:</p>
<pre><code>class A&lt;T&gt;
{
    static A() { }

    public A(out string s)
    {
        s = string.Empty;
    }
}

class B
{
    static void Main()
    {
        string s;
        new A&lt;object&gt;(out s);
        //new A&lt;int&gt;(out s);
        System.Console.WriteLine(s.Length);
    }
}
</code></pre>
<p>However, if you uncomment <code>//new A&lt;int&gt;(out s);</code> in <code>Main</code> then the code works just fine. In fact, if <code>A</code> is reified with any reference type, the program fails, but if <code>A</code> is reified with any value type then the code does not fail. Also if you comment out <code>A</code>'s static constructor, the code never fails. After digging into <code>Trim</code> and <code>Format</code>, it is clear that the problem is that <code>Length</code> is being inlined, and that in these samples above the <code>String</code> type has not been initialized. In particular, inside the body of <code>A</code>'s constructor, <code>string.Empty</code> is not correctly assigned, although inside the body of <code>Main</code>, <code>string.Empty</code> is assigned correctly.</p>
<p>It is amazing to me that the type initialization of <code>String</code> somehow depends on whether or not <code>A</code> is reified with a value type. My only theory is that there is some optimizing JIT code path for generic type-initialization that is shared among all types, and that that path makes assumptions about BCL reference types ("special types?") and their state. 
A quick look through other BCL classes with <code>public static</code> fields shows that basically <em>all</em> of them implement a static constructor (even those with empty constructors and no data, like <code>System.DBNull</code> and <code>System.Empty</code>). BCL value types with <code>public static</code> fields do not seem to implement a static constructor (<code>System.IntPtr</code>, for instance). This seems to indicate that the JIT makes some assumptions about BCL reference type initialization.</p>
<p>FYI, here is the JITed code for the two versions:</p>
<p><strong><code>A&lt;object&gt;.ctor(out string)</code></strong>:</p>
<pre><code>    public A(out string s) {
00000000  push        rbx
00000001  sub         rsp,20h
00000005  mov         rbx,rdx
00000008  lea         rdx,[FFEE38D0h]
0000000f  mov         rcx,qword ptr [rcx]
00000012  call        000000005F7AB4A0
            s = string.Empty;
00000017  mov         rdx,qword ptr [FFEE38D0h]
0000001e  mov         rcx,rbx
00000021  call        000000005F661180
00000026  nop
00000027  add         rsp,20h
0000002b  pop         rbx
0000002c  ret
    }
</code></pre>
<p><strong><code>A&lt;int32&gt;.ctor(out string)</code></strong>:</p>
<pre><code>    public A(out string s) {
00000000  sub         rsp,28h
00000004  mov         rax,rdx
            s = string.Empty;
00000007  mov         rdx,12353250h
00000011  mov         rdx,qword ptr [rdx]
00000014  mov         rcx,rax
00000017  call        000000005F691160
0000001c  nop
0000001d  add         rsp,28h
00000021  ret
    }
</code></pre>
<p>The rest of the code (<code>Main</code>) is identical between the two versions.</p>
<p><strong>EDIT</strong></p>
<p>In addition, the IL from the two versions is identical except for the call to <code>A.ctor</code> in <code>B.Main()</code>, where the IL for the first version contains:</p>
<pre><code>newobj     instance void class A`1&lt;object&gt;::.ctor(string&amp;)
</code></pre>
<p>versus</p>
<pre><code>... A`1&lt;int32&gt;...
</code></pre>
<p>in the second.</p>
<p>Another thing to note is that the JITed code for <strong><code>A&lt;int&gt;.ctor(out string)</code></strong> is the same as in the non-generic version.</p>
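<p>As a rough cross-check of the IL observations above, the presence of a static constructor and of the <code>beforefieldinit</code> flag can also be queried through reflection instead of a disassembler. The following is only a minimal sketch: the class name <code>TypeInitProbe</code> and the helper <code>Describe</code> are hypothetical names introduced for illustration, and the output will depend on which framework version the program runs against.</p>
<pre><code>using System;
using System.Reflection;

// Hypothetical helper, not part of the original investigation.
static class TypeInitProbe
{
    // Reports whether a type declares a static constructor (.cctor)
    // and whether it carries the beforefieldinit attribute.
    static void Describe(Type t)
    {
        // Type.TypeInitializer is null when the type has no .cctor.
        bool hasCctor = t.TypeInitializer != null;
        bool beforeFieldInit =
            (t.Attributes &amp; TypeAttributes.BeforeFieldInit) != 0;

        Console.WriteLine("{0}: .cctor={1}, beforefieldinit={2}",
            t.FullName, hasCctor, beforeFieldInit);
    }

    static void Main()
    {
        // The types discussed above; results vary by runtime version.
        Describe(typeof(string));
        Describe(typeof(DBNull));
        Describe(typeof(IntPtr));
    }
}
</code></pre>
<p>Running this against both mscorlib versions should, if the analysis above holds, show the <code>.cctor</code> for <code>System.String</code> present under .NET 4.0 and absent under .NET 4.5, but I have not treated it as a substitute for reading the IL directly.</p>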
 
