<p>You can take advantage of ILP in the CLI, so the short answer is No.</p>
<p>A bit longer:</p>
<p>I once wrote code for a simple image processing task and used this kind of optimization to make it a "bit" faster.</p>
<p>A "short" example:</p>
<pre><code>using System;
using System.Diagnostics;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        const int ITERATION_NUMBER = 100;
        TimeSpan[] normal = new TimeSpan[ITERATION_NUMBER];
        TimeSpan[] ilp = new TimeSpan[ITERATION_NUMBER];
        int SIZE = 4000000;
        float[] data = new float[SIZE];
        float safe = 0.0f;

        Stopwatch sw = new Stopwatch();
        for (int iteration = 0; iteration &lt; ITERATION_NUMBER; iteration++)
        {
            // Initialization
            for (int i = 0; i &lt; data.Length; i++)
            {
                data[i] = 1.0f;
            }

            // Normal loop: one element per iteration
            sw.Start();
            for (int index = 0; index &lt; data.Length; index++)
            {
                data[index] /= 3.0f * data[index] &gt; 2.0f / data[index] ? 2.0f / data[index] : 3.0f * data[index];
            }
            sw.Stop();
            normal[iteration] = sw.Elapsed;
            safe = data[0]; // kept to check that both loops produce the same value

            // Initialization
            for (int i = 0; i &lt; data.Length; i++)
            {
                data[i] = 1.0f;
            }
            sw.Reset();

            // ILP loop: four independent elements per iteration
            sw.Start();
            float ac1, ac2, ac3, ac4;
            int length = data.Length / 4;
            for (int i = 0; i &lt; length; i++)
            {
                int index0 = i &lt;&lt; 2;
                int index1 = index0;
                int index2 = index0 + 1;
                int index3 = index0 + 2;
                int index4 = index0 + 3;
                ac1 = 3.0f * data[index1] &gt; 2.0f / data[index1] ? 2.0f / data[index1] : 3.0f * data[index1];
                ac2 = 3.0f * data[index2] &gt; 2.0f / data[index2] ? 2.0f / data[index2] : 3.0f * data[index2];
                ac3 = 3.0f * data[index3] &gt; 2.0f / data[index3] ? 2.0f / data[index3] : 3.0f * data[index3];
                ac4 = 3.0f * data[index4] &gt; 2.0f / data[index4] ? 2.0f / data[index4] : 3.0f * data[index4];
                data[index1] /= ac1;
                data[index2] /= ac2;
                data[index3] /= ac3;
                data[index4] /= ac4;
            }
            sw.Stop();
            ilp[iteration] = sw.Elapsed;
            sw.Reset();
        }

        // Sanity checks: every element is equal, and both loops agree
        Console.WriteLine(data.All(item =&gt; item == data[0]));
        Console.WriteLine(data[0] == safe);
        Console.WriteLine();

        double normalElapsed = normal.Max(time =&gt; time.TotalMilliseconds);
        Console.WriteLine(String.Format("Normal Max.: {0}", normalElapsed));
        double ilpElapsed = ilp.Max(time =&gt; time.TotalMilliseconds);
        Console.WriteLine(String.Format("ILP Max.: {0}", ilpElapsed));
        Console.WriteLine();

        normalElapsed = normal.Average(time =&gt; time.TotalMilliseconds);
        Console.WriteLine(String.Format("Normal Avg.: {0}", normalElapsed));
        ilpElapsed = ilp.Average(time =&gt; time.TotalMilliseconds);
        Console.WriteLine(String.Format("ILP Avg.: {0}", ilpElapsed));
        Console.WriteLine();

        normalElapsed = normal.Min(time =&gt; time.TotalMilliseconds);
        Console.WriteLine(String.Format("Normal Min.: {0}", normalElapsed));
        ilpElapsed = ilp.Min(time =&gt; time.TotalMilliseconds);
        Console.WriteLine(String.Format("ILP Min.: {0}", ilpElapsed));
    }
}
</code></pre>
<p>Results, in milliseconds (on .NET Framework 4.0 Client Profile, Release build):</p>
<p><strong>On a virtual machine</strong> (I think with no ILP):</p>
<p>True True</p>
<p>Nor Max.: 111,1894 <br> ILP Max.: 106,886</p>
<p>Nor Avg.: 78,163619 <br> ILP Avg.: 77,682513</p>
<p>Nor Min.: 58,3035 <br> ILP Min.: 56,7672</p>
<p><strong>On a Xeon</strong>:</p>
<p>True True</p>
<p>Nor Max.: 40,5892 <br> ILP Max.: 30,8906</p>
<p>Nor Avg.: 35,637308 <br> ILP Avg.: 25,45341</p>
<p>Nor Min.: 34,4247 <br> ILP Min.: 23,7888</p>
<p>Explanation of results:</p>
<p>In Debug, the compiler applies no optimization, but the second for loop is more optimal than the first, so there is a significant difference.</p>
<p>The answer seems to lie in the results from the assemblies built in Release mode.
The IL compiler and the JIT do their best to minimize execution cost (I think that includes exploiting ILP). But if you write code like the second for loop, you can get better results in special cases, and the second loop can outperform the first on some architectures. But </p>
<blockquote> <p>You are at the mercy of the JIT</p> </blockquote>
<p>as mentioned, sadly. It is a pity that the specification does not mention that an implementation may define further optimizations such as ILP (a short paragraph would be enough). But it cannot enumerate every architecture-specific code optimization, and the CLI sits at a higher level:</p>
<blockquote> <p>This is well abstracted away from the .NET languages and IL.</p> </blockquote>
<p>This problem is too complex to answer by experiment alone, and I don't think we can get a much more precise answer this way. I also think the question is misleading, because the behavior depends not on C# but on the implementation of the CLI.</p>
<p>Many factors can influence the result, which makes a question like this hard to answer correctly as long as we treat the JIT as a black box.</p>
<p>I found material on loop vectorization and automatic threading on pages 512-513 of the ECMA-335 specification: <a href="http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-335.pdf" rel="nofollow">http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-335.pdf</a></p>
<p>I think the specification does not state explicitly how the JIT must behave in cases like this, so implementers can choose how to optimize. So I think you can influence performance by writing more optimal code, and the JIT will try to exploit ILP where that is possible and implemented.</p>
<p>I think that because it is not specified, the possibility is there.</p>
<p>So the answer seems to be No. I believe you can't say that ILP is abstracted away in the CLI if the specification doesn't say so.</p>
<p><strong>Update</strong>:</p>
<p>I had come across a blog post earlier but only now found it again: <a href="http://igoro.com/archive/gallery-of-processor-cache-effects/" rel="nofollow">http://igoro.com/archive/gallery-of-processor-cache-effects/</a> Example four contains a short but proper answer to your question.</p>
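<p>As a smaller illustration of why the second loop above can be faster, here is a minimal sketch of the underlying idea: breaking one long dependency chain into several independent ones. It is not taken from the blog post or from the benchmark above. The names <code>IlpDemo</code>, <code>SumSingle</code> and <code>SumFour</code> are hypothetical, and integers are used so that both versions return exactly the same sum. Whether the second version is actually faster depends on the JIT and the CPU, so no timings are claimed here.</p>
<pre><code>using System;
using System.Diagnostics;

public static class IlpDemo
{
    // One accumulator: each addition has to wait for the previous one,
    // so the loop forms a single dependency chain.
    public static long SumSingle(int[] data)
    {
        long sum = 0;
        for (int i = 0; i &lt; data.Length; i++)
        {
            sum += data[i];
        }
        return sum;
    }

    // Four accumulators: the four additions in one iteration do not depend
    // on each other, so a superscalar CPU may execute them in parallel.
    public static long SumFour(int[] data)
    {
        long s0 = 0, s1 = 0, s2 = 0, s3 = 0;
        int limit = data.Length - (data.Length % 4);
        for (int i = 0; i &lt; limit; i += 4)
        {
            s0 += data[i];
            s1 += data[i + 1];
            s2 += data[i + 2];
            s3 += data[i + 3];
        }
        long sum = s0 + s1 + s2 + s3;
        // Handle the leftover elements when the length is not a multiple of 4.
        for (int i = limit; i &lt; data.Length; i++)
        {
            sum += data[i];
        }
        return sum;
    }

    public static void Main()
    {
        int[] data = new int[4000003];
        for (int i = 0; i &lt; data.Length; i++)
        {
            data[i] = i % 7;
        }

        // Warm-up calls so that JIT compilation is not part of the measurement.
        SumSingle(data);
        SumFour(data);

        Stopwatch sw = Stopwatch.StartNew();
        long single = SumSingle(data);
        sw.Stop();
        Console.WriteLine("Single accumulator: {0} ms", sw.Elapsed.TotalMilliseconds);

        sw = Stopwatch.StartNew();
        long four = SumFour(data);
        sw.Stop();
        Console.WriteLine("Four accumulators:  {0} ms", sw.Elapsed.TotalMilliseconds);

        // Both versions must agree on the result.
        Console.WriteLine(single == four);
    }
}
</code></pre>
<p>As with the benchmark above, build it in Release mode and run it several times. A single measurement says little, and the JIT is still free to treat both loops however it likes.</p>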