    <p>I've made this point before on SO, and so have others.</p> <p>If your object is to improve performance, as measured by wall-clock time, by far the best tool is just the debugger itself, and its "Pause" button. Let me show you why.</p> <p><strong>First, let's look at a good profiler</strong></p> <p>Among profilers, ANTS is probably as good as they come. When I run it on an app, the top of the screen looks like this:</p> <p><img src="https://i.stack.imgur.com/VBop8.jpg" alt="enter image description here"></p> <p>Notice that you have to choose a time span to look at, and you have to choose if you want to look at CPU time or File I/O time. Within that time span, you see something like this:</p> <p><img src="https://i.stack.imgur.com/zSSfw.jpg" alt="enter image description here"></p> <p>which is trying to show what ANTS thinks is the "hot path", considering only CPU time. Of course it emphasizes inclusive "Time With Children (%)", and that's good. In a big code base like this, notice how extremely small the self-time "Time (%)" is? That's typical, and you can see why.</p> <p>What this says is that you should certainly ignore functions that have low inclusive percent, because even if you could reduce them to no-ops, your overall time in that interval would go down by no more than their inclusive percent.</p> <p>So you look at the functions with high inclusive percent, and you try to find something in them to make them take less time, generally by either a) having them make fewer calls to sub-functions, or b) having the function itself be called less.</p> <p>If you find something and fix it, you get a certain percent speedup. Then you can try it all again. When you cannot find anything to fix, you declare victory and put away your profiler for another day.</p> <p>Notice that there might have been additional problems that you could have fixed for more speedup, but if the profiler didn't help you find them, you've assumed they are not there. 
These can be really big sleepers.</p> <p><strong>Now let's take some manual samples</strong></p> <p>I just randomly paused the app six times during the phase that was bugging me because it was making me wait. Each time I took a snapshot of the call stack, <em>and</em> I took a good long look at what the program was doing and why it was doing it. Three of the samples looked like this:</p> <p>External Code<br> Core.Types.ResourceString.getStringFromResourceFile Line 506<br> Core.Types.ResourceString.getText Line 423<br> Core.Types.ResourceString.ToString Line 299<br> External Code<br> Core.Types.ResourceString.getStringFromResourceFile Line 528<br> Core.Types.ResourceString.getText Line 423<br> Core.Types.ResourceString.ToString Line 299<br> Core.Types.ResourceString.implicit operator string Line 404<br> SplashForm.pluginStarting Line 149<br> Services.Plugins.PluginService.includePlugin Line 737<br> Services.Plugins.PluginService.loadPluginList Line 1015<br> Services.Plugins.PluginService.loadPluginManifests Line 1074<br> Services.Plugins.PluginService.DoStart Line 95<br> Core.Services.ServiceBase.Start Line 36<br> Core.Services.ServiceManager.startService Line 1452<br> Core.Services.ServiceManager.startService Line 1438<br> Core.Services.ServiceManager.loadServices Line 1328<br> Core.Services.ServiceManager.Initialize Line 346<br> Core.Services.ServiceManager.Start Line 298<br> AppStart.Start Line 95<br> AppStart.Main Line 42 </p> <p>Here is what it is doing. It is reading a resource file (that's I/O, so looking at CPU time would not see it). The reason it is reading it is to get the name of a plugin. The reason the name of the plugin is in a resource file is that there <em>might</em> be a future requirement to internationalize that string. Anyway, the reason it is being fetched is so the name can be displayed on a splash screen during the loading of the plugin. 
Presumably the reason for this is that, if the user is wondering what is taking so long, the splash screen will show them what's happening.</p> <p>Those six samples proved that if the name was not displayed, or if it was displayed but was gotten in some more efficient way, then startup speed of the app would approximately <em>double</em>.</p> <p>I hope you can see that no profiler that works by showing measurements could have yielded this insight this quickly.</p> <p>Even if the profiler showed inclusive percent by wall-clock time, not CPU, it still would have left the user trying to puzzle out just what was going on, because in summarizing the times on the routines, it loses almost all explanatory context that tells if what it is doing is <em>necessary</em>.</p> <p>The human tendency when looking only at summary statistics, and looking at the code, is to say "I can see what it's doing, but I don't see any way to improve it."</p> <p><strong>So what about "statistical significance"?</strong></p> <p>I hear this all the time, and it comes from naivety about statistics.</p> <p>If three out of six samples show a problem, that means the most likely actual percent used by the problem is 3/6=50%. It also means that if you did this many times, on average the cost would be (3+1)/(6+2), which is also 50%. If you save 50% of time, that gives a 2x speedup. There is a small probability that the cost could be as small as 20%, in which case the speedup would be only 1.25x. There is an equal probability that the cost could be as large as 80%, in which case the speedup would be 5x (!). So yes, it is a gamble. The speedup could be less than estimated, but it will not be zero, and it is equally likely to be dramatically large.</p> <p>If more precision is required, more samples can be taken, but if one sacrifices the insight that comes from examining samples to get statistical precision, the speedups may well not be found.</p> <p>P.S. 
<a href="https://scicomp.stackexchange.com/a/1870/1262">This link</a> shows the key importance of finding <em>all</em> the problems - not missing any.</p>
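<p>For readers who want to try random pausing without a debugger, here is a minimal Python sketch of the same idea. It is not the method used above (that was the debugger's "Pause" button on a .NET app); it is an approximation that assumes CPython, where <code>sys._current_frames()</code> exposes each thread's current frame. The names <code>sample_stack</code>, <code>fraction_containing</code>, <code>load_plugin_name</code> and <code>start_plugins</code> are hypothetical, made up for illustration.</p>

```python
import random
import sys
import threading
import time
import traceback


def sample_stack(thread: threading.Thread) -> list[str]:
    """Return the function names on the thread's call stack, outermost first."""
    frame = sys._current_frames()[thread.ident]  # CPython-specific call
    return [entry.name for entry in traceback.extract_stack(frame)]


def fraction_containing(samples: list[list[str]], function_name: str) -> float:
    """Fraction of stack samples in which `function_name` appears anywhere."""
    hits = sum(1 for stack in samples if function_name in stack)
    return hits / len(samples)


def load_plugin_name() -> None:
    """Hypothetical stand-in for slow I/O, such as reading a resource file."""
    time.sleep(0.02)


def start_plugins(stop: threading.Event) -> None:
    """Hypothetical startup loop: slow lookup followed by some CPU work."""
    while not stop.is_set():
        load_plugin_name()
        sum(range(20000))


if __name__ == "__main__":
    stop = threading.Event()
    worker = threading.Thread(target=start_plugins, args=(stop,), daemon=True)
    worker.start()

    samples = []
    for _ in range(6):
        # Pause at unpredictable moments, as with the debugger's Pause button.
        time.sleep(random.uniform(0.01, 0.05))
        samples.append(sample_stack(worker))

    stop.set()
    worker.join()

    # Read each sample in full: the calling context is where the insight is.
    for stack in samples:
        print(" > ".join(stack))
    print("fraction in load_plugin_name:",
          fraction_containing(samples, "load_plugin_name"))
```

<p>The fraction printed at the end is only a rough estimate. The part that matters is reading each printed stack and asking why the program is there at all.</p>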
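<p>The arithmetic in the "statistical significance" section can be checked with a short Python sketch. It assumes a uniform prior on the cost fraction, which is what the (3+1)/(6+2) formula implies; under that assumption, 3 hits in 6 samples gives a Beta(4, 4) distribution for the cost. The function names are hypothetical, chosen only for this illustration.</p>

```python
from math import comb


def speedup(cost_fraction: float) -> float:
    """Speedup from removing an activity that takes `cost_fraction` of the time."""
    return 1.0 / (1.0 - cost_fraction)


def posterior_mean(hits: int, samples: int) -> float:
    """Mean of the Beta(hits + 1, samples - hits + 1) posterior (uniform prior)."""
    return (hits + 1) / (samples + 2)


def prob_cost_below(x: float, hits: int, samples: int) -> float:
    """P(cost <= x) under that posterior.

    For integer parameters a and b, the Beta(a, b) CDF at x equals the
    binomial tail probability P(Binomial(a + b - 1, x) >= a).
    """
    a, b = hits + 1, samples - hits + 1
    n = a + b - 1
    return sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(a, n + 1))


if __name__ == "__main__":
    hits, samples = 3, 6
    print("posterior mean cost:", posterior_mean(hits, samples))
    for cost in (0.2, 0.5, 0.8):
        print(f"cost {cost:.0%} -> speedup {speedup(cost):.2f}x")
    print("P(cost <= 20%):", round(prob_cost_below(0.2, hits, samples), 4))
    print("P(cost >= 80%):", round(1 - prob_cost_below(0.8, hits, samples), 4))
```

<p>Under the uniform-prior assumption, each tail (cost at or below 20%, cost at or above 80%) comes out at roughly 3%, which is the sense in which the two outcomes are equally likely.</p>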