Performance issue: comparing to String.Format
<p>A while back a post by Jon Skeet planted the idea in my head of building a <code>CompiledFormatter</code> class, for use in a loop instead of <code>String.Format()</code>.</p>
<p>The idea is that the portion of a call to <code>String.Format()</code> spent parsing the format string is overhead. We <em>should</em> be able to improve performance by moving that code outside of the loop. The trick, of course, is that the new code has to <em>exactly</em> match the <code>String.Format()</code> behavior.</p>
<p>This week I finally did it. I went through the <a href="http://weblogs.asp.net/scottgu/archive/2008/01/16/net-framework-library-source-code-now-available.aspx" rel="noreferrer">.NET Framework source provided by Microsoft</a> to do a direct adaptation of their parser (it turns out <code>String.Format()</code> actually farms the work out to <code>StringBuilder.AppendFormat()</code>). The code I came up with works, in that my results are accurate within my (admittedly limited) test data.</p>
<p>Unfortunately, I still have one problem: performance. In my initial tests the performance of my code closely matches that of the normal <code>String.Format()</code>. There's no improvement at all: it's even consistently a few milliseconds slower. At least it's still in the same order (i.e., the amount slower doesn't increase; it stays within a few milliseconds even as the test set grows), but I was hoping for something better.</p>
<p>It's possible that the internal calls to <code>StringBuilder.Append()</code> are what actually drive the performance, but I'd like to see if the smart people here can improve on it.</p>
<p>Here is the relevant portion:</p>
<pre><code>private class FormatItem
{
    public int index;     // index of item in the argument list. -1 means it's a literal from the original format string
    public char[] value;  // literal data from original format string
    public string format; // simple format to use with supplied argument (i.e., {0:X} for hex)

    // for fixed-width format (examples below)
    public int width;     // {0,7} means it should be at least 7 characters
    public bool justify;  // {0,-7} would use opposite alignment
}

// this data is all populated by the constructor
private List&lt;FormatItem&gt; parts = new List&lt;FormatItem&gt;();
private int baseSize = 0;
private string format;
private IFormatProvider formatProvider = null;
private ICustomFormatter customFormatter = null;

// the code in here very closely matches the code in the String.Format/StringBuilder.AppendFormat methods.
// Could it be faster?
public String Format(params Object[] args)
{
    if (format == null || args == null)
        throw new ArgumentNullException((format == null) ? "format" : "args");

    var sb = new StringBuilder(baseSize);
    foreach (FormatItem fi in parts)
    {
        if (fi.index &lt; 0)
            sb.Append(fi.value);
        else
        {
            //if (fi.index &gt;= args.Length) throw new FormatException(Environment.GetResourceString("Format_IndexOutOfRange"));
            if (fi.index &gt;= args.Length) throw new FormatException("Format_IndexOutOfRange");

            object arg = args[fi.index];
            string s = null;
            if (customFormatter != null)
            {
                s = customFormatter.Format(fi.format, arg, formatProvider);
            }

            if (s == null)
            {
                if (arg is IFormattable)
                {
                    s = ((IFormattable)arg).ToString(fi.format, formatProvider);
                }
                else if (arg != null)
                {
                    s = arg.ToString();
                }
            }

            if (s == null) s = String.Empty;
            int pad = fi.width - s.Length;
            if (!fi.justify &amp;&amp; pad &gt; 0) sb.Append(' ', pad);
            sb.Append(s);
            if (fi.justify &amp;&amp; pad &gt; 0) sb.Append(' ', pad);
        }
    }
    return sb.ToString();
}

// alternate implementation (for comparative testing)
// my own tests call String.Format() separately: I don't use this. But it's useful to see
// how my Format method fits.
public string OriginalFormat(params Object[] args)
{
    return String.Format(formatProvider, format, args);
}
</code></pre>
<p>Additional notes:<br> I'm wary of providing the source code for my constructor, because I'm not sure of the licensing implications of my reliance on the original .NET implementation. However, anyone who wants to test this can just make the relevant private data public and assign values that mimic a particular format string.</p>
<p>Also, I'm very open to changing the <code>FormatItem</code> class and even the <code>parts</code> List if anyone has a suggestion that could improve the build time. Since my primary concern is iteration time from start to end, maybe a LinkedList would fare better?</p>
<p><strong>[Update]:</strong><br> Hmm... something else I can try is adjusting my tests. My benchmarks were fairly simple: composing names in a <code>"{lastname}, {firstname}"</code> format and composing formatted phone numbers from the area code, prefix, number, and extension components. Neither of those has much in the way of string literals. As I think about how the original state machine parser worked, I think those string literals are exactly where my code has the best chance to do well, because I no longer have to examine each character in the string.</p>
<p>Another thought:<br> This class is still useful, even if I can't make it go faster. As long as performance is <em>no worse</em> than the base <code>String.Format()</code>, I've still created a strongly-typed interface that allows a program to assemble its own "format string" at run time. All I need to do is provide public access to the parts list.</p>
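<p>For anyone who wants to try the "make the data public and fill it in by hand" approach, here is a minimal sketch. It is not my real class: <code>Demo</code>, <code>BuildNameParts</code> and <code>FormatParts</code> are hypothetical names, <code>FormatItem</code> is a public stand-in, the custom formatter and format provider are left out, and the parts list is only my assumption of what a parse of <code>"{0}, {1}"</code> would produce. It checks that the output matches <code>String.Format()</code> and then times both with a <code>Stopwatch</code>.</p>
<pre><code>using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Text;

// Hypothetical public stand-in for the private FormatItem class above.
public class FormatItem
{
    public int index;      // index into the argument list; -1 means a literal
    public char[] value;   // literal text (used only when index is -1)
    public string format;  // per-item format such as "X"; null if none
    public int width;      // minimum field width; 0 means no padding
    public bool justify;   // true = pad on the right, false = pad on the left
}

public static class Demo
{
    // Hand-built equivalent of "{0}, {1}" (assumption: this mirrors what
    // a parsing constructor would produce for that format string).
    static List&lt;FormatItem&gt; BuildNameParts()
    {
        return new List&lt;FormatItem&gt;
        {
            new FormatItem { index = 0 },
            new FormatItem { index = -1, value = ", ".ToCharArray() },
            new FormatItem { index = 1 }
        };
    }

    // Simplified copy of the Format loop: no custom formatter or provider.
    static string FormatParts(List&lt;FormatItem&gt; parts, params object[] args)
    {
        var sb = new StringBuilder();
        foreach (FormatItem fi in parts)
        {
            if (fi.index &lt; 0) { sb.Append(fi.value); continue; }

            object arg = args[fi.index];
            string s = null;
            if (arg is IFormattable) s = ((IFormattable)arg).ToString(fi.format, null);
            else if (arg != null) s = arg.ToString();
            if (s == null) s = String.Empty;

            int pad = fi.width - s.Length;
            if (!fi.justify &amp;&amp; pad &gt; 0) sb.Append(' ', pad);
            sb.Append(s);
            if (fi.justify &amp;&amp; pad &gt; 0) sb.Append(' ', pad);
        }
        return sb.ToString();
    }

    public static void Main()
    {
        List&lt;FormatItem&gt; parts = BuildNameParts();

        // Sanity check: both approaches should produce the same text.
        string expected = String.Format("{0}, {1}", "Skeet", "Jon");
        string actual = FormatParts(parts, "Skeet", "Jon");
        Console.WriteLine(expected == actual ? "match" : "MISMATCH");

        const int iterations = 1000000;

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i &lt; iterations; i++) String.Format("{0}, {1}", "Skeet", "Jon");
        sw.Stop();
        Console.WriteLine("String.Format: {0} ms", sw.ElapsedMilliseconds);

        sw = Stopwatch.StartNew();
        for (int i = 0; i &lt; iterations; i++) FormatParts(parts, "Skeet", "Jon");
        sw.Stop();
        Console.WriteLine("Precompiled:   {0} ms", sw.ElapsedMilliseconds);
    }
}
</code></pre>
<p>The timings will vary by machine and runtime, so treat the numbers only as a rough comparison. A format string with longer literal runs would be the more interesting case to swap in.</p>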