String Benchmarks in C# - Refactoring for Speed/Maintainability
<p>I've been tinkering with small functions on my own time, trying to find ways to refactor them (I recently read Martin Fowler's book <em><a href="http://rads.stackoverflow.com/amzn/click/0201485672" rel="nofollow noreferrer">Refactoring: Improving the Design of Existing Code</a></em>). I found the following function <code>MakeNiceString()</code> while updating a nearby part of the codebase, and it looked like a good candidate to experiment with. As it stands there's no real reason to replace it, but it's small and does one simple thing, so it's easy to follow and still makes for good practice.</p>
<pre><code>private static string MakeNiceString(string str)
{
    char[] ca = str.ToCharArray();
    string result = null;
    int i = 0;
    result += System.Convert.ToString(ca[0]);
    for (i = 1; i &lt;= ca.Length - 1; i++)
    {
        if (!(char.IsLower(ca[i])))
        {
            result += " ";
        }
        result += System.Convert.ToString(ca[i]);
    }
    return result;
}

static string SplitCamelCase(string str)
{
    string[] temp = Regex.Split(str, @"(?&lt;!^)(?=[A-Z])");
    string result = String.Join(" ", temp);
    return result;
}
</code></pre>
<p>The first function, <code>MakeNiceString()</code>, is the one I found in some code I was updating at work. Its purpose is to translate <strong>ThisIsAString</strong> to <strong>This Is A String</strong>. It's used in a half-dozen places in the code, and is pretty insignificant in the whole scheme of things.
</p>
<p>I built the second function purely as an academic exercise to see whether using a regular expression would take longer.</p>
<p>Well, here are the results:</p>
<p>With 10 Iterations:</p>
<blockquote>
<pre>
MakeNiceString took 2649 ticks
SplitCamelCase took 2502 ticks
</pre>
</blockquote>
<p>However, it changes drastically over the long haul:</p>
<p>With 10,000 Iterations:</p>
<pre>
MakeNiceString took 121625 ticks
SplitCamelCase took 443001 ticks
</pre>
<hr>
<h3>Refactoring <code>MakeNiceString()</code></h3>
<blockquote>
<p>The process of refactoring <code>MakeNiceString()</code> started with simply removing the conversions that were taking place. Doing that yielded the following results:</p>
</blockquote>
<pre>
MakeNiceString took 124716 ticks
ImprovedMakeNiceString took 118486 ticks
</pre>
<p>Here's the code after Refactor #1:</p>
<pre><code>private static string ImprovedMakeNiceString(string str)
{
    //Removed Convert.ToString()
    char[] ca = str.ToCharArray();
    string result = null;
    int i = 0;
    result += ca[0];
    for (i = 1; i &lt;= ca.Length - 1; i++)
    {
        if (!(char.IsLower(ca[i])))
        {
            result += " ";
        }
        result += ca[i];
    }
    return result;
}
</code></pre>
<h3>Refactor #2 - Use <code>StringBuilder</code></h3>
<blockquote>
<p>My second task was to use <code>StringBuilder</code> instead of <code>String</code>. Since <code>String</code> is immutable, every <code>+=</code> in the loop creates a new copy of the string.
The benchmark for using that is below, as is the code:</p>
</blockquote>
<pre><code>static string RefactoredMakeNiceString(string str)
{
    char[] ca = str.ToCharArray();
    StringBuilder sb = new StringBuilder((str.Length * 5 / 4));
    int i = 0;
    sb.Append(ca[0]);
    for (i = 1; i &lt;= ca.Length - 1; i++)
    {
        if (!(char.IsLower(ca[i])))
        {
            sb.Append(" ");
        }
        sb.Append(ca[i]);
    }
    return sb.ToString();
}
</code></pre>
<p>This results in the following benchmark:</p>
<blockquote>
<pre>
MakeNiceString Took: 124497 Ticks           //Original
SplitCamelCase Took: 464459 Ticks           //Regex
ImprovedMakeNiceString Took: 117369 Ticks   //Remove Conversion
RefactoredMakeNiceString Took: 38542 Ticks  //Using StringBuilder
</pre>
</blockquote>
<p>Changing the <code>for</code> loop to a <code>foreach</code> loop gives the following code and benchmark result:</p>
<pre><code>static string RefactoredForEachMakeNiceString(string str)
{
    char[] ca = str.ToCharArray();
    StringBuilder sb1 = new StringBuilder((str.Length * 5 / 4));
    foreach (char c in ca)
    {
        // foreach visits every element, including ca[0], so the first
        // character is not appended up front; the Length check keeps a
        // leading space from being written before it.
        if (sb1.Length &gt; 0 &amp;&amp; !(char.IsLower(c)))
        {
            sb1.Append(" ");
        }
        sb1.Append(c);
    }
    return sb1.ToString();
}
</code></pre>
<blockquote>
<pre>
RefactoredForEachMakeNiceString Took: 45163 Ticks
</pre>
</blockquote>
<p>Maintenance-wise, the <code>foreach</code> loop is the easiest version to maintain and has the 'cleanest' look. It is slightly slower than the <code>for</code> loop (45163 versus 38542 ticks), but much easier to follow.</p>
<h3>Alternate Refactor: Use Compiled <code>Regex</code></h3>
<p>I moved the construction of the <code>Regex</code> to just before the loop, hoping that since it only compiles once, it would execute faster.
What I found (and I'm sure I have a bug somewhere) is that it doesn't speed up the way it ought to:</p>
<pre><code>static void runTest5()
{
    Regex rg = new Regex(@"(?&lt;!^)(?=[A-Z])", RegexOptions.Compiled);
    for (int i = 0; i &lt; 10000; i++)
    {
        CompiledRegex(rg, myString);
    }
}

static string CompiledRegex(Regex regex, string str)
{
    string result = null;
    Regex rg1 = regex;
    string[] temp = rg1.Split(str);
    result = String.Join(" ", temp);
    return result;
}
</code></pre>
<h3>Final Benchmark Results:</h3>
<pre>
MakeNiceString Took 139363 Ticks
SplitCamelCase Took 489174 Ticks
ImprovedMakeNiceString Took 115478 Ticks
RefactoredMakeNiceString Took 38819 Ticks
RefactoredForEachMakeNiceString Took 44700 Ticks
CompiledRegex Took 227021 Ticks
</pre>
<p>Or, if you prefer milliseconds:</p>
<pre>
MakeNiceString Took 38 ms
SplitCamelCase Took 123 ms
ImprovedMakeNiceString Took 33 ms
RefactoredMakeNiceString Took 11 ms
RefactoredForEachMakeNiceString Took 12 ms
CompiledRegex Took 63 ms
</pre>
<p>So the percentage differences relative to the original are:</p>
<pre>
MakeNiceString                    38 ms   Baseline
SplitCamelCase                   123 ms   223.68% slower
ImprovedMakeNiceString            33 ms   13.16% faster
RefactoredMakeNiceString          11 ms   71.05% faster
RefactoredForEachMakeNiceString   12 ms   68.42% faster
CompiledRegex                     63 ms   65.79% slower
</pre>
<p>(Please check my math)</p>
<p>In the end, I'm going to replace what's there with <code>RefactoredForEachMakeNiceString()</code>, and while I'm at it, I'm going to rename it to something useful, like <code>SplitStringOnUpperCase</code>.</p>
<h3>Benchmark Test:</h3>
<p>To benchmark, I simply start a new <code>Stopwatch</code> around each test method:</p>
<pre><code>string myString = "ThisIsAUpperCaseString";
Stopwatch sw = new Stopwatch();
sw.Start();
runTest();
sw.Stop();

static void runTest()
{
    for (int i = 0; i &lt; 10000; i++)
    {
        MakeNiceString(myString);
    }
}
</code></pre>
<hr>
<h3>Questions</h3>
<ul>
<li>What causes these functions to be so different 'over the long 
haul', and </li>
<li>How can I improve this function a) to be more maintainable or b) to run faster?</li>
<li>How would I do memory benchmarks on these to see which used less memory?</li>
</ul>
<hr>
<p><strong>Thank you for your responses thus far. I've incorporated all of the suggestions made by @Jon Skeet, and would like feedback on the updated questions I've asked as a result.</strong></p>
<blockquote>
<p><strong>NB</strong>: This question is meant to explore ways to refactor string-handling functions in C#. I copied and pasted the first function <em>as is</em>. I'm well aware you can remove the <code>System.Convert.ToString()</code> in the first method, and I did just that. If anyone is aware of any implications of removing the <code>System.Convert.ToString()</code>, that would also be helpful to know.</p>
</blockquote>
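<p>For reference, the benchmark setup described under "Benchmark Test" can be sketched as one self-contained program. This is only a minimal sketch under stated assumptions: the class name <code>BenchmarkSketch</code> and the helper <code>Measure</code> are hypothetical, the warm-up call is an addition that was not part of the test above, and the method under test is an index-based <code>StringBuilder</code> variant using the proposed name <code>SplitStringOnUpperCase</code>.</p>

```csharp
using System;
using System.Diagnostics;
using System.Text;

static class BenchmarkSketch
{
    // StringBuilder-based splitter; the name follows the rename proposed in the post.
    public static string SplitStringOnUpperCase(string str)
    {
        if (string.IsNullOrEmpty(str)) return str;
        var sb = new StringBuilder(str.Length * 5 / 4);
        sb.Append(str[0]);
        for (int i = 1; i < str.Length; i++)
        {
            // Insert a space before every character that is not lower case.
            if (!char.IsLower(str[i])) sb.Append(' ');
            sb.Append(str[i]);
        }
        return sb.ToString();
    }

    // Hypothetical helper: runs `method` `iterations` times and returns elapsed Stopwatch ticks.
    public static long Measure(Func<string, string> method, string input, int iterations)
    {
        // Assumption: one untimed warm-up call so one-time start-up cost is excluded.
        method(input);
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            method(input);
        }
        sw.Stop();
        return sw.ElapsedTicks;
    }

    public static void Main()
    {
        const string myString = "ThisIsAUpperCaseString";
        long ticks = Measure(SplitStringOnUpperCase, myString, 10000);
        // Stopwatch ticks are counted in units of 1 / Stopwatch.Frequency seconds.
        double ms = ticks * 1000.0 / Stopwatch.Frequency;
        Console.WriteLine("SplitStringOnUpperCase took {0} ticks ({1:F2} ms)", ticks, ms);
    }
}
```

<p>Passing the method as a delegate keeps the timing loop identical for every variant, so any of the functions above with a <code>string</code>-to-<code>string</code> signature could be measured the same way. The delegate call adds a small constant overhead to each iteration.</p>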