Regex to strip line comments from C#
    <p>I'm working on a routine to strip block <em>or</em> line comments from some C# code. I have looked at the other examples on the site, but haven't found the <em>exact</em> answer that I'm looking for.</p>
    <p>I can match block comments (/* comment */) in their entirety using this regular expression with RegexOptions.Singleline:</p>
    <p><code>(/\*[\w\W]*\*/)</code></p>
    <p>And I can match line comments (// comment) in their entirety using this regular expression with RegexOptions.Multiline:</p>
    <p><code>(//((?!\*/).)*)(?!\*/)[^\r\n]</code></p>
    <p><em>Note: I'm using <code>[^\r\n]</code> instead of <code>$</code> because <code>$</code> is including <code>\r</code> in the match, too.</em></p>
    <p>However, this doesn't <em>quite</em> work the way I want it to.</p>
    <p>Here is my test code that I'm matching against:</p>
    <pre><code>// remove whole line comments
bool broken = false; // remove partial line comments
if (broken == true)
{
    return "BROKEN";
}
/* remove block comments
else
{
    return "FIXED";
} // do not remove nested comments */
bool working = !broken;
return "NO COMMENT";
</code></pre>
    <p>The block expression matches</p>
    <pre><code>/* remove block comments
else
{
    return "FIXED";
} // do not remove nested comments */
</code></pre>
    <p>which is fine and good, but the line expression matches</p>
    <pre><code>// remove whole line comments
// remove partial line comments
</code></pre>
    <p><em>and</em></p>
    <pre><code>// do not remove nested comments
</code></pre>
    <p>Also, if I do not have the <code>*/</code> negative lookahead in the line expression twice, it matches</p>
    <pre><code>// do not remove nested comments *
</code></pre>
    <p>which I <em>really</em> don't want.</p>
    <p>What I want is an expression that will match characters, starting with <code>//</code>, to the end of line, but does <em>not</em> contain <code>*/</code> between the <code>//</code> and end of line.</p>
    <p>Also, just to satisfy my curiosity, can anyone explain why I need the lookahead twice? <code>(//((?!\*/).)*)[^\r\n]</code> and <code>(//(.)*)(?!\*/)[^\r\n]</code> will both include the *, but <code>(//((?!\*/).)*)(?!\*/)[^\r\n]</code> and <code>(//((?!\*/).)*(?!\*/))[^\r\n]</code> won't.</p>
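    <p>A minimal sketch of one possible direction, not a confirmed solution: instead of running the two expressions separately, both comment kinds can be put into a single alternation. The regex engine scans left to right, so a block comment is consumed whole before any <code>//</code> inside it can start a match of its own. The names <code>CommentStripper</code> and <code>Strip</code> are hypothetical, the block part uses a lazy quantifier so it stops at the first <code>*/</code>, and the sketch assumes that <code>//</code> and <code>/*</code> never occur inside string literals.</p>

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class CommentStripper
{
    // Left alternative: a block comment, lazily matched up to the first "*/".
    // Right alternative: a line comment, up to (not including) the line break.
    // Assumption: no "//" or "/*" appears inside a string literal.
    private static readonly Regex CommentPattern = new Regex(
        @"/\*[\s\S]*?\*/|//[^\r\n]*");

    // Returns the source with every matched comment replaced by nothing.
    public static string Strip(string source)
    {
        return CommentPattern.Replace(source, string.Empty);
    }

    public static void Main()
    {
        string code =
            "bool broken = false; // remove partial line comments\r\n" +
            "/* remove block comments\r\n" +
            "} // do not remove nested comments */\r\n" +
            "return \"NO COMMENT\";";

        // The "//" inside the block comment is never matched separately,
        // because the block alternative has already consumed it.
        Console.WriteLine(Strip(code));
    }
}
```

    <p>Because the character class <code>[^\r\n]*</code> stops before either line-break character, this variant does not need <code>$</code> or RegexOptions.Multiline, and the line breaks themselves stay in the output.</p>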