<p>As others have commented, regular expressions may not be suitable for a bullet-proof method. E.g. using regex, it would be difficult to check whether the <code>&lt;title&gt;</code> tag is part of a quoted string within the HTML. That's a recurring response on Stack Overflow for questions like this. But personally, I think you've got a point that a parser would be overkill for such a simple extraction. If you're looking for a method that works <em>most</em> of the time, one of the following should suffice.</p>
<p><strong>Option 1: Lookbehind / lookahead</strong></p>
<pre><code>(?&lt;=&lt;title[\s\n]*&gt;[\s\n]*)((?![\s\n]*&lt;/title[\s\n]*&gt;).)*
</code></pre>
<p>This uses <a href="http://www.regular-expressions.info/lookaround.html" rel="nofollow noreferrer">lookbehind and lookahead</a> for the tags. .NET has a sophisticated regex engine that allows unbounded repetition inside a lookbehind, so you can even check for whitespace/return characters between the tag name and the closing bracket (see <a href="https://stackoverflow.com/questions/3314535/white-space-inside-xml-html-tags#3314572">this answer</a>). The negative lookahead is tested before each character is consumed, so the match stops just before the closing tag and any whitespace in front of it, without dropping the last character of the title.</p>
<p><strong>Option 2: Capturing group</strong></p>
<pre><code>&lt;title[\s\n]*&gt;[\s\n]*(.*)[\s\n]*&lt;/title[\s\n]*&gt;
</code></pre>
<p>Similar but slightly simpler: the whole regex match includes the start and end tags. The first (and only) capturing group <code>(.*)</code> captures the bit that is of interest in between. In both options, <code>.</code> does not match a newline unless <code>RegexOptions.Singleline</code> is set, so a title that spans several lines needs that option.</p>
<p>Visualisation: <img src="https://www.debuggex.com/i/nTVcC3ltpInvzX3W.png" alt="Regular expression visualization"></p>
<p><a href="http://www.debuggex.com/r/nTVcC3ltpInvzX3W" rel="nofollow noreferrer">Edit live on Debuggex</a></p>
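<p>As a rough illustration of the capturing-group approach (Option 2), here is a minimal sketch in Python rather than .NET. Python's <code>re</code> module only accepts fixed-width lookbehind, so Option 1 is not shown. The names <code>TITLE_RE</code> and <code>extract_title</code> are hypothetical, and the pattern is adapted under two assumptions that are not part of the answer: <code>[\s\n]*</code> is shortened to <code>\s*</code> (since <code>\s</code> already covers newlines), and the group is made lazy so trailing whitespace stays out of the capture.</p>

```python
import re

# Adapted from Option 2. Assumptions (not in the original answer):
#   - \s* stands in for [\s\n]*, because \s already matches newlines
#   - (.*?) is lazy, so whitespace before </title> is not captured
#   - DOTALL lets "." match newlines; IGNORECASE accepts <TITLE> as well
TITLE_RE = re.compile(r"<title\s*>\s*(.*?)\s*</title\s*>", re.IGNORECASE | re.DOTALL)


def extract_title(html):
    """Return the text of the first <title> element, or None if there is none."""
    match = TITLE_RE.search(html)
    return match.group(1) if match else None


if __name__ == "__main__":
    page = "<html><head><title >\n  My page\n</title></head></html>"
    print(extract_title(page))  # prints: My page
```

<p>Like the patterns above, this only aims to work most of the time: a <code>&lt;title&gt;</code> that appears inside a comment, script, or quoted string would still be matched.</p>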
 
