Note that there are some explanatory texts on larger screens.

plurals
  1. POHow can we match a^n b^n with Java regex?
    text
    copied!<blockquote> <p><sub><em>This is the second part of a series of educational regex articles. It shows how lookaheads and nested references can be used to match the non-regular languge a<sup>n</sup>b<sup>n</sup>. Nested references are first introduced in: <a href="https://stackoverflow.com/questions/3627681/how-does-this-regex-find-triangular-numbers/">How does this regex find triangular numbers?</a></em></sub></p> </blockquote> <p>One of the archetypal non-<a href="http://en.wikipedia.org/wiki/Regular_language" rel="noreferrer">regular languages</a> is:</p> <blockquote> <p><code>L = { a</code><sup>n</sup><code>b</code><sup>n</sup><code>: n &gt; 0 }</code> </p> </blockquote> <p>This is the language of all non-empty strings consisting of some number of <code>a</code>'s followed by an equal number of <code>b</code>'s. Examples of strings in this language are <code>ab</code>, <code>aabb</code>, <code>aaabbb</code>.</p> <p>This language can be show to be non-regular by the <a href="http://en.wikipedia.org/wiki/Pumping_lemma_for_regular_languages" rel="noreferrer">pumping lemma</a>. It is in fact an archetypal <a href="http://en.wikipedia.org/wiki/Context-free_language" rel="noreferrer">context-free language</a>, which can be generated by the <a href="http://en.wikipedia.org/wiki/Context-free_language" rel="noreferrer">context-free grammar</a> <code>S → aSb | ab</code>.</p> <p>Nonetheless, modern day regex implementations clearly recognize more than just regular languages. That is, they are not "regular" by formal language theory definition. PCRE and Perl supports recursive regex, and .NET supports balancing groups definition. Even less "fancy" features, e.g. backreference matching, means that regex is not regular.</p> <p>But just how powerful is this "basic" features? Can we recognize <code>L</code> with Java regex, for example? Can we perhaps combine lookarounds and nested references and have a pattern that works with e.g. <a href="http://download.oracle.com/javase/6/docs/api/java/lang/String.html#matches%28java.lang.String%29" rel="noreferrer"><code>String.matches</code></a> to match strings like <code>ab</code>, <code>aabb</code>, <code>aaabbb</code>, etc?</p> <h3>References</h3> <ul> <li><a href="http://perldoc.perl.org/perlfaq6.html#Can-I-use-Perl-regular-expressions-to-match-balanced-text?" rel="noreferrer">perlfaq6: Can I use Perl regular expressions to match balanced text?</a></li> <li><a href="http://msdn.microsoft.com/en-us/library/bs2twtah.aspx#balancing_group_definition" rel="noreferrer">MSDN - Regular Expression Language Elements - Balancing Group Definitions</a></li> <li><a href="http://www.pcre.org/pcre.txt" rel="noreferrer">pcre.org - PCRE man page</a></li> <li><a href="http://www.regular-expressions.info/" rel="noreferrer">regular-expressions.info</a> - <a href="http://www.regular-expressions.info/lookaround.html" rel="noreferrer">Lookarounds</a> and <a href="http://www.regular-expressions.info/brackets.html" rel="noreferrer">Grouping and Backreferences</a></li> <li><a href="http://java.sun.com/javase/6/docs/api/java/util/regex/Pattern.html" rel="noreferrer"><code>java.util.regex.Pattern</code></a></li> </ul> <h3>Linked questions</h3> <ul> <li><a href="https://stackoverflow.com/questions/2974210/does-lookaround-affect-which-languages-can-be-matched-by-regular-expressions">Does lookaround affect which languages can be matched by regular expressions?</a></li> <li><a href="https://stackoverflow.com/questions/3349999/net-regex-recursive-patterns">.NET Regex Balancing Groups vs PCRE Recursive Patterns</a></li> </ul>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload