<p>Since Java supports variable-length look-behinds (as long as they are finite), you could do it like this:</p>
<pre><code>import java.util.regex.*;

public class RegexTest {
    public static void main(String[] argv) {
        // Split on ':' only when it is preceded by an even number of backslashes.
        Pattern p = Pattern.compile("(?&lt;=(?&lt;!\\\\)(?:\\\\\\\\){0,10}):");
        String text = "foo:bar\\:baz\\\\:qux\\\\\\:quux\\\\\\\\:corge";
        String[] parts = p.split(text);

        System.out.printf("Input string: %s\n", text);
        for (int i = 0; i &lt; parts.length; i++) {
            System.out.printf("Part %d: %s\n", i+1, parts[i]);
        }
    }
}
</code></pre>
<ul>
<li><code>(?&lt;=(?&lt;!\\)(?:\\\\){0,10})</code> looks behind for an even number of back-slashes (including zero, up to a maximum of 10 pairs, i.e. 20 back-slashes).</li>
</ul>
<p>Output:</p>
<blockquote>
<p><code>Input string: foo:bar\:baz\\:qux\\\:quux\\\\:corge</code><br>
<code>Part 1: foo</code><br>
<code>Part 2: bar\:baz\\</code><br>
<code>Part 3: qux\\\:quux\\\\</code><br>
<code>Part 4: corge</code>
</p>
</blockquote>
<p>Another way would be to match the parts themselves, instead of splitting at the delimiters.</p>
<pre><code>// Needs java.util.List and java.util.LinkedList in addition to java.util.regex.*
Pattern p2 = Pattern.compile("(?&lt;=\\A|\\G:)((?:\\\\.|[^:\\\\])*)");
List&lt;String&gt; parts2 = new LinkedList&lt;String&gt;();
Matcher m = p2.matcher(text);
while (m.find()) {
    parts2.add(m.group(1));
}
</code></pre>
<p>The strange syntax stems from the need to handle empty pieces at the start and end of the string. When a match spans exactly zero characters, the next attempt starts one character past the end of it. If it didn't, it would match another empty string, and another, ad infinitum&hellip;</p>
<ul>
<li><code>(?&lt;=\A|\G:)</code> looks behind for either the start of the string (the first piece), or the end of the previous match followed by the separator. If we used <code>(?:\A|\G:)</code> instead, it would fail if the first piece is empty (input starts with a separator).</li>
<li><code>\\.</code> matches any escaped character.</li>
<li><code>[^:\\]</code> matches any character that is neither a separator nor a back-slash, i.e. any character that is not part of an escape sequence (because <code>\\.</code> consumes both characters of those).</li>
<li><code>((?:\\.|[^:\\])*)</code> captures all characters up until the first non-escaped delimiter into capture-group 1.</li>
</ul>
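<p>To make the empty-piece behaviour concrete, here is a minimal, self-contained sketch of the matching approach above. The class name <code>EscapedSplit</code> and the method name <code>pieces</code> are made up for this illustration; only the regular expression comes from the answer.</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named for this example only.
public class EscapedSplit {
    // Same regex as p2 above: a piece starts at the beginning of the input
    // or right after the separator that follows the previous match.
    private static final Pattern PIECE =
            Pattern.compile("(?<=\\A|\\G:)((?:\\\\.|[^:\\\\])*)");

    // Collects every piece, including empty ones at the start, middle and end.
    public static List<String> pieces(String text) {
        List<String> out = new ArrayList<>();
        Matcher m = PIECE.matcher(text);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        // Input as seen by the regex engine:  :a\:b::c:
        String text = ":a\\:b::c:";
        List<String> parts = pieces(text);
        for (int i = 0; i < parts.size(); i++) {
            System.out.printf("Part %d: [%s]%n", i + 1, parts.get(i));
        }
    }
}
```

<p>For the input <code>:a\:b::c:</code> this should yield five pieces: an empty one, <code>a\:b</code>, another empty one, <code>c</code>, and a final empty one. Note that <code>Pattern.split(text)</code> drops trailing empty strings by default, so the look-behind variant would return only four pieces here unless it is called as <code>p.split(text, -1)</code>.</p>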
 
