<h3>Crazy solution</h3>
<p>A spartan regex that tokenizes and also <strong>validates</strong> all the tokens that it extracts:</p>
<pre><code>\G\s*+((\((?:\s*+(?2)\s*+(?(?!\)),)|\s*+[^()',\s]++\s*+(?(?!\)),)|\s*+'[^'\r\n]*+'\s*+(?(?!\)),))++\))|[^()',\s]++|'[^'\r\n]*+')\s*+(?:,|$)</code></pre>
<p><a href="http://www.regex101.com/r/zQ4pE8" rel="nofollow"><strong>Regex101</strong></a></p>
<p>Put it in a string literal, with delimiters:</p>
<pre><code>'/\G\s*+((\((?:\s*+(?2)\s*+(?(?!\)),)|\s*+[^()\',\s]++\s*+(?(?!\)),)|\s*+\'[^\'\r\n]*+\'\s*+(?(?!\)),))++\))|[^()\',\s]++|\'[^\'\r\n]*+\')\s*+(?:,|$)/'</code></pre>
<p><a href="http://ideone.com/vNHBKQ" rel="nofollow"><strong>ideone</strong></a></p>
<p>The result is in capturing group 1. In the example on ideone, I specify the <code>PREG_OFFSET_CAPTURE</code> flag, so that you can check the <em>last match</em> in group 0 (the entire match) to see whether the whole source string has been consumed.</p>
<h3>Assumptions</h3>
<ul>
<li>Non-quoted text may not contain any whitespace character, as defined by <code>\s</code>. Consequently, it may not span multiple lines.</li>
<li>Non-quoted text may not contain <code>(</code>, <code>)</code>, <code>'</code> or <code>,</code>.</li>
<li>Non-quoted text must contain at least 1 character.</li>
<li>Single-quoted text may not span multiple lines.</li>
<li>Single-quoted text may not contain a quote. Consequently, there is no way to specify <code>'</code>.</li>
<li>Single-quoted text may be empty.</li>
<li>A bracket token contains one or more of the following as sub-tokens: a non-quoted text token, a single-quoted text token, or another bracket token.</li>
<li>In a bracket token, 2 adjacent sub-tokens are separated by exactly one <code>,</code>.</li>
<li>A bracket token starts with <code>(</code> and ends with <code>)</code>.</li>
<li>Consequently, a bracket token must have balanced brackets, and the empty bracket <code>()</code> is not allowed.</li>
<li>The input contains one or more of: non-quoted text, single-quoted text or a bracket token. The tokens in the input are separated by a comma <code>,</code>. A single trailing comma <code>,</code> is considered valid.</li>
<li>Whitespace characters (as defined by <code>\s</code>, which includes newline characters) are arbitrarily allowed between tokens, the commas <code>,</code> separating tokens, and the brackets <code>(</code>, <code>)</code> of the bracket tokens.</li>
</ul>
<h3>Breakdown</h3>
<pre>
\G\s*+
(
  (
    \(
    (?:
        \s*+
        (?2)
        \s*+
        (?(?!\)),)
      |
        \s*+
        [^()',\s]++
        \s*+
        (?(?!\)),)
      |
        \s*+
        '[^'\r\n]*+'
        \s*+
        (?(?!\)),)
    )++
    \)
  )
  |
  [^()',\s]++
  |
  '[^'\r\n]*+'
)
\s*+(?:,|$)
</pre>
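<p>For cross-checking the assumptions above, here is a minimal hand-written sketch of the same grammar in Python. It is not the regex from this answer: Python's <code>re</code> module has no recursion, so this swaps in a small recursive-descent parser instead. The names <code>tokenize</code>, <code>TokenError</code>, <code>_parse_token</code> and <code>_skip_ws</code> are made up for illustration, bracket tokens are returned as nested lists rather than raw text, and <code>str.isspace()</code> stands in for <code>\s</code>. Edge cases that the list does not spell out (for example a trailing comma inside a bracket, which this sketch rejects) may be handled differently by the regex.</p>

```python
class TokenError(ValueError):
    """Raised when the input violates the assumed grammar (hypothetical name)."""


def _skip_ws(s, i):
    # Advance past whitespace; str.isspace() is used as a stand-in for \s.
    while i < len(s) and s[i].isspace():
        i += 1
    return i


def _parse_token(s, i):
    """Parse one token starting at s[i]; return (value, index after token)."""
    if i >= len(s):
        raise TokenError(f"token expected at offset {i}")
    if s[i] == "'":
        # Single-quoted text: may be empty, no quote inside, single line only.
        j = i + 1
        while j < len(s) and s[j] not in "'\r\n":
            j += 1
        if j >= len(s) or s[j] != "'":
            raise TokenError(f"unterminated quoted text at offset {i}")
        return s[i:j + 1], j + 1
    if s[i] == "(":
        # Bracket token: one or more sub-tokens separated by single commas.
        items = []
        i = _skip_ws(s, i + 1)
        while True:
            item, i = _parse_token(s, i)
            items.append(item)
            i = _skip_ws(s, i)
            if i < len(s) and s[i] == ")":
                return items, i + 1
            if i < len(s) and s[i] == ",":
                i = _skip_ws(s, i + 1)
                continue
            raise TokenError(f"expected ',' or ')' at offset {i}")
    # Non-quoted text: at least one character, none of ( ) ' , or whitespace.
    j = i
    while j < len(s) and s[j] not in "()'," and not s[j].isspace():
        j += 1
    if j == i:
        raise TokenError(f"token expected at offset {i}")
    return s[i:j], j


def tokenize(s):
    """Return the top-level tokens of s, or raise TokenError if s is invalid."""
    tokens = []
    i = _skip_ws(s, 0)
    while True:
        tok, i = _parse_token(s, i)
        tokens.append(tok)
        i = _skip_ws(s, i)
        if i == len(s):
            return tokens
        if s[i] != ",":
            raise TokenError(f"expected ',' at offset {i}")
        i = _skip_ws(s, i + 1)
        if i == len(s):
            # A single trailing comma is accepted.
            return tokens
```

<p>Quoted tokens keep their surrounding quotes, as group 1 of the regex does. Because the parser either consumes the whole input or raises, there is no separate "was everything consumed" check as with <code>PREG_OFFSET_CAPTURE</code>.</p>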
 
