<p>The problem is that this is rather awkward to match with the sort of regular expressions used in <code>lex</code>; if you had a system that supported Perl-style extensions (non-greedy quantifiers or lookahead, neither of which is part of POSIX EREs), then you'd be able to use either:</p>
<pre><code>&lt;!\[CDATA\[(.*?)\]\]&gt; </code></pre>
<p>or</p>
<pre><code>&lt;!\[CDATA\[((?:[^]]|\](?!\]&gt;))*)\]\]&gt; </code></pre>
<p>(The first uses a non-greedy quantifier, the second uses a negative lookahead constraint. OK, it uses non-capturing parens too, but you can use capturing ones there instead; that's not so important.)</p>
<p>It's probably easier to handle this with a strategy similar to the way C-style comments are handled in <code>lex</code>: have one rule that matches the start of the CDATA (on <code>&lt;![CDATA[</code>) and puts the lexer into a separate state, which it leaves on seeing <code>]]&gt;</code>, while collecting all the characters in between. <a href="http://www.cs.man.ac.uk/~pjj/cs212/ex2_str_comm.html" rel="nofollow">This page</a> is instructive on the topic (and it seems that this is an area where <code>flex</code> and <code>lex</code> differ), and it covers all the strategies that you can take to make this work.</p>
<p>Note that the cause of all these problems is that it's very difficult to write a rule with simple regular expressions that expresses the fact that a greedy regular expression must only match a <code>]</code> if it is not followed by <code>]&gt;</code>. It's much easier if you've only got a two-character (or single-character!) end-of-interesting-section sequence, because you don't need such an elaborate state machine.</p>
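<p>To make the comparison concrete, here is a minimal sketch in Python rather than <code>lex</code> (chosen only because its <code>re</code> module supports both non-greedy quantifiers and lookahead). The names <code>LAZY</code>, <code>LOOKAHEAD</code> and <code>scan_cdata</code> are hypothetical, and the hand-written scanner only imitates the "separate state" idea; it is not what <code>lex</code> or <code>flex</code> would generate.</p>

```python
import re

# Pattern 1: non-greedy quantifier. DOTALL is an added assumption so that
# '.' can also span newlines inside the CDATA section.
LAZY = re.compile(r"<!\[CDATA\[(.*?)\]\]>", re.DOTALL)

# Pattern 2: negative lookahead. A ']' is consumed only when it is not
# followed by ']>'. The ']' in the character class is escaped for clarity.
LOOKAHEAD = re.compile(r"<!\[CDATA\[((?:[^\]]|\](?!\]>))*)\]\]>")

START = "<![CDATA["
END = "]]>"


def scan_cdata(text):
    """Two-state scanner: enter a CDATA state on START, leave it on END,
    collecting every character seen in between."""
    sections = []
    buf = []
    in_cdata = False
    i = 0
    while i < len(text):
        if not in_cdata:
            if text.startswith(START, i):
                in_cdata = True          # switch to the CDATA state
                buf = []
                i += len(START)
            else:
                i += 1                   # ordinary input, ignored here
        elif text.startswith(END, i):
            sections.append("".join(buf))
            in_cdata = False             # back to the initial state
            i += len(END)
        else:
            buf.append(text[i])          # collect one character
            i += 1
    # An unterminated CDATA section is silently dropped in this sketch.
    return sections


sample = "<a><![CDATA[x ]] y ]> z]]></a><![CDATA[a]b]]]>"
print(LAZY.findall(sample))
print(LOOKAHEAD.findall(sample))
print(scan_cdata(sample))
```

<p>On this sample all three approaches should agree, including on the awkward input where the content itself ends in <code>]</code> right before the terminator.</p>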
 
