<p>Ragel works fine. You just need to be careful about what you're matching. Your question uses both <code>[[tag]]</code> and <code>{{tag}}</code>, but your example uses <code>[[tag]]</code>, so I figure that's what you're trying to treat as special.</p>
<p>What you want to do is eat text until you hit an open-bracket. If that bracket is followed by another bracket, then it's time to start eating lowercase characters till you hit a close-bracket. Since the text in the tag cannot include any bracket, you know that the only non-error character that can follow that close-bracket is another close-bracket. At that point, you're back where you started.</p>
<p>Well, that's a verbatim description of this machine:</p>
<pre><code>tag = '[[' lower+ ']]';

main := (
  (any - '[')*      # eat text
  ('[' ^'[' | tag)  # try to eat a tag
)*;
</code></pre>
<p>The tricky part is, where do you call your actions? I don't claim to have the best answer to that, but here's what I came up with:</p>
<pre><code>static char *text_start;

%%{
  machine parser;

  action MarkStart { text_start = fpc; }
  action PrintTextNode {
    int text_len = fpc - text_start;
    if (text_len &gt; 0) {
      printf("TEXT(%.*s)\n", text_len, text_start);
    }
  }
  action PrintTagNode {
    int text_len = fpc - text_start - 1;  /* drop closing bracket */
    printf("TAG(%.*s)\n", text_len, text_start);
  }

  tag = '[[' (lower+ &gt;MarkStart) ']]' @PrintTagNode;

  main := (
    (any - '[')* &gt;MarkStart %PrintTextNode
    ('[' ^'[' %PrintTextNode | tag) &gt;MarkStart
  )* @eof(PrintTextNode);
}%%
</code></pre>
<p>There are a few non-obvious things:</p>
<ul>
<li>The <code>eof</code> action is needed because <code>%PrintTextNode</code> is only ever invoked on leaving a machine. If the input ends with normal text, there will be no input to make it leave that state. Because it will also be called when the input ends with a tag, when there is no final, unprinted text node, <code>PrintTextNode</code> tests that it has some text to print.</li>
<li>The <code>%PrintTextNode</code> action nestled in after the <code>^'['</code> is needed because, though we marked the start when we hit the <code>[</code>, after we hit a non-<code>[</code> we'll start trying to parse anything again and re-mark the start point. We need to flush those two characters before that happens, hence that action invocation.</li>
</ul>
<p>The full parser follows. I did it in C because that's what I know, but you should be able to turn it into whatever language you need pretty readily:</p>
<pre><code>/* ragel so_tag.rl &amp;&amp; gcc so_tag.c -o so_tag */
#include &lt;stdio.h&gt;
#include &lt;string.h&gt;

static char *text_start;

%%{
  machine parser;

  action MarkStart { text_start = fpc; }
  action PrintTextNode {
    int text_len = fpc - text_start;
    if (text_len &gt; 0) {
      printf("TEXT(%.*s)\n", text_len, text_start);
    }
  }
  action PrintTagNode {
    int text_len = fpc - text_start - 1;  /* drop closing bracket */
    printf("TAG(%.*s)\n", text_len, text_start);
  }

  tag = '[[' (lower+ &gt;MarkStart) ']]' @PrintTagNode;

  main := (
    (any - '[')* &gt;MarkStart %PrintTextNode
    ('[' ^'[' %PrintTextNode | tag) &gt;MarkStart
  )* @eof(PrintTextNode);
}%%

%% write data;

int
main(void) {
  char buffer[4096];
  int cs;
  char *p = NULL;
  char *pe = NULL;
  char *eof = NULL;

  %% write init;

  do {
    size_t nread = fread(buffer, 1, sizeof(buffer), stdin);
    p = buffer;
    pe = p + nread;
    if (nread &lt; sizeof(buffer) &amp;&amp; feof(stdin)) eof = pe;

    %% write exec;

    if (eof || cs == %%{ write error; }%%) break;
  } while (1);
  return 0;
}
</code></pre>
<p>Here's some test input:</p>
<pre><code>[[header]]
&lt;html&gt;
&lt;head&gt;&lt;title&gt;title&lt;/title&gt;&lt;/head&gt;
&lt;body&gt;
&lt;h1&gt;[[headertext]]&lt;/h1&gt;
&lt;p&gt;I am feeling very [[emotion]].&lt;/p&gt;
&lt;p&gt;I like brackets: [ is cool. ] is cool. [] are cool. But [[tag]] is special.&lt;/p&gt;
&lt;/body&gt;
&lt;/html&gt;
[[footer]]
</code></pre>
<p>And here's the output from the parser:</p>
<pre><code>TAG(header)
TEXT(
&lt;html&gt;
&lt;head&gt;&lt;title&gt;title&lt;/title&gt;&lt;/head&gt;
&lt;body&gt;
&lt;h1&gt;)
TAG(headertext)
TEXT(&lt;/h1&gt;
&lt;p&gt;I am feeling very )
TAG(emotion)
TEXT(.&lt;/p&gt;
&lt;p&gt;I like brackets: )
TEXT([ )
TEXT(is cool. ] is cool. )
TEXT([])
TEXT( are cool. But )
TAG(tag)
TEXT( is special.&lt;/p&gt;
&lt;/body&gt;
&lt;/html&gt;
)
TAG(footer)
TEXT(
)
</code></pre>
<p>The final text node contains only the newline at the end of the file.</p>
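<p>For comparison, the prose description of the machine (eat text until an open-bracket, then either pass a lone bracket through or read a <code>[[tag]]</code>) can also be sketched by hand in plain C. This is not what Ragel generates, only a rough equivalent under a few assumptions: the names <code>scan_tags</code> and <code>emit</code> are made up for illustration, the whole input is already in memory, nodes are appended to a caller-supplied buffer instead of being printed, and a malformed or truncated tag simply returns <code>-1</code>.</p>

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: append "KIND(text)\n" to the NUL-terminated
   buffer out. Output that does not fit in cap bytes is truncated. */
static void emit(char *out, size_t cap, const char *kind,
                 const char *s, size_t len)
{
    size_t used = strlen(out);
    if (used + 1 < cap)
        snprintf(out + used, cap - used, "%s(%.*s)\n", kind, (int)len, s);
}

/* Hypothetical hand-written scanner that follows the same steps as the
   Ragel machine. Returns 0 on success, -1 on a malformed tag. */
int scan_tags(const char *in, size_t len, char *out, size_t cap)
{
    assert(in != NULL && out != NULL && cap > 0);

    const char *p = in;
    const char *end = in + len;
    const char *start = in;          /* start of the pending text node */
    out[0] = '\0';

    while (p < end) {
        if (*p != '[') {             /* eat text */
            p++;
            continue;
        }
        if (p > start)               /* flush text seen before the '[' */
            emit(out, cap, "TEXT", start, (size_t)(p - start));
        if (p + 1 == end) {          /* lone '[' at the very end */
            start = p;
            break;
        }
        if (p[1] != '[') {           /* '[' plus a non-bracket: plain text */
            emit(out, cap, "TEXT", p, 2);
            p += 2;
            start = p;
            continue;
        }
        /* Saw "[[": expect one or more lowercase letters, then "]]". */
        const char *name = p + 2;
        const char *q = name;
        while (q < end && islower((unsigned char)*q))
            q++;
        if (q == name || end - q < 2 || q[0] != ']' || q[1] != ']')
            return -1;               /* assumption: treat as an error */
        emit(out, cap, "TAG", name, (size_t)(q - name));
        p = q + 2;
        start = p;
    }
    if (end > start)                 /* plays the role of the eof action */
        emit(out, cap, "TEXT", start, (size_t)(end - start));
    return 0;
}
```

<p>Calling <code>scan_tags("a [b] [[tag]] c", 15, out, sizeof out)</code> should leave the lines <code>TEXT(a )</code>, <code>TEXT([b)</code>, <code>TEXT(] )</code>, <code>TAG(tag)</code> and <code>TEXT( c)</code> in <code>out</code>, which mirrors how the Ragel actions flush a lone bracket as its own text node.</p>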