Parsing optional semicolon at statement end
<p>I am writing a parser for a C-like grammar.</p>
<p>At first, it could parse code like this:</p>
<pre><code>a = 1;
b = 2;
</code></pre>
<p>Now I want to make the semicolon at the end of a line optional.</p>
<p>The original YACC rule was:</p>
<pre><code>stmt: expr ';' { ... }
</code></pre>
<p>Newlines are handled by a lexer that I wrote myself (the code is simplified):</p>
<pre><code>rule(/\r\n|\r|\n/) {
  increase_lineno();
  return :PASS
}
</code></pre>
<p>Returning <code>:PASS</code> here is equivalent to returning nothing in LEX: it drops the currently matched text and skips to the next rule, just as is usually done with whitespace.</p>
<p>Because of this, I can't simply change my YACC rule to:</p>
<pre><code>stmt: expr end_of_stmt { ... }
    ;
end_of_stmt: ';'
    | '\n'
    ;
</code></pre>
<p>So I chose to have the parser change the lexer's state dynamically.</p>
<p>Like this:</p>
<pre><code>stmt: expr { state = :STATEMENT_END } ';' { ... }
</code></pre>
<p>I also added a lexer rule that matches a newline in the new state:</p>
<pre><code>rule(/\r\n|\r|\n/, :STATEMENT_END) {
  increase_lineno();
  state = nil;
  return ';'
}
</code></pre>
<p>This means that when the lexer is in the <code>:STATEMENT_END</code> state, it first increases the line number as usual, then resets the state to the initial one, and then returns the newline as if it were a semicolon.</p>
<p>Strangely, this doesn't actually work with the following code:</p>
<pre><code>a = 1
b = 2
</code></pre>
<p>I debugged it and found that the lexer does not return a <code>';'</code> as expected when it scans the newline after the number <code>1</code>; the state-specific rule is never executed.</p>
<p>The code that sets the new state runs only after the lexer has already scanned the newline and returned nothing. In other words, things happen in the following order:</p>
<ol>
<li>scan <code>a</code>, <code>=</code> and <code>1</code></li>
<li>scan the newline and skip it, so the next token is <code>b</code></li>
<li>the inserted code (<code>{ state = :STATEMENT_END }</code>) is executed</li>
<li>an error is raised: unexpected <code>b</code> here</li>
</ol>
<p>This is what I expected:</p>
<ol>
<li>scan <code>a</code>, <code>=</code> and <code>1</code></li>
<li>find that it matches the rule <code>expr</code>, so reduce to <code>stmt</code></li>
<li>execute the inserted code to set the new lexer state</li>
<li>scan the newline and return a <code>;</code> according to the rule for the new state</li>
<li>continue to scan &amp; parse the following line</li>
</ol>
<p>After some investigation, I found that this might be because YACC uses LALR(1): the parser reads one token ahead first. When it scans the newline, the state has not been set yet, so it cannot get the correct token.</p>
<p>My question is: how can I make it work as expected? I have no idea how to approach this.</p>
<p>Thanks.</p>
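<p>The two orderings described above can be reproduced with a small stand-alone sketch. This is not the real lexer or YACC itself: <code>ToyLexer</code> and <code>tokens_seen</code> are hypothetical names invented for illustration, and the "parser" is reduced to a fixed sequence of token requests for a single <code>a = 1</code> statement. The only thing it models is whether the state switch happens before or after the lookahead token is requested.</p>

```ruby
# Hypothetical toy lexer that mirrors the question's two newline rules.
class ToyLexer
  attr_accessor :state
  attr_reader :lineno

  def initialize(source)
    # Split into newlines, identifiers, numbers and single symbols;
    # spaces and tabs match none of the alternatives and are dropped.
    @tokens = source.scan(/\r\n|\r|\n|[A-Za-z_]\w*|\d+|\S/)
    @state = nil
    @lineno = 1
  end

  # Returns the next token, or nil at end of input.
  def next_token
    while (tok = @tokens.shift)
      if tok.match?(/\A(?:\r\n|\r|\n)\z/)
        @lineno += 1
        if @state == :STATEMENT_END
          @state = nil
          return ';'   # state-specific rule: newline pretends to be ';'
        end
        next           # default rule: skip the newline (the :PASS case)
      end
      return tok
    end
    nil
  end
end

# Assumption: the parser needs exactly one token after `a = 1` before it
# can act. The flag chooses when the state switch happens relative to
# that lookahead request.
def tokens_seen(source, set_state_before_lookahead:)
  lexer = ToyLexer.new(source)
  seen = []
  3.times { seen << lexer.next_token }                  # a, =, 1
  lexer.state = :STATEMENT_END if set_state_before_lookahead
  seen << lexer.next_token                              # the lookahead token
  lexer.state = :STATEMENT_END unless set_state_before_lookahead
  seen
end

observed = tokens_seen("a = 1\nb = 2\n", set_state_before_lookahead: false)
expected = tokens_seen("a = 1\nb = 2\n", set_state_before_lookahead: true)
p observed  # → ["a", "=", "1", "b"]
p expected  # → ["a", "=", "1", ";"]
```

<p>In the first run the newline has already been consumed and skipped by the time the state is set, which matches the "unexpected <code>b</code>" error described above. In the second run the state is in place before the lookahead is requested, so the newline comes back as <code>;</code>.</p>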