Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    primarykey
    data
    text
    <p>What you're writing is called a pushdown automaton. This is usually more power than you need to write a lexer, it's certainly excessive if you're writing a lexer for a modern language like CSS. A recursive descent parser is close in power to a pushdown automaton, but recursive descent parsers are much easier to write and to understand. Most parser generators generate pushdown automatons.</p> <p>Lexers are almost always written as finite state machines, i.e., like your code except get rid of the "stack" object. Finite state machines are closely related to regular expressions (actually, they're provably equivalent to one another). When designing such a parser, one usually starts with the regular expressions and uses them to create a deterministic finite automaton, with some extra code in the transitions to record the beginning and end of each token.</p> <p>There are tools to do this. The <code>lex</code> tool and its descendants are well known and have been translated into many languages. The <code>ANTLR</code> toolchain also has a lexer component. My preferred tool is <code>ragel</code> on platforms that support it. There is little benefit to writing a lexer by hand <em>most</em> of the time, and the code generated by these tools will probably be faster and more reliable.</p> <p>If you do want to write your own lexer by hand, good ones often look something like this:</p> <pre><code>function readToken() // note: returns only one token each time while !eof c = peekChar() if c in A-Za-z return readIdentifier() else if c in 0-9 return readInteger() else if c in ' \n\r\t\v\f' nextChar() ... return EOF function readIdentifier() ident = "" while !eof c = nextChar() if c in A-Za-z0-9 ident.append(c) else return Token(Identifier, ident) // or maybe... return Identifier(ident) </code></pre> <p>Then you can write your parser as a recursive descent parser. Don't try to combine lexer / parser stages into one, it leads to a total mess of code. (According to the Parsec author, it's slower, too).</p>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. VO
      singulars
      1. This table or related slice is empty.
    2. VO
      singulars
      1. This table or related slice is empty.
    3. VO
      singulars
      1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload