StackOverflow2013

Note that there are some explanatory texts on larger screens.

plurals

PO
text
Body
copied!<p>LL is usually a more efficient parsing technique than recursive-descent. In fact, a naive recursive-descent parser will actually be <em>O(k^n)</em> (where <em>n</em> is the input size) in the worst case. Some techniques such as memoization (which yields a <a href="http://en.wikipedia.org/wiki/Packrat_parsing#Implementing_parsers_from_parsing_expression_grammars" rel="noreferrer">Packrat</a> parser) can improve this as well as extend the class of grammars accepted by the parser, but there is always a space tradeoff. LL parsers are (to my knowledge) always linear time.</p> <p>On the flip side, you are correct in your intuition that recursive-descent parsers can handle a greater class of grammars than LL. Recursive-descent can handle any grammar which is LL(*) (that is, <em>unlimited</em> lookahead) as well as a small set of ambiguous grammars. This is because recursive-descent is actually a directly-encoded implementation of PEGs, or <a href="http://en.wikipedia.org/wiki/Packrat_parsing" rel="noreferrer">Parser Expression Grammar(s)</a>. Specifically, the disjunctive operator (<code>a | b</code>) is not commutative, meaning that <code>a | b</code> does not equal <code>b | a</code>. A recursive-descent parser will try each alternative in order. So if <code>a</code> matches the input, it will succede even if <code>b</code> <em>would have</em> matched the input. This allows classic "longest match" ambiguities like the dangling <code>else</code> problem to be handled simply by ordering disjunctions correctly.</p> <p>With all of that said, it is <em>possible</em> to implement an LL(k) parser using recursive-descent so that it runs in linear time. This is done by essentially inlining the predict sets so that each parse routine determines the appropriate production for a given input in constant time. Unfortunately, such a technique eliminates an entire class of grammars from being handled. Once we get into predictive parsing, problems like dangling <code>else</code> are no longer solvable with such ease.</p> <p>As for why LL would be chosen over recursive-descent, it's mainly a question of efficiency and maintainability. Recursive-descent parsers are markedly easier to implement, but they're usually harder to maintain since the grammar they represent does not exist in any declarative form. Most non-trivial parser use-cases employ a parser generator such as ANTLR or Bison. With such tools, it really doesn't matter if the algorithm is directly-encoded recursive-descent or table-driven LL(k).</p> <p>As a matter of interest, it is also worth looking into <a href="http://en.wikipedia.org/wiki/Recursive_ascent" rel="noreferrer">recursive-ascent</a>, which is a parsing algorithm directly encoded after the fashion of recursive-descent, but capable of handling any LALR grammar. I would also dig into <a href="http://en.wikipedia.org/wiki/Parser_combinators" rel="noreferrer">parser combinators</a>, which are a functional way of composing recursive-descent parsers together.</p>

Querying!

Guidance

An individual column

Larger individual text columns get their own page to allow for proper reading.

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload