[Haskell-cafe] Parsec-like parser combinator that handles left
oleg at okmij.org
oleg at okmij.org
Thu Dec 10 02:16:34 EST 2009
There are at least two parser combinator libraries that can deal with
*any* left-recursive grammars. That said, Prof. Swierstra's advice to
try to get rid of left recursion is still well worth to follow.
The first library is described in
Frost, Richard, Hafiz, Rahmatullah, and Callaghan, Paul.
Parser Combinators for Ambiguous Left-Recursive Grammars. PADL2008.
It handles arbitrary left-recursive grammars, including eps-cycles.
It copes with highly ambiguous grammars. It can produce all parses of
the input sequence. It parses the input in O(n^4) time with respect to
the size of the input.
I have tried dealing with left-recursive grammars and too wrote a parser
It can handle eps-cycles, ambiguity and other pathology. Here is a
sample bad grammar:
S -> S A C | C
A -> B | aCa
B -> B
C -> b | C A
The library is pure and uses no state (monad). Strictly speaking, the
library produces a recognizer rather than a parser, or a recognizer
that yields a list of fired productions. There are standard ways
however to `lift' a recognizer to produce a parse tree. The library is
far simpler than than of Frost et al, and does not require explicit
labeling of productions. Although the library can be faster than
Frost's et al when a parse exists, it currently can be far slower when
no parse exists. (It still assuredly terminates.)
Both libraries require that the whole input be known in advance: neither
library does on-line parsing. The reason has to do with memoization
and with the cut-off for the recursion (a productive left-recursive
rule shouldn't be applied more times than there are characters in the
input). That is quite a disappointment.
Therefore, it is well worth to convert the left recursion away, as was
described in Doaitse Swierstra's earlier message.
More information about the Haskell-Cafe