[Haskell-cafe] Parser left recursion

Wed Feb 20 08:13:16 CET 2013

Hello all,

this was previously asked on haskell-beginners, but only partially answered.

As an exercise I am writing a parser roughly following the expamples in Graham 
Hutton's book. The language contains things like:

data Exp = Lit Int -- literal integer
         | Plus Exp Exp

My naive parser enters an infinite recursion, when I try to parse "1+2". I do 
understand why:

"hmm, this expression could be a plus, but then it must start with an 
expression, lets check". 

and it tries to parse expression again and again considers Plus.

Twan van Laarhoven told me that:

> Left-recursion is always a problem for recursive-descend parsers.

and suggested to do something like:

>     parseExp = do
>       lit <- parseLit
>       pluses <- many (parsePlusToken *> parseLit)
>       return (combinePlusesWithLit lit pluses)
>
>     combinePlusesWithLit = foldr Plus -- or foldl

This indeed does the trick, but only when the first token is a Lit (literal 
integer). 

I then added the possibility to optionally put things in parentheses. But then  
I cannot parse "(1+2)+3". The original code fails, because "(1+2)" is not a 
Lit and when I allow an expression as the first argument to "+" I get infinite 
recursion again.

I am generally confused, that saying "a plus expression is an integer followed 
by many "plus somethings" is not what the language says. So this requires a 
lot of "paying attention" to get right. I'd much rather say "a plus expression 
is two expressions with a '+' in between".

I do know for sure, that it is possible to parse "(1+2)+3" (ghci does it just 
fine). But I seem to be missing a trick.

Can anyone shed some light on this?

-- 
Martin