[Haskell-cafe] [Parsec] Backtracking with try does not work for me?

Chris Kuklewicz haskell at list.mightyreason.com
Mon Jul 31 13:51:27 EDT 2006

The semantics of Parsec's "optional" operation are what is causing the problem.

"optional foo" can have 3 results:
   1) foo can succeed, optional succeeds, proceed to next command
   2) foo can fail without consuming any input, optional succeeds proceed to 
next command
   3) foo can fail after consuming some input, optional fails, do not proceed

 > minilang = do
 >        char 'a'
 >        optional (do {comma ; char 'b'})

The comma in the above line consumes input even in the "a,c" case.  When "c" is 
seen the "char 'b'" fails and then the optional fails, and you get the error 
message you posted.

 >        optional (do {comma ; char 'c'})
 >        eof
 >        return "OK"

 > Apparently, "try" was used (do note that the column number indicates
 > that there was backtracking) but the parser still fails for
 > "a,c". Why?

Your next attempt does not fix the problem, since the try is in the wrong place 
( http://www.cs.uu.nl/~daan/download/parsec/parsec.html#try may help)

> minilang = do
>        char 'a'
>        try (optional (do {comma ; char 'b'}))

In the above line, the ",c" causes (char 'b') to fail, which causes 'optional' 
to fail, and then "try" also fails.  The "try" alters the stream so that the 
"comma" was not consumed, but the "try" still passes along the failure.

In neither the original or the modified minilang does the 'char "c"' line ever 
get reached in the "a,c" input case.

The working solution is a small tweak:

> minilang = do
>        char 'a'
>        optional (try (do {comma ; char 'b'}))
>        optional (do {comma ; char 'c'})
>        eof
>        return "OK"

Now the "a,c" case causes the (char 'b') to fail, and then the "try" also fails, 
but also acts as if the comma had not been consumed.  Thus we are in case #2 of 
the semantics of "optional" and so "optional" succeeds instead of failing, 
allowing the next line to parse ",c" then eof then return "OK".

There is a very very important difference to Parsec between failing with and 
without having consumed input.  It means Parsec can be more efficient, since any 
branch that consumes input cannot backtrack.  The "try" command is a way to 
override this optimization and allow more backtracking.

The other solution presented on this list was:

> minilang = do
>        char 'a'
>        try b <|> (return '-')
>        optional c
>        eof
>        return "OK"
>   where
>   b = do { comma ; char 'b' }
>   c = do { comma ; char 'c' }

In this case, the "optional" was replace by (<|> (return '-')). In fact you 
could define optional this way:

> optional :: GenParser tok st a -> GenParser tok st ()
> optional foo = (foo >> return ()) <|> (return ())

Thus "optional (try b)" is actually the same as "(b >> return ()) <|> (return 
())".  So you can see my suggestion is really identical the previous one.

I could not help generalizing your toy problem to an ordered list of comma 
separated Char.  Note that "try" is not actually needed in listlang, but it 
would be if (char x) were replaced by something that can consume more than a 
single character:

> listlang :: [Char] -> GenParser Char st [Char]
> listlang [] = eof >> return []
> listlang (x:xs) = useX <|> listlang xs
>   where useX = do try (char x)
>                   rest <- end <|> more
>                   return (x:rest)
>         end = (eof >> return [])
>         more = comma >> listlang xs

Now minilang (the fixed version) is the same as (listlang ['a','b','c']) or 
(listlang "abc").  This is a good example:

> *Main> run (listlang "abcd") "c,b"
> parse error at (line 1, column 3):
> unexpected "b"
> expecting "d" or end of input


More information about the Haskell-Cafe mailing list