[Haskell-cafe] [Parsec] Backtracking with try does not work for
me?
Chris Kuklewicz
haskell at list.mightyreason.com
Mon Jul 31 13:51:27 EDT 2006
The semantics of Parsec's "optional" operation are what is causing the problem.
"optional foo" can have 3 results:
1) foo can succeed, optional succeeds, proceed to next command
2) foo can fail without consuming any input, optional succeeds proceed to
next command
3) foo can fail after consuming some input, optional fails, do not proceed
> minilang = do
> char 'a'
> optional (do {comma ; char 'b'})
The comma in the above line consumes input even in the "a,c" case. When "c" is
seen the "char 'b'" fails and then the optional fails, and you get the error
message you posted.
> optional (do {comma ; char 'c'})
> eof
> return "OK"
> Apparently, "try" was used (do note that the column number indicates
> that there was backtracking) but the parser still fails for
> "a,c". Why?
Your next attempt does not fix the problem, since the try is in the wrong place
( http://www.cs.uu.nl/~daan/download/parsec/parsec.html#try may help)
> minilang = do
> char 'a'
> try (optional (do {comma ; char 'b'}))
In the above line, the ",c" causes (char 'b') to fail, which causes 'optional'
to fail, and then "try" also fails. The "try" alters the stream so that the
"comma" was not consumed, but the "try" still passes along the failure.
In neither the original or the modified minilang does the 'char "c"' line ever
get reached in the "a,c" input case.
The working solution is a small tweak:
> minilang = do
> char 'a'
> optional (try (do {comma ; char 'b'}))
> optional (do {comma ; char 'c'})
> eof
> return "OK"
Now the "a,c" case causes the (char 'b') to fail, and then the "try" also fails,
but also acts as if the comma had not been consumed. Thus we are in case #2 of
the semantics of "optional" and so "optional" succeeds instead of failing,
allowing the next line to parse ",c" then eof then return "OK".
There is a very very important difference to Parsec between failing with and
without having consumed input. It means Parsec can be more efficient, since any
branch that consumes input cannot backtrack. The "try" command is a way to
override this optimization and allow more backtracking.
The other solution presented on this list was:
> minilang = do
> char 'a'
> try b <|> (return '-')
> optional c
> eof
> return "OK"
> where
> b = do { comma ; char 'b' }
> c = do { comma ; char 'c' }
In this case, the "optional" was replace by (<|> (return '-')). In fact you
could define optional this way:
> optional :: GenParser tok st a -> GenParser tok st ()
> optional foo = (foo >> return ()) <|> (return ())
Thus "optional (try b)" is actually the same as "(b >> return ()) <|> (return
())". So you can see my suggestion is really identical the previous one.
I could not help generalizing your toy problem to an ordered list of comma
separated Char. Note that "try" is not actually needed in listlang, but it
would be if (char x) were replaced by something that can consume more than a
single character:
> listlang :: [Char] -> GenParser Char st [Char]
> listlang [] = eof >> return []
> listlang (x:xs) = useX <|> listlang xs
> where useX = do try (char x)
> rest <- end <|> more
> return (x:rest)
> end = (eof >> return [])
> more = comma >> listlang xs
Now minilang (the fixed version) is the same as (listlang ['a','b','c']) or
(listlang "abc"). This is a good example:
> *Main> run (listlang "abcd") "c,b"
> parse error at (line 1, column 3):
> unexpected "b"
> expecting "d" or end of input
--
Chris
More information about the Haskell-Cafe
mailing list