[Haskell-beginners] Parsing 'A's and then ('A's or 'B's)
Francesco Ariis
fa-ml at ariis.it
Mon Jan 25 09:15:13 UTC 2016
On Sun, Jan 24, 2016 at 10:56:31PM +0100, Simon Jakobi wrote:
> > if I understood your specification correctly, there would be multiple
> > ways to parse the string "AAA":
> >
> > - 3 'x' elements ("A", "A", "A")
> > - 2 'x' elements ("AA", "A")
> > - 2 'x' elements again (first one shorter) ("A", "AA")
> > - 1 'x' element ("AAA")
> >
>
> There would be even more ways because 'y', too, can represent one or more
> 'A's.
>
> [...]
>
> Is this due to attoparsec not being able to "backtrack" (not sure if
> this is the right term)? Is backtracking something that parsers generally
> are incapable of?
Ah, indeed you are right. attoparsec, parsec and friends handle failure
with `try`. From Attoparsec documentation:
Attempt a parse, but do not consume any input if the parse fails.
One way to deal with cases like yours is for every parser to compute a
"list of successes". Crude example:
import Text.Parsec
import Text.Parsec.String
foo :: Parser [String]
foo = anyChar >>= \h ->
(foo <|> e) >>= \t ->
return ([""] ++ map (h:) t)
where
e = return [""]
-- λ> parseTest foo "bar"
-- ["","b","ba","bar"]
Then you can chain those with `try`/`choice` and compute your result(s)
(I guess using the list monad to handle the mechanism could do).
Ambiguous grammars are an age old problem, and some searching [1] leads
me to believe there are already viable solution in Haskell.
[1] http://stackoverflow.com/questions/13279087/parser-library-that-can-handle-ambiguity
More information about the Beginners
mailing list