[Haskell-cafe] Parsers are monadic?
Claus Reinke
claus.reinke at talk21.com
Sat Jun 30 13:11:51 EDT 2007
> Have you used Parsec?
i read about it when it came out, but i've always defined my own
combinators. in case you wonder, there are two reasons for this:
(a) the approximation of parsers as monads is close enough that a simple
type Parser m a = StateT String m a
gives us the basic combinators (sequence,success,failure,alternative) for free.
(b) i like my combinator grammars to be reversible, so that a single grammar
specification can be used for both parsing and unparsing/pretty-printing.
that means i have to define the details myself anyway.
about the only thing that spoils this nice setup is error handling. i've done
some of the things that parsec does myself, such as labeling productions,
keeping positions and unexpected input, and merging of error info from
branches, but it just doesn't seem to fit easily: what starts out as an
elegant reuse of Monad/MonadPlus/StateT soon looks rather involved
(compare also the implementation description in the parsec paper).
parsec is a bit more systematic about these things, so it is easier to
see where it is headed: a three-valued logic.
if one has successful parses, "successful" errors, and failure to produce
either as the third outcome, then one can try to merge the success and
failure continuations into the single continuation provided by the monadic
(>>=). still not very natural/simple, as one always has to deal with both
continuations at once, the same way, but perhaps workable.
> Negation pretty much just works.
please keep in mind that i was asking for general monadic negation.
if one introduces a bias towards the first successful parse (as parsec
does, apart from longest-match), one can model negation as failure:
not m = (m >> return False) `mplus` return True
this only uses Monad and MonadPlus, but it only works as expected
for some MonadPlus (those which commit locally to the first success,
instead of collecting multiple successes, such as lists, or searching for
global success, such as amb, or longest-match parsers).
alternatively, one could require an overloaded 'first' method, selecting
the first successful branch in an ordered collection of solutions:
not' m = first $ (m >> return False) `mplus` return True
but that wouldn't work for unordered collection monads.
in other words, one can define monadic negation for some specific
monads, but not for all monads. but coding implementation knowledge
about any specific monad into the source makes the code less general
than it could be.
so i was wondering whether there should be a class such as
MonadChoice (committed choice instead of collections; every
MonadChoice gives a MonadPlus, but not vice-versa), or
MonadFirst (giving access to the first success), or MonadNegate
(providing monadic not by whatever means). then code depending
on monadic negation could still be reasonably general (not working
with arbitrary monads, but also not restricted to a specific one).
btw, in an approach with two continuations, negation is as simple
as switching the success and failure continuations. but the merging
of success and failure branches is perhaps no less involved than
what one gets when implementing the two-kinds-of-success
approach.
i was just hoping that someone had come up with something more
elegant, and i was wondering whether there were other issues that
people have to work around again and again.
claus
More information about the Haskell-Cafe
mailing list