[Haskell-cafe] Re: Optimizing Parsec 3 -- was: Wiki software?

Tue Dec 15 17:04:50 EST 2009

On Tue, Dec 15, 2009 at 3:13 PM, Bryan O'Sullivan <bos at serpentine.com> wrote:
> Besides the performance issue, are
> there any other considerations keeping it from becoming the default?

One thing that makes me a bit hesitant is that it's a pretty big
change to the core parser data structure, to the extent that I'm not
sure I should even call it "Parsec."

Reading through the Parsec technical report, one of the innovations
that made Parsec what it was is that it introduced a new way of
returning the four possible parse results.

Previous work on parsers that return good error messages represented
parse results with the following sort of data structure:

data ParseResult s a
  = EmptyOk s a                -- parsed ok but did not consume any input
  | EmptyError ErrorMessage    -- did not parse ok and did not consume any input
  | ConsumedOk s a             -- parsed ok and consumed input
  | ConsumedError ErrorMessage -- did not parse ok and consumed input

Parsec changed this to something like:

data Consumed a = Empty a | Consumed a
data Reply s a = Ok s a | Error ErrorMessage

type ParseResult s a = Consumed (Reply s a)

This change allows us to determine whether a parser consumes input
before forcing the computation that determines if the parser succeeds,
improving performance and getting rid of a space leak.
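
To illustrate the laziness point, here is a minimal sketch over String
input. The names Parser, satisfy, andThen, hasConsumed and lazyDemo are
made up for this example and are not Parsec's actual API; errors are
plain Strings to keep it short.

```haskell
-- Toy model of the nested result type; all names here are illustrative,
-- not Parsec's real API.
data Consumed a = Empty a | Consumed a
data Reply s a = Ok s a | Error String

type ParseResult s a = Consumed (Reply s a)

newtype Parser a = Parser { runParser :: String -> ParseResult String a }

-- Consume a single character matching the predicate.
satisfy :: (Char -> Bool) -> Parser Char
satisfy p = Parser (\s -> case s of
  (c:cs) | p c -> Consumed (Ok cs c)
  _            -> Empty (Error "unexpected input"))

-- Sequencing. Once the first parser has consumed input, the outer
-- Consumed constructor is returned right away; the inner Reply, which
-- depends on the rest of the parse, stays an unevaluated thunk.
andThen :: Parser a -> (a -> Parser b) -> Parser b
andThen (Parser p) f = Parser (\s -> case p s of
  Empty (Ok s' a) -> runParser (f a) s'
  Empty (Error e) -> Empty (Error e)
  Consumed reply  -> Consumed (case reply of
    Ok s' a -> case runParser (f a) s' of
      Empty r    -> r
      Consumed r -> r
    Error e -> Error e))

-- Looks only at the outer constructor, as a choice combinator would
-- when deciding whether it may still try the alternative.
hasConsumed :: ParseResult s a -> Bool
hasConsumed (Consumed _) = True
hasConsumed (Empty _)    = False

-- The second parser is undefined, yet the question "did we consume?"
-- can be answered because the Reply is never forced.
lazyDemo :: Bool
lazyDemo =
  hasConsumed (runParser (satisfy (== 'a') `andThen` const (Parser undefined)) "ab")
```

With the flat four-constructor type, answering the same question would
mean choosing between ConsumedOk and ConsumedError, which requires
running the rest of the parser first.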

My branch takes us back to returning the four flattened results, but
offers them as four continuations to take during parsing.
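
As a rough sketch of what "four continuations" can look like (this is
not the code from the branch; ParserK, satisfyK, bindK, orElseK and runK
are hypothetical names, and the answer type r is kept as a plain type
parameter so that no language extensions are needed):

```haskell
-- Hypothetical CPS parser: instead of returning one of four
-- constructors, the parser calls one of four continuations.
newtype ParserK r a = ParserK
  { unParserK
      :: String               -- remaining input
      -> (a -> String -> r)   -- consumed ok
      -> (String -> r)        -- consumed error
      -> (a -> String -> r)   -- empty ok
      -> (String -> r)        -- empty error
      -> r
  }

-- Consume a single character matching the predicate.
satisfyK :: (Char -> Bool) -> ParserK r Char
satisfyK p = ParserK (\s cok _cerr _eok eerr -> case s of
  (c:cs) | p c -> cok c cs
  _            -> eerr "unexpected input")

-- Sequencing: if the first parser consumed, every outcome of the second
-- parser is reported through the "consumed" continuations.
bindK :: ParserK r a -> (a -> ParserK r b) -> ParserK r b
bindK m f = ParserK (\s cok cerr eok eerr ->
  unParserK m s
    (\a s' -> unParserK (f a) s' cok cerr cok cerr)
    cerr
    (\a s' -> unParserK (f a) s' cok cerr eok eerr)
    eerr)

-- Choice: the alternative is tried only when the first parser fails
-- without consuming input.
orElseK :: ParserK r a -> ParserK r a -> ParserK r a
orElseK m n = ParserK (\s cok cerr eok eerr ->
  unParserK m s cok cerr eok (\_ -> unParserK n s cok cerr eok eerr))

-- Run a parser by supplying final continuations.
runK :: ParserK (Either String a) a -> String -> Either String a
runK m s = unParserK m s ok Left ok Left
  where ok a _ = Right a
```

Here the consumed/empty distinction is carried by which continuation
gets called rather than by a lazily inspected constructor, which is why
the representation differs so much from the one in the technical report.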

So my biggest reservation is whether I can even call my branch the same
parser as Parsec.

Antoine