Advance notice that I'd like to make Cabal depend on parsec

Thu Mar 14 17:39:10 CET 2013

On Thu, Mar 14, 2013 at 7:53 AM, Duncan Coutts <duncan.coutts at googlemail.com
> wrote:

> Hi folks,
>
> I want to give you advance notice that I would like to make Cabal depend
> on parsec. The implication is that GHC would therefore depend on parsec
> and thus it would become a core package, rather than just a HP package.
> So this would affect both GHC and the HP, though I hope not too much.
>
> The rationale is that Cabal needs to parse things, like .cabal files and
> currently we do not have a decent parser in the core libraries. By
> decent I mean one that can produce error messages with source locations
> and that doesn't have unpredictable memory use. The only parser in the
> core libraries at the moment is Text.ParserCombinators.ReadP from the
> base package and that fails my "decent" criteria on both counts. Its
> idea of an error message is (), and on some largish .cabal files we take
> 100s of MB to parse (I realise that the ReadP in the base package is a
> cutdown version so I don't mean to malign all ReadP-style libs out
> there).
>
> Partly due to the performance problem, the terrible .cabal file error
> messages, and partly because Doaitse Swierstra keeps asking me if .cabal
> files have a grammar, I've been writing a new .cabal parser. It uses an
> alex lexer and a parsec parser. It's fast and the error messages are
> pretty good. I have reverse engineered a grammar that closely matches
> the existing parser and .cabal files in the wild, though I'm not sure
> Doaitse will be satisfied with the approach I've taken to handling
> layout.
>
> Why did I choose parsec? Practicality dictates that I can only use
> things in the core libraries, and the nearest thing we have to that is
> the parser lib that is in the HP. I tried to use happy but I could not
> construct a grammar/lexer combo to handle the layout (also, happy is not
> exactly known for its great error messages).
>

Failed attempt aside for a moment, I think you should reconsider happy. Can
you learn how to do layout from reading the GHC source? The happy
documentation that explains how to attach a monad (you could use it to
communicate between alex and happy for layout info) is a bit misleading but
I have examples I can share with you. I haven't specifically tackled the
layout problem but I could try to make a parser if it would help.

One major benefit of using happy is that the productions of the grammar can
be analyzed for shift/shift and shift/reduce conflicts. The equivalent
analysis doesn't appear to be possible in parsec. In theory, applicative
parsers should allow for this but my understanding is that parsec does not
have this feature for its applicative subset.

Other benefits are: a) GHC can certainly use parers generated by it, b) the
generated code uses common dependencies, c) it's fast, d) it's expressive.

What is it about happy parser errors that you don't like? Do you know
examples where parsec does a better job?

I have an alex + happy parser for a tiny functional language that I can
share with you if you'd like to give it another go. It doesn't support
layout at the moment, but I think I could add that.

Jason
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/cabal-devel/attachments/20130314/277db586/attachment-0001.htm>