[Haskell-cafe] Alternative instance for non-backtracking parsers
Paul
aquagnu at gmail.com
Mon Aug 27 09:52:03 UTC 2018
Hello, Olaf. I have some distrust of elegant solutions (one of them are
C.P. libs). For example, I read that they have problems with memory
consuming (famous article about arrow-based parsers), also your example
(my own experience includes a lot of tricks which I don't like)... So
(but this is my IMHO) for big and serious project with parsing of
something complex I will use bindings to good known parser libraries...
Maybe something like this: https://github.com/markwright/antlrc. Btw,
Happy looks still live (https://github.com/simonmar/happy)
I use "Read"-style parsers for small constructions only, so fail of such
small parser is only case where error is producing and I do it
explicitly (as biggest example: header of Git patch).
Olaf, what do you think about
1. http://hackage.haskell.org/package/frisby
2. http://tanakh.github.io/Peggy
?
25.08.2018 01:15, Olaf Klinke wrote:
> Thanks to Doaitse for the example. I yet have to wrap my head around it (since I don't know how pChainr is related to <|>) but playing with q parsing the empty string is a good start.
> When you said
>
>> One might expect <|> to be associative, but even that is in no way guaranteed.
> I suppose you meant that type class laws can not be enforced, because the documentation of Alternative states that <|> should be associative. It would be quite horrible if the Alternative instances of parsers weren't.
>
> @Paul: I think that parser combinator libraries are one of the most wonderful things functional programming has given us. How else would you parse your input data? Anything cobbled together manually with the help of some Read instances is bound to be unmaintainable and does not yield good error messages. What would you suggest as a replacement? How would you write composable parsers for a complicated file type?
>
> Today I added a 'try' method to my MonadParse type class. This lets me write (try p <|> q) generically. An instance for Attoparsec would declare try=id, since it always backtracks. We'll see how this goes.
>
> What I learned form this thread is:
> * arrange your usages of <|> so that both sides can never share a common prefix, then no backtracking will be necessary
> * for that reason, some parsers like Megaparsec define their Alternative instances the way they do, without backtracking.
>
> Please allow one more question:
> Is the backtracking performance penalty linear in the amount of backtracking? That is, suppose that
> p = char '-' >> someHugeParser
> q = char '-' >> anotherHugeParser
> Could (try p <|> q) decide after seeing two characters that p fails and commit to the second branch, freeing memory?
>
> Thanks,
> Olaf
> _______________________________________________
> Haskell-Cafe mailing list
> To (un)subscribe, modify options or view archives go to:
> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
> Only members subscribed via the mailman list are allowed to post.
More information about the Haskell-Cafe
mailing list