[Haskell-cafe] uu-parsinglib - Greedy Parser

Kyle Marek-Spartz kyle.marek.spartz at gmail.com
Tue Jan 13 15:45:02 UTC 2015


This is the difference between Parsing Expression Grammars and more
general Context-Free Grammars.

https://en.wikipedia.org/wiki/Parsing_expression_grammar

You'll have to re-write your grammar to be unambiguous if you want to
use a library based on CFGs.


Marco Vassena writes:

> In the uu-parsinglib the choice combinator (<|>) is symmetric and does not commit to any alternative.
> If the grammar being parsed is ambiguous, the corresponding parser will fail at run time on an ambiguous input.
>
> Consider for instance this html parser.
>
> pTag = pOpenTag <|> pCloseTag <|> pCommentTag <|> pContent
>
> pContent = Content <$> some (satisfy (/= '<'))
> pHtml = some pTag
>
> This parser will fail on "<a>123</a>", because this may be interpreted as:
> [ [Open a , Content "123", Close a],
>  [Open a, Content "12", Content "3", Close a],
>  [Open a, Content "1", Content "2", Content "3", Close a] ... ]
>
> In other parsing libraries such as parsec the choice operator <|> is greedy and commits to the first alternative that makes
> any progress, so that some is greedy and Content "123" would be parsed.
> The operator <<|> in uu-parsinglib has the same greedy behaviour.
>
> I would like to disambiguate the grammar so that the first result ([Open a, Content "123", Close a]) is selected (longest matching rule).
> I suppose that this may be done using <<|> to define a greedySome to be used in in pContent, however I am wondering whether
> there is another way to do this, without using such operator <<|>.
>
> Any help is appreciated.
>
> All the best,
> Marco
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe

--
Kyle Marek-Spartz


More information about the Haskell-Cafe mailing list