[Haskell-cafe] Alternative instance for non-backtracking parsers

Doaitse Swierstra doaitse at swierstra.net
Thu Aug 23 20:19:21 UTC 2018


A common situation is where the “q" part is actually the parser which parses the empty string. It is not uncommon to make the choice for “p” once “p” has made some progress. Consider e.g. the situation where you parse a list of parameters:

(expr1, expr2, )

with a parser "pChainr pComma pExpr”

Obviously the example does not parse, but the question is what kind of error message do you expect? I would prefer the message that an expression is missing after the comma. Of course we could backtrack to the point before the comma and then report that we cannot handle the comma. For me this is counterintuitive. 

In the uu-parsinglib you can choose between two versions of pChainR:

-- * Combinators for chained structures
-- ** Treating the operator as right associative
pChainr    :: IsParser p => p (c -> c -> c) -> p c -> p c
pChainr    op x    =   must_be_non_empties "pChainr"    op   x r where r = x <??> (flip <$> op <*> r)
pChainr_ng :: IsParser p => p (c -> c -> c) -> p c -> p c
pChainr_ng op x    =   must_be_non_empties "pChainr_ng" op   x r where r = x <**> ((flip <$> op <*> r)  <|> pure id)

Do not pay attention to the function “must_be_non_empties” which just performs a check that not both the operator and the operand can recognise the empty string, which would build a non-sensical parser. The non-greedy parser does not commit to the choice made after seeing the next recognisable symbol. This may make a difference. What is done in our example erroneous example depends on the specified costs for inserting and deleting a specific symbol.

 Doaitse


> Op 23 aug. 2018, om 21:41  heeft Olaf Klinke <olf at aatal-apotheke.de> het volgende geschreven:
> 
> Dear Cafe,
> 
> can anyone explain the use of an Alternative instance for a parser monad where the parser (p <|> q) does not backtrack automatically when p fails? 
> Consider megaparsec for example. When p is a parser that does not backtrack automatically, then (p <|> q) never succeeds with q. While I do understand that fine-grained control of backtracking is desirable for performance reasons in general, it escapes me why such an Alternative instance could be useful. See also "A note on backtracking" in the parser-combinators package documentation [1].
> 
> Minimal code example and context below.
> 
> Thanks,
> Olaf
> 
> [1] http://hackage.haskell.org/package/parser-combinators-1.0.0/docs/Control-Applicative-Combinators.html
> 
> import Text.Megaparsec
> import Text.Megaparsec.Char
> import Data.Void
> 
> p :: Parsec Void String Bool
> p = do
>  char '-'
>  string "True"
>  return True
> q :: Parsec Void String Bool
> q = do
>  char '-'
>  string "False"
>  return False
> 
>>>> parseTest (p <|> q) "-True"
> True
>>>> parseTest (p <|> q) "-False"
> 1:2:
> unexpected "Fals"
> expecting "True"
> -- Since q can parse "-False", I would expect (p <|> q) to parse "-False", too. 
> 
> Context:
> Of course the example above can be mitigated by moving the char '-' outside the alternative. But we might not have control over whether the two parsers share a common prefix. I am trying to write some code generic in the parser monad and I am having difficulties making my code work for both backtracking and non-backtracking parsers. If possible, I'd like to keep my parser type class devoid of a 'try' combinator, which would be the identity function for parser monads that always backtrack.  
> _______________________________________________
> Haskell-Cafe mailing list
> To (un)subscribe, modify options or view archives go to:
> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
> Only members subscribed via the mailman list are allowed to post.



More information about the Haskell-Cafe mailing list