[Haskell-cafe] A simple attoparsec question

Malcolm Wallace malcolm.wallace at me.com
Thu Mar 3 20:47:19 CET 2011


On 1 Mar 2011, at 21:58, Evan Laforge wrote:

>>  parseConstant = Reference <$> try parseLocLabel
>>              <|> PlainNum <$> decimal
>>              <|> char '#' *> fmap PlainNum hexadecimal
>>              <|> char '\'' *> (CharLit <$> notChar '\n') <* char '\''
>>              <|> try $ (char '"' *> (StringLit . B.pack <$>
>>                    manyTill (notChar '\n') (char '"')))
>>              <?> "constant"
>>
>> The problem is, that attoparsec just silently fails on this kind of
>> strings and tries other parsers afterwards, which leads to strange
>> results. Is there a way to force the whole parser to fail, even if
>> there's an alternative parser afterwards?

I _think_ what the original poster is worried about is this: once the
parser has consumed the start of a constant, e.g. the leading # or ' or
", and the rest of the input does not complete the token in a valid
way, the remaining alternatives are still tried, even though none of
them can succeed.  This can lead to very poor error messages.

The technique advocated by the polyparse library is to explicitly
annotate the knowledge that, once a certain sequence has been seen,
no other alternative can possibly match.  The combinator is called
'commit'.  A failure after a 'commit' is reported as it stands rather
than being replaced by the failure of some later alternative, which
locates the errors much more precisely.

For instance (in some hybrid of polyparse/attoparsec combinators):

>>  parseConstant = Reference <$> try parseLocLabel
>>              <|> PlainNum <$> decimal
>>              <|> char '#' *> commit (fmap PlainNum hexadecimal)
>>              <|> char '\'' *> commit ((CharLit <$> notChar '\n') <* char '\'')
>>              <|> char '"' *> commit ((StringLit . B.pack <$>
>>                    manyTill (notChar '\n') (char '"')))
>>              <?> "constant"
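
To make the effect of 'commit' concrete, here is a minimal, self-contained sketch.  It is a toy parser written only against base, not the attoparsec or polyparse API: the names Result, Fail, Fatal, satisfy and constantWith are invented for this illustration, and only the numeric and character-literal alternatives are reproduced.  The assumption is that a committed failure can be modelled as a separate Fatal outcome that <|> refuses to backtrack over.

```haskell
import Control.Applicative (Alternative (..))
import Data.Char (isDigit, isHexDigit)

-- Outcome of a parse. Fatal is the hypothetical "committed" failure.
data Result a
  = Ok a String   -- parsed value and remaining input
  | Fail String   -- recoverable failure: alternatives may still be tried
  | Fatal String  -- committed failure: alternatives are skipped
  deriving (Show, Eq)

newtype Parser a = Parser { runParser :: String -> Result a }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \s -> case p s of
    Ok a rest -> Ok (f a) rest
    Fail e    -> Fail e
    Fatal e   -> Fatal e

instance Applicative Parser where
  pure a = Parser (Ok a)
  Parser pf <*> Parser pa = Parser $ \s -> case pf s of
    Ok f rest -> case pa rest of
      Ok a rest' -> Ok (f a) rest'
      Fail e     -> Fail e
      Fatal e    -> Fatal e
    Fail e  -> Fail e
    Fatal e -> Fatal e

instance Alternative Parser where
  empty = Parser (const (Fail "empty"))
  Parser p <|> Parser q = Parser $ \s -> case p s of
    Fail _ -> q s    -- backtrack and try the right-hand alternative
    other  -> other  -- success or committed failure: stop here

-- Turn a recoverable failure of the argument into a committed one.
commit :: Parser a -> Parser a
commit (Parser p) = Parser $ \s -> case p s of
  Fail e -> Fatal e
  other  -> other

-- Accept one character satisfying the predicate; 'what' names it in errors.
satisfy :: String -> (Char -> Bool) -> Parser Char
satisfy what ok = Parser $ \s -> case s of
  (c : cs) | ok c -> Ok c cs
  _               -> Fail ("expected " ++ what)

char :: Char -> Parser Char
char c = satisfy (show c) (== c)

data Constant = PlainNum Integer | CharLit Char
  deriving (Show, Eq)

decimal :: Parser Integer
decimal = read <$> some (satisfy "digit" isDigit)

hexadecimal :: Parser Integer
hexadecimal = (\ds -> read ("0x" ++ ds)) <$> some (satisfy "hex digit" isHexDigit)

-- The wrapper is applied after each distinguishing prefix:
-- pass 'commit' to cut off alternatives, or 'id' to keep backtracking.
constantWith :: (Parser Constant -> Parser Constant) -> Parser Constant
constantWith wrap =
      PlainNum <$> decimal
  <|> char '#'  *> wrap (PlainNum <$> hexadecimal)
  <|> char '\'' *> wrap (CharLit <$> satisfy "character" (/= '\n') <* char '\'')
```

On the input "#xyz", runParser (constantWith commit) stops with Fatal "expected hex digit", which points at the real problem.  With constantWith id the same input falls through to the character-literal alternative and the result is a Fail complaining about a missing quote character, which is the misleading kind of message described above.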


Regards,
     Malcolm



