[Haskell-cafe] Re: A parsec question

Christian Maeder Christian.Maeder at dfki.de
Thu Sep 30 06:33:07 EDT 2010


Am 29.09.2010 20:01, schrieb Daniel Fischer:
> On Wednesday 29 September 2010 19:10:22, Ben Franksen wrote:
>>>
>>> Note the last line mentions only '}'. I would rather like to see
>>>
>>>   expecting "}" or digit
>>>
>>> since the parser could very well accept another digit here.
> 
> parsec2 did that, I don't know whether that change is intentional or 
> accidental.

Right, parsec2 or parsec-2.1.0.1 still does so. (parsec-3 behaves
differently wrt error messages.)

Try "ghc-pkg hide parsec" so that parsec-2.1.0.1 will be taken:

 import Text.ParserCombinators.Parsec
 import Control.Monad

 infixl 1 <<

 (<<) :: Monad m => m a -> m b -> m a
 (<<) = liftM2 const

 block p = char '{' >> p << char '}'
 parser = block (many (digit))
 main = parseTest parser "{123a}"

*Main> main
Loading package parsec-2.1.0.1 ... linking ... done.
parse error at (line 1, column 5):
unexpected "a"
expecting digit or "}"

>>> (1) What is the reason for this behaviour?
>>> (2) Is there another combinator that behaves as I would like?
>>> (3) Otherwise, how do I write one myself?

ask derek.a.elkins at gmail.com (CCed)

Cheers Christian

>>
>> I just saw that Christian Maeder answered a similar question recently. I
>>
>> tried his suggestion of using manyTill and bingo:
>>> {-# LANGUAGE NoMonomorphismRestriction #-}
>>> import Control.Applicative ((*>),(<*))
>>> import Text.Parsec
>>> block p = char '{' *> p <* char '}'
>>> parser = block (manyTill digit (char '}'))
>>> main = parseTest parser "{123a}"
>>
>> gives
>>
>>   parse error at (line 1, column 5):
>>   unexpected "a"
>>   expecting "}" or digit
>>
>> So far so good. I wonder whether this parser is as efficient as the
>> original one.
> 
> manyTill p end      = scan
>                     where
>                       scan  = do{ end; return [] }
>                             <|>
>                               do{ x <- p; xs <- scan; return (x:xs) }
> 
> I'm not sure, but I suspect it's less efficient.
> 
> Perhaps
> 
> manyTill' p end = scan []
>     where
>       scan acc = do { end; return (reverse acc) }
>                 <|> do { x <- p; scan (x:acc) }
> 
> is more efficient (depends on Parsec's bind which is more efficient), you 
> could test.
> 
>> Also, this style is less modular, as I have to mention the
>> terminator in two places.
> 
> That's not the main problem. `manyTill' consumes the ending token, so
> 
> block (manyTill whatever (char '}')) needs two '}' to succeed.
> You would need
> 
> block (manyTill digit (lookAhead (char '}'))
> 
> to replicate the behaviour of block (many digit).
> 
>> Is there a non-greedy variant of 'many' so
>> that modularity gets restored and efficiency is not lost?
>>
>> Cheers
>> Ben


More information about the Haskell-Cafe mailing list