[Haskell-cafe] [Parsec] No identEnd in ParsecToken?

Chris Kuklewicz haskell at list.mightyreason.com
Tue Sep 5 10:46:16 EDT 2006


Stephane Bortzmeyer wrote:
> I'm trying to use Parsec for a language which have identifiers where
> the '-' character is allowed only inside identifiers, not at the start
> or the end.
> 
> ParsecToken has identStart to tell that the '-' is not allowed at the
> start but I find no equivalent identEnd?

I have not used ParsecToken

> 
> I tried also to express the same rule with ordinary combinators,
> without ParsecToken but this fails:
> 
> identifier = do
>     start <- letter
>     rest <- many (alphaNum <|> char '-') 
>     end <- letter       
>     return ([start] ++ rest ++ [end])
>   <?> "characters authorized for identifiers"
> 
> because the parser created by "many" is greedy: it consumes
> everything, including the final letter.
> 
> Any idea?

The hard thing about using Parsec is to know how to combine <|> with 'try'.

Fixing this may be as simple as

 > identifier = try $ do
 >     start <- letter
 >     rest <- many (alphaNum <|> char '-')
 >     end <- letter
 >     return ([start] ++ rest ++ [end])
 >   <?> "characters authorized for identifiers"

Alternatively, if the first character being a letter commits you to an 
identifier or a syntax error, then you could move the try after the first letter 
has been read and committed to:

> identifier = do
>     start <- letter
>     try $ do
>       rest <- many (alphaNum <|> char '-') 
>       end <- letter
>       return (start:(rest ++ [end]))
>   <?> "characters authorized for identifiers"

(Both untested)

And can the last letter be an alphaNum instead of only a letter?

You can also make the test more explicit:

 > import Data.Char; import Control.Monad;
 >
> identifier = try $ do
>   start <- letter
>   rest <- many (satisfy (\c -> alphaNum c || (c=='-')))
>   when (not (null rest) && '-' == last rest) (unexpected "Identifier cannot end in -")
>   return (start:rest)
or
> identifier = do
>   start <- letter <?> "Identifiers must start with a letter"
>   try $ do
>     rest <- many (satisfy (\c -> alphaNum c || (c=='-'))) <?> "valid identifier character"
>     when (not (null rest) && '-' == last rest) (unexpected "identifier cannot end in -")
>     return (start:rest)


More information about the Haskell-Cafe mailing list