[Haskell-cafe] A simple parsing task for parsec.

paolino paolo.veronelli at gmail.com
Thu Mar 29 17:00:41 EDT 2007


Hi,
	I had a hard time trying to parse the words out of a text.
I suspect I am missing some Parsec knowledge.

In the end it seems to work, though I haven't tested it much. This example
covers the main features I was looking for.

*Main> parseTest (parseLine eof) "paolo@gmail sara,mimmo! 9ab a9b ab9 cd\n"
["paolo@gmail","sara","mimmo","cd"]
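
To spell out what that example is meant to show: a word is a maximal run of letters, '_' and '@', and any run that also contains a digit is dropped. Below is a minimal sketch of that rule in plain Haskell, without Parsec, which can serve as a reference when testing the parser. The names isWordChar, isNonSeparator, tokens and expectedWords are hypothetical helpers introduced only for this illustration, and I am assuming the rule above really is what the parser implements.

```haskell
import Data.Char (isAlpha, isDigit)

-- Hypothetical helper: mirrors wordChar (letters plus '_' and '@').
isWordChar :: Char -> Bool
isWordChar c = isAlpha c || c `elem` "_@"

-- Hypothetical helper: mirrors nonSeparator (word characters plus digits).
isNonSeparator :: Char -> Bool
isNonSeparator c = isWordChar c || isDigit c

-- Split the input into maximal runs of non-separator characters.
tokens :: String -> [String]
tokens s = case dropWhile (not . isNonSeparator) s of
  []   -> []
  rest -> let (tok, rest') = span isNonSeparator rest
          in tok : tokens rest'

-- Keep only the runs made entirely of word characters (so no digits).
expectedWords :: String -> [String]
expectedWords = filter (all isWordChar) . tokens
```

For instance, expectedWords "paolo@gmail sara,mimmo! 9ab a9b ab9 cd\n" evaluates to ["paolo@gmail","sara","mimmo","cd"].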

---------------------
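import Control.Monad (liftM2)
import Text.ParserCombinators.Parsec

-- Collect body until terminator would match next, then run terminator
-- and combine both results with joiner.
manyTillT :: Parser a -> Parser b -> ([a] -> b -> c) -> Parser c
manyTillT body terminator joiner =
    liftM2 joiner (manyTill body (lookAhead terminator)) terminator

wordChar :: Parser Char
wordChar = letter <|> oneOf "_@" <?> "a valid word character"

nonSeparator :: Parser Char
nonSeparator = wordChar <|> digit

-- Last character of a word: a word character not followed by a non-separator.
wordEnd :: Parser Char
wordEnd = do
    x <- wordChar
    notFollowedBy nonSeparator
    return x

word :: Parser String
word = manyTillT wordChar (try wordEnd) (\b t -> b ++ [t]) <?> "a word"

-- One separator character followed (without consuming it) by a word character.
wordStart :: Parser Char
wordStart = do
    (try nonSeparator >> unexpected "non separator") <|> anyChar
    lookAhead wordChar

-- Skip to the next word start; if what follows is not a word, keep looking.
nextWord :: Parser String
nextWord = manyTill anyChar (try wordStart) >> (try word <|> nextWord)

parseLine :: Parser a -> Parser [String]
parseLine end = do
    f <- option [] $ return `fmap` try word
    r <- many $ try nextWord
    manyTill anyChar end
    return (f ++ r)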

-----------

Any comments on how to simplify this code are welcome.


Paolino.

