[Haskell-cafe] Parsing words with parsec
paolino
paolo.veronelli at gmail.com
Thu Mar 29 23:43:34 EDT 2007
Hi,
I had a hard time trying to parse the words of a text.
I suspect I am missing some Parsec knowledge.
In the end it seems to work, though I haven't tested it much, and this example
covers the main features I was looking for.
*Main> parseTest (parseLine eof) "paolo@gmail sara,mimmo! 9ab a9b ab9 cd\n"
["paolo@gmail","sara","mimmo","cd"]
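The rule this example seems to illustrate can be written down without Parsec, which gives something to check the parser against. This is only a sketch under an assumption read off the output above: a word is a maximal run of letters, digits, '_' and '@' that contains no digit. The names isWordChar, isNonSeparator and wordsOf are hypothetical and are not part of the code below.

```haskell
import Data.Char (isAlpha, isDigit)

-- Assumed to mirror wordChar: a letter, '_' or '@'.
isWordChar :: Char -> Bool
isWordChar c = isAlpha c || c `elem` "_@"

-- Assumed to mirror nonSeparator: a word character or a digit.
isNonSeparator :: Char -> Bool
isNonSeparator c = isWordChar c || isDigit c

-- Hypothetical reference function: split the input into maximal runs of
-- non-separators and keep only the runs made entirely of word characters.
wordsOf :: String -> [String]
wordsOf s = case dropWhile (not . isNonSeparator) s of
  []   -> []
  rest -> let (run, rest') = span isNonSeparator rest
          in [run | all isWordChar run] ++ wordsOf rest'
```

On the input above, wordsOf drops "9ab", "a9b" and "ab9" because each contains a digit, and keeps the other four runs.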
---------------------
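import Control.Monad (liftM2)
import Text.ParserCombinators.Parsec

-- Run body until terminator would match, then combine both results with joiner.
manyTillT :: Parser a -> Parser t -> ([a] -> t -> r) -> Parser r
manyTillT body terminator joiner =
  liftM2 joiner (manyTill body (lookAhead terminator)) terminator
wordChar :: Parser Char
wordChar = letter <|> oneOf "_@" <?> "a valid word character"
nonSeparator :: Parser Char
nonSeparator = wordChar <|> digit
-- Last character of a word: a word character not followed by a non-separator.
wordEnd :: Parser Char
wordEnd = do
  x <- wordChar
  notFollowedBy nonSeparator
  return x
word :: Parser String
word = manyTillT wordChar (try wordEnd) (\b t -> b ++ [t]) <?> "a word"
-- A separator immediately followed by a word character.
wordStart :: Parser Char
wordStart = do
  (try nonSeparator >> unexpected "non separator") <|> anyChar
  lookAhead wordChar
nextWord :: Parser String
nextWord = manyTill anyChar (try wordStart) >> (try word <|> nextWord)
parseLine :: Parser a -> Parser [String]
parseLine end = do
  f <- option [] $ return `fmap` try word
  r <- many $ try nextWord
  manyTill anyChar end
  return (f ++ r)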
-----------
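The wordEnd parser relies on a common Parsec idiom: consume a character, check the next one with notFollowedBy (which consumes nothing), and wrap the whole thing in try so that a failed check gives the input back. A minimal sketch of that idiom in isolation follows. It assumes a current Parsec release (Text.Parsec) rather than the 2007 API, and lastLetter and runMaybe are hypothetical names used only for illustration.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical parser: a letter that is not immediately followed by a
-- letter or digit. notFollowedBy consumes no input; try rewinds the
-- letter if the check fails.
lastLetter :: Parser Char
lastLetter = try $ do
  c <- letter
  notFollowedBy alphaNum
  return c

-- Hypothetical helper: run a parser and discard the error message.
runMaybe :: Parser a -> String -> Maybe a
runMaybe p = either (const Nothing) Just . parse p ""
```

Here runMaybe lastLetter "a," gives Just 'a', while "ab" and "a9" both give Nothing. Because of the try, runMaybe (lastLetter <|> letter) "ab" still gives Just 'a': the failed attempt consumed nothing, so the alternative gets to run.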
Any comments on how to simplify this code are welcome.
Paolino.