[Haskell-cafe] Is this Parsec code idiomatic?
daniel.is.fischer at web.de
Fri Sep 4 16:05:18 EDT 2009
Am Freitag 04 September 2009 21:23:35 schrieb Serge LE HUITOUZE:
> Hi Haskellers,
> I'm asking some advice on a small piece of code representing a
> simplified version of a treatment I need to perform.
> I have a line-oriented string/file, from which I want to extract
> only a substring of those lines starting with char '+' (the detail
> of the extraction is irrelevant here, I'll just return what follows
> the '+').
> [I also simplified the "eol" parser for shorter code.]
> I came out with the code below.
> The line parser returns a "Maybe String".
> The complete parser return a "[Maybe String]" by mere concatenation.
> The main function filters the 'Nothing' with 'catMaybes'.
> > import Text.ParserCombinators.Parsec
> > import Data.Maybe
> > maybePlusFile :: GenParser Char st [Maybe String]
> > maybePlusFile = endBy maybePlusLine eol
> > maybePlusLine :: GenParser Char st (Maybe String)
> > maybePlusLine = try (do char('+')
No need for try here, char '+' either fails without consuming input or the whole branch
> > result <- many (noneOf "\n")
> > return $ Just result)
> > <|> do many (noneOf "\n")
> > return $ Nothing
> > eol = char '\n'
maybePlusLine = do
fmap Just $ manyTill anyChar eol
skipMany (noneOf "\n")
maybePlusFile = many maybePlusLine
> > selectPlus :: String -> Either ParseError [String]
> > selectPlus input =
> > case parse maybePlusFile "(input)" input of
> > Left e -> Left e
> > Right mblist -> Right $ catMaybes mblist
> This works as expected (or so it seems), as the ghci dump shows:
> > GHCi, version 6.10.1: http://www.haskell.org/ghc/ :? for help
> > ...
> > Prelude> :l selectPlus.hs
> > [1 of 1] Compiling Main ( selectPlus.hs, interpreted )
> > Ok, modules loaded: Main.
> > *Main> selectPlus "abc\n+123\ndef\n+456\n"
> > Loading package parsec-18.104.22.168 ... linking ... done.
> > Right ["123","456"]
> > *Main>
> I'd like to know if this code is good style, and how you would
> possibly improve it.
Except for the superfluous try, it's good.
Unless you can't guarantee that the input ends with a newline. Then you should take care
of that (for example
eol = char '\n' <|> (eof >> return '\n')
If it's such a simple extraction, there's no need for a parser, however,
map tail . filter startsPlus . lines $ input
startsPlus ('+':_) = True
startsPlus _ = False
will do fine.
> Thanks in advance.
More information about the Haskell-Cafe