[Haskell-cafe] implementing a csv reader

Tomasz Zielonka tomasz.zielonka at gmail.com
Tue Aug 22 17:04:52 EDT 2006


On Tue, Aug 22, 2006 at 08:59:40AM -0700, Daan Leijen wrote:
> > > 2. I am looking for a parser, but I don't know Haskell parsers.  Is
> > >    Parsec a good choice?
> >
> > Parsec is definitely a good choice, but beware that it parses the whole
> > input before returning, thus it may consume a huge batch of memory. As
> > CSV is a line oriented format, you should make your parser lazy. Search
> > the mailing list archive for "lazy parser".
> 
> A good trick here is to first use "lines" to break up the input into
> lines and then map a Parsec parse for each line to those lines
> (returning a list of Maybe a or ParseError a results).
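
For reference, the per-line approach might look roughly like the sketch
below. The names record and parseLines are made up for illustration, and
the record parser is a deliberately naive assumption: it handles only
unquoted, comma-separated fields.

```haskell
import Text.ParserCombinators.Parsec

-- Hypothetical parser for a single CSV line: unquoted fields separated
-- by commas. Quoted fields and embedded newlines are not handled.
record :: GenParser Char () [String]
record = do
  fields <- many (noneOf ",\n\r") `sepBy` char ','
  eof
  return fields

-- Parse every line independently. The result list is built lazily, so
-- a consumer that walks it incrementally never forces the whole file.
parseLines :: SourceName -> String -> [Either ParseError [String]]
parseLines file = map (parse record file) . lines
```

Note that each line is parsed on its own here, so positions in error
messages are relative to the start of that line, not to the whole file.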

You can also create a lazy "many" parser using getParserState and
setParserState. The benefit is that this also works when your elements are
not in a one-to-one relation with lines, and that it automatically takes
care of maintaining the position in the input (for error messages).

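  import Text.ParserCombinators.Parsec

  -- Lazily apply parser p until end of input. A parse error surfaces
  -- (via 'error') only when the offending element is demanded.
  lazyMany :: GenParser Char () a -> SourceName -> [Char] -> [a]
  lazyMany p file contents = lm state0
    where
      -- Initial parser state (input plus source position) for the file.
      Right state0 = parse getParserState file contents

      -- Run a fresh parse that resumes from the saved state; the dummy
      -- name and input ("" "") are replaced at once by setParserState.
      lm state = case parse p' "" "" of
                      Left err -> error (show err)
                      Right x -> x
        where
          p' = do
            _ <- setParserState state
            choice
              [ do
                  eof
                  return []
              , do
                  x <- p
                  state' <- getParserState
                  -- The recursive call stays an unevaluated thunk.
                  return (x : lm state')
              ]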

Best regards
Tomasz

