CSV parser, quotes?
Graham Klyne
GK@ninebynine.org
Wed, 30 Jul 2003 21:56:19 +0100
The Parsec library code contains string-parsing (with escapes) that you
might be able to use or canibalize. See 'StringLiteral' et seq in
ParsecToken.hs. The escape handling there is a fair amount of code, bit it
covers a lot of escape options.
#g
--
At 14:54 30/07/03 -0500, Shawn P. Garbett wrote:
>In attempting to improve the CSV parser based on comments, I have the
>following code that's attached. I'm having a heck of a time getting the
>double quotes = an escaped quote thing to work. There is some commented out
>code which was my last attempt. As it stands the code works, minus the
>escaped quotes. Remove the comments and it just hangs.
>
>module CSV (contents, csv) where
>
>import Parsec
>
>----------------------------------------------------------------------
>-- CSV Module
>----------------------------------------------------------------------
>
>-- Useful common parsers
>comma :: Parser Char
>comma = char ','
>
>quote :: Parser Char
>quote = char '\"'
>
>-- How to handle these buggers....
>--esc_quote :: Parser Char
>--esc_quote = do {char '\"'; char '\"';}
>
>--text :: Parser String
>--text = do {esc_quote; t<-text; return ('\"':t)}
>-- <|> do {c <- noneOf "\""; t<-text; return (c:t) }
>-- <|> return ""
>
>-- A cell can be a quoted value, a number or empty
>-- Quotes can be embedded by using double quotes ""
>cell :: Parser String
>cell = between quote quote (many (noneOf "\"")) -- quoted values
>--cell = between quote quote text -- quoted values
> <|> many1 (noneOf "\",\n") -- unquoted values
> <|> return "" -- give up, Empty
> cell
>
>-- Group of cells with a newline
>cells :: Parser [String]
>cells = do c <- sepBy cell comma
> newline
> return c
>
>-- Comma Separated Values, set of rows followed by eof
>csv :: Parser [[String]]
>csv = manyTill cells eof
>
>-- Useful For extracting comma delimited values of a cell
>contents :: Parser [String]
>contents = sepBy1 (many (noneOf ",")) comma
>
>_______________________________________________
>Haskell-Cafe mailing list
>Haskell-Cafe@haskell.org
>http://www.haskell.org/mailman/listinfo/haskell-cafe
-------------------
Graham Klyne
<GK@NineByNine.org>
PGP: 0FAA 69FF C083 000B A2E9 A131 01B9 1C7A DBCA CB5E