[Haskell-cafe] parsing a CSV file

Roger Mason rmason at mun.ca
Tue May 21 16:52:53 CEST 2013


Hello,

I'm attempting to write a parser for files that look like this:

Bruker Nano GmbH Berlin, Germany
Esprit 1.9

Date: 02/05/2013 10:06:49 AM
Real time: 15000
Energy Counts
-0.474    0
.....

The line before the ellipsis is repeated many times (together, such lines 
represent a spectrum).  I need to be able to extract numbers from lines 
containing <string: > and I want to extract the number pairs following 
"Energy Counts\n".  The extracted data will then be written to a file in 
a different format.  For now I'll be satisfied with reading the "header" 
info, i.e. down to "Energy Counts\n".

Thus far, I have:
-- derived from RWH
-- file: ch16/csv2.hs
import Text.ParserCombinators.Parsec

headerLines = endBy csvFile endHeader
csvFile = endBy line eol
line = sepBy cell (char ',')
cell = many (noneOf ",\n")
eol = char '\n'

parseCSV :: String -> Either ParseError [[String]]
parseCSV input = parse csvFile "(unknown)" input

parseHDR :: String -> Either ParseError [[String]]
parseHDR input = parse headerLines "(unknown)" input

endHeader = string "Energy Counts"

This loads into GHCi (7.6.2) OK.  However, when I test it:

parseHDR "Bruker Nano GmbH Berlin, Germany\nEsprit 1.9\n\nDate: 02/05/2013 10:06:49 AM\nReal time: 15000\nEnergy Counts"

Not in scope: `parseHDR'

which makes sense because

ghci> :t endHeader

<interactive>:1:1: Not in scope: `endHeader'

Clearly, my naive implementation of endHeader is no good.
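
One direction that might work, offered as a minimal sketch rather than a
definitive fix: collect whole lines with manyTill until the "Energy Counts"
marker is reached.  The helper textLine is a hypothetical name, and
endHeader, headerLines and parseHDR are reshaped here for illustration.  The
sketch assumes the marker line is either newline-terminated or the last thing
in the input, and it returns each header line as a plain String instead of
splitting on commas.

```haskell
import Text.ParserCombinators.Parsec

-- One line of text, returned without its terminating newline.
-- (textLine is an illustrative helper, not part of the code above.)
textLine :: Parser String
textLine = do
  s <- many (noneOf "\n")
  _ <- newline
  return s

-- The marker that ends the header: the literal text followed by
-- either a newline or the end of the input (an assumption).
endHeader :: Parser ()
endHeader = do
  _ <- string "Energy Counts"
  (newline >> return ()) <|> eof

-- Every line up to, but not including, the marker line.
-- 'try' lets endHeader fail on lines that merely start with 'E'
-- (such as "Esprit 1.9") without consuming any input.
headerLines :: Parser [String]
headerLines = manyTill textLine (try endHeader)

parseHDR :: String -> Either ParseError [String]
parseHDR input = parse headerLines "(unknown)" input
```

Applied to the test string above, this should return the five header lines
(including the empty one) and stop at the marker, leaving the number pairs
for a separate parser.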

I appreciate any pointers.

Thanks,
Roger

