Parsing CSV files

Keith Wansbrough Keith.Wansbrough@cl.cam.ac.uk
Tue, 29 Jul 2003 17:26:31 +0100


> Nevermind the previous version, I've solved a few bugs in it (like unquoted 
> numbers and correctly handling blank fields).

1. Any string without commas, double quotes, or newlines can be left unquoted; there is no need to restrict unquoted fields to digits.

2. In a quoted string, "" (two consecutive double quotes) stands for a single double-quote character.  For example, the strings

  "This is a quoted string"
  K"oln

appear in a CSV file as

  """This is a quoted string"""
  "K""oln"
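
The two rules above can be sketched as a pair of small helpers. This is only an illustration in Python, not the code under discussion; the names quote_field and unquote_field are hypothetical, and the sketch assumes a field needs quoting exactly when it contains a comma, a double quote, or a line break.

```python
def quote_field(s: str) -> str:
    """Encode one field for a CSV file (hypothetical helper)."""
    # Assumption: quote only when the field contains a comma,
    # a double quote, or a line break; everything else is left as-is.
    if any(c in s for c in ',"\r\n'):
        # Inside quotes, each double quote is written twice.
        return '"' + s.replace('"', '""') + '"'
    return s


def unquote_field(s: str) -> str:
    """Decode one field that was produced by quote_field."""
    if len(s) >= 2 and s[0] == '"' and s[-1] == '"':
        # Strip the surrounding quotes and collapse "" back to ".
        return s[1:-1].replace('""', '"')
    return s
```

With these definitions, quote_field('K"oln') gives "K""oln" (including the outer quotes), and a field such as abc 123 passes through unchanged.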

Also, be sure to support newlines within quoted strings (I think you do already).  Many CSV parsers fail to do this, with nasty results.
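
To make the newline point concrete, here is a minimal sketch of a character-by-character parser, again in Python and again not the parser being reviewed. The function name parse_csv is hypothetical. The sketch assumes comma separators, treats a carriage return outside quotes as part of a CRLF line ending, and does no error reporting for malformed input such as an unterminated quote.

```python
def parse_csv(text: str) -> list:
    """Split CSV text into a list of rows, each a list of field strings."""
    rows, row, field = [], [], []
    in_quotes = False
    row_started = False  # True once the current row has consumed any character
    i, n = 0, len(text)
    while i < n:
        c = text[i]
        if in_quotes:
            row_started = True
            if c == '"':
                if i + 1 < n and text[i + 1] == '"':
                    field.append('"')  # "" inside quotes is one literal quote
                    i += 1
                else:
                    in_quotes = False  # closing quote
            else:
                field.append(c)  # commas and newlines are literal here
        elif c == '\n':
            # A newline outside quotes ends the record.
            row.append(''.join(field))
            rows.append(row)
            row, field, row_started = [], [], False
        elif c == '\r':
            pass  # assumption: only appears as part of a CRLF line ending
        else:
            row_started = True
            if c == '"':
                in_quotes = True
            elif c == ',':
                row.append(''.join(field))
                field = []
            else:
                field.append(c)
        i += 1
    if row_started:
        # Last record had no trailing newline.
        row.append(''.join(field))
        rows.append(row)
    return rows
```

The key design choice is that the newline test only runs when in_quotes is false, so a line break inside a quoted field stays in the field instead of ending the record. Splitting the input on newlines first and then parsing each line is the usual way parsers get this wrong.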

--KW 8-)
-- 
Keith Wansbrough <kw217@cl.cam.ac.uk>
http://www.cl.cam.ac.uk/users/kw217/
University of Cambridge Computer Laboratory.