[Haskell-cafe] Sneaking haskell in the workplace -- cleaning csv files
Jim Burton
jim at sdf-eu.org
Sat Jun 16 07:08:22 EDT 2007
Tomasz Zielonka wrote:
> On Fri, Jun 15, 2007 at 11:31:36PM +0100, Jim Burton wrote:
>> I think that would only work if there was one column per line...I didn't
>> make it clear that as well as being comma separated, the delimiter is
>> around each column, of which there are several on a line so if the
>> delimiter is ~ a file might look like:
>>
>> ~sdlkfj~, ~dsdkjf~ #eo row1
>> ~sdf
>> dfkj~, ~dfsd~ #eo row 2
>
> It would be easier to experiment if you could provide us with an
> example input file. If you are worried about revealing sensitive
> information, you can change all characters other than newline,
> ~ and , to "A"s, for example. An accompanying output file, for checking
> correctness, would be even nicer.
>
Hi Tomasz, I can do that, but the files do essentially look like the
example above, except with 10 - 30 columns, more data in each column,
and more rows, maybe this side of a million. They are produced by an
Oracle export, which escapes the delimiter (often a tilde) wherever it
occurs within a column. The output file should have exactly one row per
line, with the extra newlines replaced by a string given as a parameter
(it might be a space or an HTML tag -- I only just remembered this, and
my initial effort doesn't do it).
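
To make the requirement concrete, here is a minimal sketch of one way it
could be done, not a tested solution for the real files. It walks the
input once, tracks whether it is inside a delimited column, and replaces
any newline found there with the given string. The name cleanRows and
the escape character parameter are hypothetical: the export's actual
escaping scheme isn't described above, so the sketch assumes a delimiter
inside a column is preceded by an escape character, and that the escape
character escapes itself.

```haskell
-- | Replace every newline inside a delimited column with @sub@, leaving
-- the newlines that end a row untouched.
--
-- ASSUMPTION (not taken from the real export format): inside a column,
-- @esc@ followed by @delim@ or by @esc@ is an escaped literal character.
cleanRows
  :: Char    -- ^ column delimiter, e.g. '~'
  -> Char    -- ^ escape character (hypothetical), e.g. '\\'
  -> String  -- ^ replacement for newlines inside a column
  -> String  -- ^ raw file contents
  -> String
cleanRows delim esc sub = go False
  where
    -- The Bool records whether we are currently inside a column.
    go _ [] = []
    go inside (c:cs)
      -- Escaped delimiter or escaped escape: copy both characters as-is.
      | inside, c == esc, (d:rest) <- cs, d == delim || d == esc
                          = c : d : go inside rest
      -- An unescaped delimiter opens or closes a column.
      | c == delim        = c : go (not inside) cs
      -- A newline inside a column is replaced by the given string.
      | inside, c == '\n' = sub ++ go inside cs
      -- Everything else, including row-ending newlines, is copied.
      | otherwise         = c : go inside cs
```

With the two-row example quoted above, cleanRows '~' '\\' " " should
turn "~sdf\ndfkj~, ~dfsd~" into "~sdf dfkj~, ~dfsd~". Because the
function produces output as it consumes input, something like
interact (cleanRows '~' '\\' " ") ought to stream through a large file,
although String is unlikely to be the fastest representation for a
million rows.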
Thanks,
Jim
> Best regards
> Tomek
>
>