[Haskell-cafe] Loading a csv file with ~200 columns into Haskell Record
Guru Devanla
gurudev.devanla at gmail.com
Sun Oct 1 01:30:23 UTC 2017
Hello All,
I am in the process of replicating some Python code in Haskell.
In Python, I load a couple of csv files, each having more than 100
columns, into Pandas data frames. A Pandas data frame, in short, is a
tabular structure that lets me perform a bunch of joins and filter out
data. I generate reports of different shapes using these operations. Of
course, I would love some type checking to help me with these merge and
join operations as I create the different reports.
I am not looking to replicate the Pandas data-frame functionality in
Haskell. The first thing I want to do is reach for the 'record' data
structure. Here are some ideas I have:
1. I need to declare all of these 100+ columns across multiple record
structures.
2. Some of the columns can have NULL/NaN values, so some of the
attributes of the record structure would be 'Maybe' values. I could also
drop some columns during load and cut down the number of attributes I
create per record structure. (See the sketch after this list.)
3. Create a dictionary of each record structure which will help me index
into them. (Also sketched below.)
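
To make points 1 and 2 concrete, here is a minimal sketch of what I have
in mind. It assumes the cassava library and a few hypothetical column
names (trade_id, symbol, price, volume); cassava's FromField instance
for Maybe turns empty cells into Nothing:

{-# LANGUAGE OverloadedStrings #-}

import qualified Data.ByteString.Lazy as BL
import           Data.Csv (FromNamedRecord (..), decodeByName, (.:))
import           Data.Text (Text)
import qualified Data.Vector as V

-- One record per csv file; the fields here stand in for 100+ columns.
data Trade = Trade
  { tradeId :: !Text
  , symbol  :: !Text
  , price   :: !(Maybe Double)  -- column may be empty/NULL
  , volume  :: !(Maybe Int)     -- likewise
  }

instance FromNamedRecord Trade where
  parseNamedRecord r =
    Trade <$> r .: "trade_id"
          <*> r .: "symbol"
          <*> r .: "price"   -- empty cell parses to Nothing
          <*> r .: "volume"

loadTrades :: FilePath -> IO (Either String (V.Vector Trade))
loadTrades path = do
  bytes <- BL.readFile path
  -- decodeByName reads the header row and matches columns by name
  pure (snd <$> decodeByName bytes)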
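
For point 3, the 'dictionary' could simply be a Data.Map keyed by one of
the record's fields, e.g.:

import qualified Data.Map.Strict as Map
import qualified Data.Vector as V

-- Index the parsed rows by a key column so later lookups and joins
-- are cheap.
indexBy :: Ord k => (a -> k) -> V.Vector a -> Map.Map k a
indexBy key = V.foldl' (\m row -> Map.insert (key row) row m) Map.empty

-- e.g. indexBy tradeId trades :: Map.Map Text Trade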
I would like some feedback on the first two points. It seems like there
is a lot of boilerplate code I have to write to create hundreds of
record attributes. Is this the only sane way to do this? What other
patterns should I consider while solving such a problem?
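
One boilerplate reduction I have seen, again assuming cassava and that
the record field names match the csv header cells exactly, is generic
deriving, where the parser instance comes for free:

{-# LANGUAGE DeriveGeneric #-}

import Data.Csv (FromNamedRecord)
import Data.Text (Text)
import GHC.Generics (Generic)

-- Field and column names here are hypothetical.
data Row = Row
  { symbol :: !Text
  , price  :: !(Maybe Double)
  , volume :: !(Maybe Int)
  } deriving (Generic)

-- cassava derives parseNamedRecord from the Generic representation.
instance FromNamedRecord Row

Even so, the data declaration itself still has to list every column.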
Also, I do not want to add too many dependencies to the project, but I
am open to suggestions.
Any input/advice on this would be very helpful.
Thank you for your time!
Guru