<div dir="ltr">If your data originates from a DB, read the DB schema and use code-gen or Template Haskell (TH) to generate your record structure. Before you commit to this, please confirm that your Haskell data pipeline can handle records with 100+ fields. I have a strange feeling that some library or other is going to break at the 64-field mark.<div><br></div><div>If you don't have access to the underlying DB, read the CSV header and code-gen your data structures. This will still lead to a lot of boilerplate because your code-gen script will need to maintain a mapping from column name to data type. To cut that down, see if you can peek at the first row of the data and take an educated guess at each column's data type based on its value. This will not be 100% accurate, but you should then only need to specify a few data types manually instead of all 100+.</div><div><br></div><div>-- Saurabh.</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Sun, Oct 1, 2017 at 4:38 PM, Leandro Ostera <span dir="ltr">&lt;<a href="mailto:leandro@ostera.io" target="_blank">leandro@ostera.io</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Two things come to mind.<br><br>The first one is *Crazy idea, bad pitch*: generate the record code from the data.<br><br>The second is to make the records dynamically typed:<br><br>Would it be simpler to define a Column type you can parameterize with a string for its name (GADTs?) so you automatically get a type for that specific column?<br><br>That way, as you read the CSV files, you could define the type of the columns based on the actual column name.<br><br>Rows would then become sets of pairings of defined columns and values; wrapping the value in a Maybe would encode that the value for a particular column may be missing. You could encode these pairings as a list too.<br><br>At least there you can have type guarantees that you’re joining fields that are of the same column type. 
I think.<br><br>Either way, my 2 cents and keep it up!<br><br><br><div class="gmail_quote"><div><div class="h5"><div dir="ltr">On Sun, Oct 1, 2017 at 03:34, Guru Devanla &lt;<a href="mailto:gurudev.devanla@gmail.com" target="_blank">gurudev.devanla@gmail.com</a>&gt; wrote:<br></div></div></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div class="h5"><div dir="ltr"><div><div><div>Hello All,<br></div><div><br></div>I am in the process of replicating some Python code in Haskell.<br><br></div>In Python, I load a couple of CSV files, each with more than 100 columns, into a Pandas data frame. A Pandas data frame is, in short, a tabular structure that lets me perform a bunch of joins and filter data. I generate differently shaped reports using these operations. Of course, I would love some type checking to help me with these merge and join operations as I create different reports.<br> <br>I am not looking to replicate the Pandas data-frame functionality in Haskell. The first thing I want to do is reach for the 'record' data structure. Here are some ideas I have:<br><br></div><div>1. I need to declare all these 100+ columns in multiple record structures.<br></div><div>2. Some of the columns can have NULL/NaN values. Therefore, some of the attributes of the record structure would be 'Maybe' values. I could also drop some columns during load and cut down the number of attributes I create per record structure. <br></div><div>3. Create a dictionary of each record structure, which will help me index into them.<br><br></div><div>I would like some feedback on the first 2 points. It seems like there is a lot of boilerplate code I have to generate to create 100s of record attributes. Is this the only sane way to do this? What other patterns should I consider while solving such a problem? 
<br><br></div><div>Also, I do not want to add too many dependencies to the project, but I am open to suggestions.<br><br></div><div>Any input/advice on this would be very helpful.<br></div><div><br></div><div>Thank you for the time!<br></div><div>Guru<br></div></div></div></div>
______________________________<wbr>_________________<br>
Haskell-Cafe mailing list<br>
To (un)subscribe, modify options or view archives go to:<br>
<a href="http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe" rel="noreferrer" target="_blank">http://mail.haskell.org/cgi-<wbr>bin/mailman/listinfo/haskell-<wbr>cafe</a><br>
Only members subscribed via the mailman list are allowed to post.</blockquote></div>
</blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature" data-smartmail="gmail_signature"><a href="http://www.saurabhnanda.com" target="_blank">http://www.saurabhnanda.com</a></div>
</div>
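<div><br></div>
<div>The idea in the reply above of peeking at the first data row and guessing each column's type, then correcting a few columns by hand, could be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not a definitive implementation: the names ColType, guessType, guessSchema and applyOverrides are hypothetical, the comma splitting is naive (no quoted fields), and a real project would more likely use a CSV library and feed the resulting schema into a code-gen or TH step.</div>
<pre>
```haskell
import Data.Char (isDigit, isSpace)
import Data.List (dropWhileEnd)
import Data.Maybe (fromMaybe)

-- | Hypothetical set of column types the guesser can report.
data ColType = TInt | TDouble | TText | TUnknown
  deriving (Eq, Show)

-- | Naive comma split. Assumption: no quoted fields containing commas.
splitOnComma :: String -> [String]
splitOnComma s = case break (== ',') s of
  (field, [])       -> [field]
  (field, _ : rest) -> field : splitOnComma rest

-- | Strip leading and trailing whitespace.
trim :: String -> String
trim = dropWhileEnd isSpace . dropWhile isSpace

-- | True for a non-empty run of ASCII digits.
isDigits :: String -> Bool
isDigits [] = False
isDigits ds = all isDigit ds

-- | True for an optionally negative integer literal such as "42" or "-7".
isInt :: String -> Bool
isInt ('-' : ds) = isDigits ds
isInt ds         = isDigits ds

-- | True for simple decimals such as "3.5" or "-0.25" (no exponents).
isDouble :: String -> Bool
isDouble s = case break (== '.') s of
  (whole, '.' : frac) -> and [isInt whole, isDigits frac]
  _                   -> False

-- | Educated guess for one cell. Missing markers give no information,
-- so they are reported as TUnknown and need a manual override.
guessType :: String -> ColType
guessType raw
  | v `elem` ["", "NULL", "NaN"] = TUnknown
  | isInt v                      = TInt
  | isDouble v                   = TDouble
  | otherwise                    = TText
  where
    v = trim raw

-- | Pair each header name with the type guessed from the first data row.
guessSchema :: String -> String -> [(String, ColType)]
guessSchema header firstRow =
  zip (map trim (splitOnComma header))
      (map guessType (splitOnComma firstRow))

-- | Replace guesses with manually specified types where one is given.
applyOverrides :: [(String, ColType)] -> [(String, ColType)] -> [(String, ColType)]
applyOverrides overrides = map pick
  where
    pick (name, guessed) = (name, fromMaybe guessed (lookup name overrides))
```
</pre>
<div>With these definitions, guessSchema "id, price, name, note" "42,3.5,widget,NaN" evaluates to [("id",TInt),("price",TDouble),("name",TText),("note",TUnknown)]. Only the columns that come out wrong or as TUnknown then need an entry in the override list passed to applyOverrides, which keeps the hand-maintained mapping to a few columns rather than all 100+.</div>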