[Data-haskell] Exploring the Titanic dataset with Haskell

Nikita Tchayka nikitatchayka at gmail.com
Tue Mar 7 23:23:56 UTC 2017


Hello dataHaskell!

I'd like to inaugurate our mailing list by showing something that I'm
working on for our documentation site.

I was checking some notebooks on Kaggle, and this notebook
<https://www.kaggle.com/mrisdal/titanic/exploring-survival-on-the-titanic> by
Megan Risdal really caught my eye. It looks very complete
and covers a lot of tasks, ranging from easy ones, such as simple
feature engineering by extracting titles from the
names, to more complex ones, like applying a Random Forest. It also includes
visualization, which is cool too.
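
To give an idea of what the title-extraction step could look like, here is a
minimal sketch in plain Haskell (base only). It is not taken from the notebook
or from any library: `extractTitle` is a hypothetical helper, and it assumes
names follow the "Surname, Title. Given names" layout of the Kaggle Titanic CSV.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)

-- | Hypothetical helper: pull the title out of a Titanic-style name.
-- Assumes the layout "Surname, Title. Given names"; returns Nothing
-- when there is no comma, or no period after the comma.
extractTitle :: String -> Maybe String
extractTitle name =
  case break (== ',') name of
    -- Everything after the first comma should start with the title.
    (_, ',' : rest) ->
      case break (== '.') rest of
        -- The title is whatever sits before the first period.
        (title, '.' : _) -> Just (trim title)
        _                -> Nothing
    _ -> Nothing
  where
    -- Strip leading and trailing whitespace.
    trim = dropWhileEnd isSpace . dropWhile isSpace
```

For example, `extractTitle "Braund, Mr. Owen Harris"` evaluates to `Just "Mr"`,
and a name without a comma gives `Nothing`.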

I'm currently working on porting it to Haskell, so we can see what's
missing and what's there. I'm using Eric Conlon's (@ejconlon)
Analyze library, and even though it's still young, I absolutely love it. It
allows easy CSV loading and there are no name clashes
like in Frames, nor do you have to define your datatypes first as in Cassava.

The notebook itself can be found here
<https://github.com/NickSeagull/ex01-exploring-titanic/blob/master/src/Lib.hs>.
It's in a repo on my GitHub and it can be loaded in HaskellDO, although
right now I'm using
Vim + Stack REPL until error highlighting is implemented.

Cheers
-- 
nikita tchayka . software craftsman
 { nickseagull.github.io }