[Haskell-cafe] What is your favourite Haskell "aha" moment?

Olaf Klinke olf at aatal-apotheke.de
Wed Jul 11 18:34:57 UTC 2018


Dear Simon,

since you'll be talking to genomics people, you might want to look at a Bachelor thesis about sequence alignment [1]. The author is a PhD student in Cambridge now. 
In a nutshell, the ListT monad transformer [2] is a good abstraction for full-text index structures [*] that are in heavy use among the genomics people. ListT represents the basic state transformation when a symbol is added to the query string. It also facilitates alignment of sequences with uncertain values [+]. Swapping out the monad lets you change the model of uncertainty, but you write your algorithm only once. I can provide code if required.
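To illustrate "write the algorithm once, swap the monad", here is a minimal sketch. It is not the code from the thesis [1] and it does not use the list-t package [2]: a naive suffix trie stands in for a real full-text index, a plain MonadPlus constraint stands in for ListT over a base monad, and the uncertainty is placed in the query symbols rather than in the indexed sequence. All names (Trie, suffixTrie, step, search, containsExact, countVariants) are hypothetical.

```haskell
import Control.Monad (MonadPlus, foldM, mzero)
import Data.List (tails)
import Data.Maybe (isJust)

-- Naive suffix trie; an assumed stand-in for an FM-index or suffix array.
newtype Trie = Trie [(Char, Trie)]

-- Insert one string into the trie, sharing common prefixes.
insert :: String -> Trie -> Trie
insert [] t = t
insert (c:cs) (Trie kids) =
  case lookup c kids of
    Nothing    -> Trie ((c, insert cs (Trie [])) : kids)
    Just child -> Trie ((c, insert cs child) : filter ((/= c) . fst) kids)

-- Index every suffix of the text, so every substring is a path from the root.
suffixTrie :: String -> Trie
suffixTrie = foldr insert (Trie []) . tails

-- The basic state transformation: extend the matched query by one symbol.
-- Failure to extend is mzero in whatever monad the caller picked.
step :: MonadPlus m => Trie -> Char -> m Trie
step (Trie kids) c = maybe mzero return (lookup c kids)

-- The search, written once. Each query position is an uncertain symbol,
-- that is, a Char wrapped in the monad m.
search :: MonadPlus m => Trie -> [m Char] -> m Trie
search = foldM (\t sym -> sym >>= step t)

-- m = Maybe: every symbol is certain, the query matches or it does not.
containsExact :: Trie -> String -> Bool
containsExact idx q = isJust (search idx (map Just q))

-- m = []: every position is a set of alternatives (e.g. a known
-- polymorphism); the result counts the variants that occur in the text.
countVariants :: Trie -> [String] -> Int
countVariants idx q = length (search idx q)
```

With idx = suffixTrie "GATTACA", containsExact idx "TTA" is True, and countVariants idx ["GT", "A"] is 2 because both "GA" and "TA" occur. Only the choice of m differs between the two wrappers; search itself is unchanged.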

The edit distance algorithm [3] might also go down well at Sanger. It is an example where laziness and a subtle re-arrangement turn a quadratic algorithm into an optimal one.
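For orientation, the quadratic starting point can be sketched as follows. This is only the textbook dynamic-programming recurrence, written so that each row is defined lazily in terms of itself; it is not the re-arranged algorithm from [3], which is not reproduced here. The names editDistance, nextRow and cell are hypothetical.

```haskell
-- Textbook edit distance (unit cost for insert, delete and substitute).
-- Runs in time proportional to length xs * length ys.
editDistance :: Eq a => [a] -> [a] -> Int
editDistance xs ys = last (foldl nextRow [0 .. length ys] xs)
  where
    -- Build row i from row i-1. The new row refers to itself for the
    -- "left" neighbour, which works because lists are lazy.
    nextRow prev@(p:ps) x = row
      where
        row = (p + 1) : zipWith3 cell ys (zip prev ps) row
        cell y (diag, up) left =
          minimum [up + 1, left + 1, diag + (if x == y then 0 else 1)]
    -- Not reached: every row has at least one entry.
    nextRow [] _ = []
```

For example, editDistance "kitten" "sitting" evaluates to 3.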

Olaf

[1] https://pp.ipd.kit.edu/uploads/publikationen/kuhnle13bachelorarbeit.pdf
[2] http://hackage.haskell.org/package/list-t
[3] http://users.monash.edu/~lloyd/tildeStrings/Alignment/92.IPL.html 
[*] Burrows-Wheeler transform, FM-index, suffix arrays, suffix trees, etc. 
[+] Current index structures hold only one sequence, while known genomic polymorphisms are stored separately.

