[Haskell-cafe] What is your favourite Haskell "aha" moment?
Olaf Klinke
olf at aatal-apotheke.de
Wed Jul 11 18:34:57 UTC 2018
Dear Simon,
since you'll be talking to genomics people, you might want to look at a Bachelor thesis about sequence alignment [1]. The author is a PhD student in Cambridge now.
In a nutshell, the ListT monad transformer [2] is a good abstraction for full-text index structures [*] that are in heavy use among the genomics people. ListT represents the basic state transformation when a symbol is added to the query string. It also facilitates alignment of sequences with uncertain values [+]. Swapping out the monad lets you change the model of uncertainty, but you write your algorithm only once. I can provide code if required.
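As a rough illustration of the "write the algorithm once, swap the monad" idea, here is a minimal sketch. It is not the code from the thesis and it does not use the list-t package: the "index" is a naive list of match end positions in the text, and the names extend and search are made up for this example. Each query symbol is an action in some monad m; with Identity the symbols are certain, with the list monad a symbol may stand for several alternatives.

```haskell
import Control.Monad (foldM)
import Data.Functor.Identity (Identity, runIdentity)

-- Hypothetical stand-in for a full-text index: the state is the list of
-- end positions of all occurrences of the query read so far.
extend :: String -> Char -> [Int] -> [Int]
extend text c ps = [p + 1 | p <- ps, p < length text, text !! p == c]

-- The search is written once, for any monad m that models how a query
-- symbol is obtained. Initially the empty query matches at every position.
search :: Monad m => String -> [m Char] -> m [Int]
search text = foldM step [0 .. length text]
  where
    step ps sym = do
      c <- sym
      pure (extend text c ps)

-- Certain symbols: the query "TA" in the Identity monad.
exact :: [Int]
exact = runIdentity (search "GATTACA" [pure 'T', pure 'A'])
-- exact == [5]

-- Uncertain symbols: first symbol is 'A' or 'T', second is 'T'.
-- The list monad explores every alternative.
ambiguous :: [[Int]]
ambiguous = search "GATTACA" ["AT", "T"]
-- ambiguous == [[3],[4]]
```

A real index structure would replace the position list with something like a suffix-array interval, and a probability monad could replace the list monad; both are assumptions beyond what this sketch shows.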
The edit distance algorithm [3] might also go down well at Sanger. It is an example where laziness and a subtle re-arrangement of the computation turn a quadratic algorithm into an optimal one.
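For orientation, the following is a sketch of the plain dynamic-programming recurrence, written with a lazily defined array. It is not the diagonal-based algorithm from [3], only the quadratic table that [3] re-arranges; the name editDistance and the unit edit costs are assumptions made for this example.

```haskell
import Data.Array (Array, listArray, range, (!))

-- Unit-cost edit distance via a table whose cells are lazy thunks
-- referring to their neighbours in the same table.
editDistance :: String -> String -> Int
editDistance xs ys = table ! (m, n)
  where
    m = length xs
    n = length ys
    xa = listArray (1, m) xs :: Array Int Char
    ya = listArray (1, n) ys :: Array Int Char
    bnds = ((0, 0), (m, n))

    table :: Array (Int, Int) Int
    table = listArray bnds [cell i j | (i, j) <- range bnds]

    -- First row and column: distance to or from the empty string.
    cell i 0 = i
    cell 0 j = j
    cell i j
      -- Equal symbols cost nothing: take the diagonal neighbour.
      | xa ! i == ya ! j = table ! (i - 1, j - 1)
      -- Otherwise pay one edit on top of the cheapest neighbour.
      | otherwise =
          1 + minimum
            [ table ! (i - 1, j)
            , table ! (i, j - 1)
            , table ! (i - 1, j - 1)
            ]
```

For example, editDistance "kitten" "sitting" evaluates to 3. Only the cells that are actually demanded get evaluated, although the array of thunks is still allocated in full, so this sketch remains quadratic in space.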
Olaf
[1] https://pp.ipd.kit.edu/uploads/publikationen/kuhnle13bachelorarbeit.pdf
[2] http://hackage.haskell.org/package/list-t
[3] http://users.monash.edu/~lloyd/tildeStrings/Alignment/92.IPL.html
[*] Burrows-Wheeler transform, FM-index, suffix arrays, suffix trees, etc.
[+] Current index structures hold only one sequence, while known genomic polymorphisms are stored separately.