[Haskell-cafe] ANNOUNCE: hierarchical-clustering and
gsc-weighting
Felipe Lessa
felipe.lessa at gmail.com
Tue Aug 3 07:23:25 EDT 2010
On Tue, Aug 3, 2010 at 8:01 AM, Ivan Lazar Miljenovic
<ivan.miljenovic at gmail.com> wrote:
> Felipe Lessa <felipe.lessa at gmail.com> writes:
>> 'hierarchical-clustering' provides a function to create a dendrogram
>> from a list of items and a distance function between them. The most
>> common linkage types are available: single linkage, complete linkage
>> and UPGMA. An item can be anything, for example a DNA sequence, so
>> this may be used to create a phylogenetic tree.
>
> What actual clustering algorithm are you using here?
A naïve O(n^2) algorithm using a distance matrix. This can be
improved without changing the API, however.
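
For readers who want the idea in code, here is a minimal sketch of naïve agglomerative clustering. It is not the package's implementation: the Dendro type, singleLink and naiveCluster are hypothetical names made up for this example, only single linkage is shown, and distances are recomputed on every step instead of being cached in a distance matrix.

```haskell
import Data.List (minimumBy)
import Data.Ord (comparing)

-- Hypothetical dendrogram type for this sketch (not the package's own).
-- A Branch stores the distance at which its two children were joined.
data Dendro a = Leaf a | Branch Double (Dendro a) (Dendro a)
  deriving (Show, Eq)

-- All items stored under a dendrogram.
items :: Dendro a -> [a]
items (Leaf x)       = [x]
items (Branch _ l r) = items l ++ items r

-- Single linkage: the minimum distance between any pair of items
-- taken from the two clusters.
singleLink :: (a -> a -> Double) -> Dendro a -> Dendro a -> Double
singleLink dist l r = minimum [dist x y | x <- items l, y <- items r]

-- Repeatedly merge the two closest clusters until only one remains.
-- Returns Nothing for an empty input list.
naiveCluster :: (a -> a -> Double) -> [a] -> Maybe (Dendro a)
naiveCluster dist = go . map Leaf
  where
    go []  = Nothing
    go [c] = Just c
    go cs  =
      let pairs = [ (singleLink dist a b, i, j)
                  | (i, a) <- zip [0 :: Int ..] cs
                  , (j, b) <- zip [0 ..] cs
                  , i < j ]
          (d, i, j) = minimumBy (comparing (\(x, _, _) -> x)) pairs
          rest      = [c | (k, c) <- zip [0 ..] cs, k /= i, k /= j]
      in go (Branch d (cs !! i) (cs !! j) : rest)
```

For example, naiveCluster (\x y -> abs (x - y)) [0, 1, 10] first joins 0 and 1 at distance 1, then joins that cluster with 10 at distance 9.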
> Also, would it be possible to have some more documentation there in
> general? At the very least, in your next release explain what a
> dendrogram is and why someone would want to use your package (I had to do
> some quick Wikipedia looking to refresh my memory on what dendrogram,
> etc. were to get an understanding of what it does).
Documentation is always good, but I didn't want to take the time to
explain everything from the beginning. I guess most people coming to
this package will already know that they want a dendrogram. But if
they don't, a quick googling is very effective. Hmm, I guess some
diagrams would be nice.
I only took the time to explain why there is an "UPGMA" and a
"FakeAverageLinkage", because that distinction isn't easy to find on
the web. Actually, I still haven't found someone talking about it,
just people using either with the same name "average linkage". =)
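
To make the distinction concrete, here is a rough sketch of the two update rules. It assumes that "FakeAverageLinkage" gives equal weight to the two clusters that were just merged, while UPGMA weights them by their number of elements. That reading is an assumption and is not spelled out in this message, and the function names upgma and fakeAverage are hypothetical, not the package's API. In both cases cluster a was formed by joining a1 and a2, and we want the distance from a to some other cluster b.

```haskell
-- UPGMA (assumed rule): weights proportional to the sizes of a1 and a2.
-- Arguments: size of a1, d(a1,b), size of a2, d(a2,b).
upgma :: Int -> Double -> Int -> Double -> Double
upgma n1 d1 n2 d2 =
  (fromIntegral n1 * d1 + fromIntegral n2 * d2) / fromIntegral (n1 + n2)

-- "Fake" average linkage (assumed rule): equal weights, ignoring sizes.
-- Arguments: d(a1,b), d(a2,b).
fakeAverage :: Double -> Double -> Double
fakeAverage d1 d2 = (d1 + d2) / 2
```

Under these assumptions the two rules agree when a1 and a2 have the same size, for instance when both are single elements, and diverge otherwise: with a1 of size 3 at distance 2 and a2 of size 1 at distance 6, upgma gives 3 while fakeAverage gives 4.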
Cheers,
--
Felipe.