[Haskell-cafe] ANNOUNCE: hierarchical-clustering and gsc-weighting

Felipe Lessa felipe.lessa at gmail.com
Tue Aug 3 07:23:25 EDT 2010


On Tue, Aug 3, 2010 at 8:01 AM, Ivan Lazar Miljenovic
<ivan.miljenovic at gmail.com> wrote:
> Felipe Lessa <felipe.lessa at gmail.com> writes:
>> 'hierarchical-clustering' provides a function to create a dendrogram
>> from a list of items and a distance function between them.  The most
>> common linkage types are available: single linkage, complete linkage
>> and UPGMA.  An item can be anything, for example a DNA sequence, so
>> this may be used to create a phylogenetic tree.
>
> What actual clustering algorithm are you using here?

A naïve O(n^2) algorithm using a distance matrix.  This can be
improved without changing the API, however.
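
To give a rough idea of the approach, here is a minimal sketch of
naïve agglomerative clustering with single linkage.  It is not the
package's code: the Dendro type and the names elements, singleLink
and cluster are made up for illustration, and it recomputes linkage
distances from the elements on every step instead of maintaining a
distance matrix, so it says nothing about the package's actual
complexity.

```haskell
import Data.List (minimumBy)
import Data.Ord (comparing)

-- Hypothetical dendrogram type for this sketch; not the package's own.
data Dendro a = Leaf a | Branch Double (Dendro a) (Dendro a)
  deriving (Show, Eq)

-- All items stored in a (sub)tree.
elements :: Dendro a -> [a]
elements (Leaf x)       = [x]
elements (Branch _ l r) = elements l ++ elements r

-- Single linkage: distance between the closest pair of items.
singleLink :: (a -> a -> Double) -> Dendro a -> Dendro a -> Double
singleLink dist l r = minimum [dist x y | x <- elements l, y <- elements r]

-- Repeatedly merge the two closest clusters until only one remains.
-- Returns Nothing for an empty input list.
cluster :: (a -> a -> Double) -> [a] -> Maybe (Dendro a)
cluster dist = go . map Leaf
  where
    go []  = Nothing
    go [t] = Just t
    go ts  = go (Branch best l r : rest)
      where
        indexed = zip [0 :: Int ..] ts
        -- Every unordered pair of current clusters with its distance.
        candidates =
          [ (singleLink dist a b, i, j)
          | (i, a) <- indexed, (j, b) <- indexed, i < j ]
        (best, bi, bj) = minimumBy (comparing (\(d, _, _) -> d)) candidates
        l    = ts !! bi
        r    = ts !! bj
        rest = [c | (k, c) <- indexed, k /= bi, k /= bj]
```

For example, cluster (\x y -> abs (x - y)) [1, 2, 10] first joins 1
and 2 at distance 1, then joins that pair with 10 at distance 8.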

> Also, would it be possible to have some more documentation there in
> general?  At the very least, in your next release explain what a
> dendrogram is and why someone would want to use your package (I had to do
> some quick Wikipedia looking to refresh my memory on what dendrogram,
> etc. were to get an understanding of what it does).

Documentation is always good, but I didn't want to take the time to
explain everything from the beginning.  I guess most people coming to
this package will already know that they want a dendrogram.  But if
they don't, a quick web search is very effective.  Hmm, I guess some
diagrams would be nice.

I only took the time to explain why there is an "UPGMA" and a
"FakeAverageLinkage", because that distinction isn't easy to find on
the web.  Actually, I still haven't found anyone talking about it,
just people using either one under the same name "average linkage". =)
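
A small sketch of the difference, assuming (as I understand the
package documentation) that UPGMA weights distances by cluster size
while FakeAverageLinkage takes the plain mean of the two merged
clusters' distances.  The function names below are hypothetical and
are not part of the package's API.

```haskell
-- Both functions give the distance from a merged cluster (a ∪ b) to
-- some other cluster c, given d(a,c) and d(b,c).

-- UPGMA: weight each distance by the size of the cluster it came
-- from, which keeps the result equal to the mean over all item pairs.
upgmaUpdate :: Int -> Int -> Double -> Double -> Double
upgmaUpdate sizeA sizeB dAC dBC =
  (fromIntegral sizeA * dAC + fromIntegral sizeB * dBC)
    / fromIntegral (sizeA + sizeB)

-- "Fake" average: ignore the sizes and average the two distances.
fakeAverageUpdate :: Double -> Double -> Double
fakeAverageUpdate dAC dBC = (dAC + dBC) / 2
```

The two agree whenever a and b have the same size and diverge
otherwise, which may be why both end up called "average linkage".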

Cheers,

-- 
Felipe.

