Performance horrors

Duncan Coutts duncan.coutts at worc.ox.ac.uk
Tue Aug 26 19:56:32 EDT 2008


On Tue, 2008-08-26 at 22:52 +0100, Adrian Hey wrote:
> Gwern Branwen wrote:
> > FWIW, when I tested out a number of nub/uniq definitions, the map head version performed the worst, much worse than the O(n^2) one everyone is complaining about. It's a neat definition, but not performant.
> 
> When did you try this? IIRC correctly even the old sort was O(n^2), but
> someone had the sense to replace it a while ago.
> 
> With ghci now on my machine..
>   length $ map head $ group $ sort [1..100000]
> finishes "instantly", but..
>   length $ nub [1..100000]
> takes about 90 seconds.

Also, sorting followed by grouping is unnecessary extra work. Sorts that
discard duplicates are usually simple modifications of the sort
algorithm.

Though as people have pointed out, nub is nice because it is lazy, so
sorting is out. An ord based nub should accumulate previously seen
values so that it can operate lazily too.

Duncan



More information about the Libraries mailing list