[Haskell-cafe] ordNub
Thomas DuBuisson
thomas.dubuisson at gmail.com
Mon Jul 15 05:16:41 CEST 2013
Just so people are aware - five years ago the notion of nubOrd and
nubWith was discussed and a consensus reached on including nubOrd. I
think Bart got too busy, didn't submit a final patch, and no one with
commit access actually commited any code.
http://haskell.1045720.n5.nabble.com/GHC-2717-Add-nubWith-nubOrd-td3159919.html
I fully support an efficient nub implementation making its way into
base - it's far past time. Using Set seems sensible.
Cheers,
Thomas
On Sun, Jul 14, 2013 at 4:20 AM, Niklas Hambüchen <mail at nh2.me> wrote:
> tldr: nub is abnormally slow, we shouldn't use it, but we do.
>
>
> As you might know, Data.List.nub is O(n²). (*)
>
> As you might not know, almost *all* practical Haskell projects use it,
> and that in places where an Ord instance is given, e.g. happy, Xmonad,
> ghc-mod, Agda, darcs, QuickCheck, yesod, shake, Cabal, haddock, and 600
> more (see https://github.com/nh2/haskell-ordnub).
>
> I've taken the Ord-based O(n * log n) implementation from yi using a Set:
>
> ordNub :: (Ord a) => [a] -> [a]
> ordNub l = go empty l
> where
> go _ [] = []
> go s (x:xs) = if x `member` s then go s xs
> else x : go (insert x s) xs
>
>
> and put benchmarks on
> http://htmlpreview.github.io/?https://github.com/nh2/haskell-ordnub/blob/1f0a2c94a/report.html
> (compare `nub` vs `ordNub`).
>
> `ordNub` is not only in a different complexity class, but even seems to
> perform better than nub for very small numbers of actually different
> list elements (that's the numbers before the benchmark names).
>
> (The benchmark also shows some other potential problem: Using a state
> monad to keep the set instead of a function argument can be up to 20
> times slower. Should that happen?)
>
> What do you think about ordNub?
>
> I've seen a proposal from 5 years ago about adding a *sort*Nub function
> started by Neil, but it just died.
>
>
> (*) The mentioned complexity is for the (very common) worst case, in
> which the number of different elements in the list grows with the list
> (alias you don't have an N element list with always only 5 different
> things inside).
>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
More information about the Haskell-Cafe
mailing list