[Haskell-cafe] ordNub

David Feuer david.feuer at gmail.com
Thu Jan 1 13:43:29 UTC 2015


Er.. that last might be to hard. But Data.IntSet can support its own nub
version.
On Jan 1, 2015 8:42 AM, "David Feuer" <david.feuer at gmail.com> wrote:

> I think ordNub in Data.Set, hashNub in Data.HashSet or somewhere, and
> bitsNub in Data.Bits (for FiniteBits things).
> On Jul 14, 2013 7:21 AM, "Niklas Hambüchen" <mail at nh2.me> wrote:
>
>> tldr: nub is abnormally slow, we shouldn't use it, but we do.
>>
>>
>> As you might know, Data.List.nub is O(n²). (*)
>>
>> As you might not know, almost *all* practical Haskell projects use it,
>> and that in places where an Ord instance is given, e.g. happy, Xmonad,
>> ghc-mod, Agda, darcs, QuickCheck, yesod, shake, Cabal, haddock, and 600
>> more (see https://github.com/nh2/haskell-ordnub).
>>
>> I've taken the Ord-based O(n * log n) implementation from yi using a Set:
>>
>>   ordNub :: (Ord a) => [a] -> [a]
>>   ordNub l = go empty l
>>     where
>>       go _ []     = []
>>       go s (x:xs) = if x `member` s then go s xs
>>                                     else x : go (insert x s) xs
>>
>>
>> and put benchmarks on
>>
>> http://htmlpreview.github.io/?https://github.com/nh2/haskell-ordnub/blob/1f0a2c94a/report.html
>> (compare `nub` vs `ordNub`).
>>
>> `ordNub` is not only in a different complexity class, but even seems to
>> perform better than nub for very small numbers of actually different
>> list elements (that's the numbers before the benchmark names).
>>
>> (The benchmark also shows some other potential problem: Using a state
>> monad to keep the set instead of a function argument can be up to 20
>> times slower. Should that happen?)
>>
>> What do you think about ordNub?
>>
>> I've seen a proposal from 5 years ago about adding a *sort*Nub function
>> started by Neil, but it just died.
>>
>>
>> (*) The mentioned complexity is for the (very common) worst case, in
>> which the number of different elements in the list grows with the list
>> (alias you don't have an N element list with always only 5 different
>> things inside).
>>
>> _______________________________________________
>> Haskell-Cafe mailing list
>> Haskell-Cafe at haskell.org
>> http://www.haskell.org/mailman/listinfo/haskell-cafe
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/haskell-cafe/attachments/20150101/263d4bfc/attachment.html>


More information about the Haskell-Cafe mailing list