Proposal #2629: Data.List: Replace nub; add nubOrd, nubInt, nubWith

Sun Sep 28 14:48:01 EDT 2008

I also think that it is better to use RULES and to hide at least
nubInt. For nubOrd I am not sure whether it should be exported or
hidden because it requires different type class.

Isn't the usage of multi-parameter type class with functional
dependencies and flexible instances an overkill for such a simple job?
I would write nubWith in this way:

nubWith :: (e -> s -> Maybe s) -> s -> [e] -> [e]
nubWith _ _ [] = []
nubWith f  s (e : es)
   case f e s of
      Just s   -> e : nubWith f s es
      Nothing -> nubWith f s es

It is portable and perhaps more flexible. With your type class you
assume that it is efficient and possible to have separated memberS and
insertS which might not be the case. If you have only one function
then you leave the decision to the implementor.

Regards,
  Krasimir

On Sun, Sep 28, 2008 at 8:05 PM, Alexander Dunlap
<alexander.dunlap at gmail.com> wrote:
> On Sun, Sep 28, 2008 at 12:49 AM, Bart Massey <bart at cs.pdx.edu> wrote:
>> http://hackage.haskell.org/trac/ghc/ticket/2629
>>
>> Everyone always complains about nub, but nobody ever does
>> anything about it except (map head . group . sort), which
>> isn't lazy and isn't always faster. :-)
>>
>> I've implemented a new function nubWith that takes a
>> "stop list" as an argument and filters its target list
>> against the stop list.  I've then re-implemented nub and
>> implemented nubOrd and nubInt in terms of nubWith: the stop
>> list is a typeclass, so these implementations are trivial
>> and new implementations are easily added.  nubBy is left
>> alone, since there's nothing obvious to be done about it.
>> All of the nubs are still fully lazy.
>>
>> Basic QuickCheck tests are provided, and pass.
>>
>> Performance benchmarking code is provided.  The performance
>> of my nub implementation is quite comparable to that of
>> the standard one.  My nubOrd and nubInt implementations
>> are dramatically faster, since they use a Set and IntSet
>> respectively for the stop list.  In particular, they
>> are performant on long lists with long nubs, unlike the
>> basic nub.
>>
>> My implementation is available via git at
>>  git://svcs.cs.pdx.edu/git/nub.git
>> or can be browsed at
>>  http://svcs.cs.pdx.edu/gitweb?p=nub.git;a=tree
>> and has a maybe-outdated tarball at
>>  http://svcs.cs.pdx.edu/haskell/nub.tar.gz
>> The Nub.hs file itself is attached to the proposal.
>> If the proposal is accepted, I will prepare a patch
>> against current GHC library top-of-tree, but for now it
>> seems easier for everyone to just look at the bits in
>> their current natural habitat.
>>
>>
>> _______________________________________________
>> Libraries mailing list
>> Libraries at haskell.org
>> http://www.haskell.org/mailman/listinfo/libraries
>>
>
> Hi all,
>
> This seems like a good idea but it's kind of strange to have three
> different exposed versions of nub. Would it be possible to hide them,
> hide the StopList typeclass and use {-# RULES #-} pragmas to use the
> faster versions when possible?
>
> Alex
> _______________________________________________
> Libraries mailing list
> Libraries at haskell.org
> http://www.haskell.org/mailman/listinfo/libraries
>