Proposal: add ordNub somewhere in containers

David Feuer david.feuer at gmail.com
Mon Oct 16 22:44:41 UTC 2017


I would imagine

ordNub :: (Foldable t, Ord a)
       => t a -> [a]
ordNub xs = foldr go (const []) xs Set.empty where
  go x r s
    | x `Set.member` s = r s
    | otherwise = x : r (Set.insert x s)

which would suggest also

ordNubR :: (Foldable t, Ord a)
       => t a -> [a]
ordNubR xs = foldl go (const []) xs Set.empty where
  go r x s
    | x `Set.member` s = r s
    | otherwise = x : r (Set.insert x s)

For containers biased the other way.

Another question: do you have anything else you think should be stuck in
the new module?

On Oct 16, 2017 6:18 PM, "Gershom B" <gershomb at gmail.com> wrote:

There have been many discussions over the years about adding an
efficient order preserving nub somewhere to our core libraries. It
always comes down to the same issue: an efficient nub wants to be
backed by an efficient `Set`, but the API of the `nub` itself doesn't
make reference to any other data structures besides lists. So it feels
a bit conceptually strange to put an efficient nub anywhere besides
`Data.List` even though it can't go there without inverting our
dependency tree in a weird way or inlining an efficient set
implementation into the middle of it.

Nonetheless, the convenience of having a good `nub` lying around in a
core library is undeniable, and after writing the "usual" one in my
code for the zillionth time, I decided to raise an issue about it:

https://github.com/haskell/containers/issues/439

I was promptly directed here to make a proper proposal.

So, here:

1) I propose two new functions,

`ordNub` and `intNub`

with the standard implementation (from https://github.com/nh2/haskell-ordnub
):

import qualified Data.Set as Set

ordNub :: (Ord a) => [a] -> [a]
ordNub l = go Set.empty l
  where
    go _ [] = []
    go s (x:xs) = if x `Set.member` s then go s xs
                                      else x : go (Set.insert x s) xs

and the same implementation, but specialized to `Int` and using `IntSet`s.

The rationale for the names is that the former has a long history of
use in folklore, and the latter is the obvious specialization of it.

2) I propose these functions be added to a new module in the
`containers` library: `Data.Containers.ListUtils`. This can also
potentially in the future add efficient list intersection, etc. as
documented on the above reference link.

The rationale for the new module is that it can provide a meaningful
home for such functions which operate on lists, but require other data
structures to be implemented efficiently...

Discussion period: 2 weeks.

--Gershom
_______________________________________________
Libraries mailing list
Libraries at haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/libraries/attachments/20171016/1b1ca7a8/attachment.html>


More information about the Libraries mailing list