Proposal: add ordNub somewhere in containers

David Feuer david.feuer at gmail.com
Wed Oct 18 21:40:10 UTC 2017


The trouble with NonEmpty is that I don't like the idea of users having to
roll their own. They'd end up looking like

case ordNub (toList xs) of
  [] -> error "can't happen!"
  x : xs -> x :| xs

I hate "can't happen" errors!

The case for sequences is less compelling with fusion, for sure, but it
seems a bit strange to leave them out.

On Oct 18, 2017 5:28 PM, "Gershom B" <gershomb at gmail.com> wrote:

> I agree that `ordNubOn` suffices and we don't need `ordNubBy` since
> there's nothing lawful you can do with the latter that you can't do
> with the former. I'm indifferent on the NonEmpty and Seq case as I
> don't suspect that they will yield much more efficient implementations
> than going via lists, especially if we setup (and we should!) the
> fusion rules correctly. I have no objection to adding them for
> completeness however.
>
> If we do add them, then the proposed module name
> `Data.Containers.ListUtils` becomes slightly less appropriate, but I
> think still fine, since these are "morally" all lists of various
> sorts.
>
> -g
>
>
> On Wed, Oct 18, 2017 at 1:49 PM, David Feuer <david.feuer at gmail.com>
> wrote:
> > I am convinced that we should add
> >
> > ordNub :: Ord a => [a] -> [a]
> > ordNubOn :: Ord b => (a -> b) -> [a] -> [b]
> > intNub :: [Int] -> [Int]
> > intNubOn :: (a -> Int) -> [a] -> [a]
> >
> > And because nub preserves non-emptiness, I believe we should also offer
> >
> > ordNub1 :: Ord a => NonEmpty a -> NonEmpty a
> > ordNubOn1 :: Ord b => (a -> b) -> NonEmpty a -> NonEmpty a
> > intNub1 :: NonEmpty Int -> NonEmpty Int
> > intNubOn1 :: (a -> Int) -> NonEmpty a -> NonEmpty a
> >
> > I imagine we should also add these operations for Data.Sequence.Seq.
> >
> > I'm not yet convinced that we should add
> >
> > ordNubBy :: (a -> a -> Ordering) -> [a] -> [a]
> >
> > but I'm open to further discussion of that question. My main concern is
> that
> > the properties of the comparison argument require careful documentation.
> In
> > its favor, using it improperly cannot *expose* a broken Set to later
> > operations.
> >
> > I would very much like to hear further bikeshedding around names and
> > namespaces.
> >
> > On Oct 16, 2017 6:18 PM, "Gershom B" <gershomb at gmail.com> wrote:
> >>
> >> There have been many discussions over the years about adding an
> >> efficient order preserving nub somewhere to our core libraries. It
> >> always comes down to the same issue: an efficient nub wants to be
> >> backed by an efficient `Set`, but the API of the `nub` itself doesn't
> >> make reference to any other data structures besides lists. So it feels
> >> a bit conceptually strange to put an efficient nub anywhere besides
> >> `Data.List` even though it can't go there without inverting our
> >> dependency tree in a weird way or inlining an efficient set
> >> implementation into the middle of it.
> >>
> >> Nonetheless, the convenience of having a good `nub` lying around in a
> >> core library is undeniable, and after writing the "usual" one in my
> >> code for the zillionth time, I decided to raise an issue about it:
> >>
> >> https://github.com/haskell/containers/issues/439
> >>
> >> I was promptly directed here to make a proper proposal.
> >>
> >> So, here:
> >>
> >> 1) I propose two new functions,
> >>
> >> `ordNub` and `intNub`
> >>
> >> with the standard implementation (from
> >> https://github.com/nh2/haskell-ordnub):
> >>
> >> import qualified Data.Set as Set
> >>
> >> ordNub :: (Ord a) => [a] -> [a]
> >> ordNub l = go Set.empty l
> >>   where
> >>     go _ [] = []
> >>     go s (x:xs) = if x `Set.member` s then go s xs
> >>                                       else x : go (Set.insert x s) xs
> >>
> >> and the same implementation, but specialized to `Int` and using
> `IntSet`s.
> >>
> >> The rationale for the names is that the former has a long history of
> >> use in folklore, and the latter is the obvious specialization of it.
> >>
> >> 2) I propose these functions be added to a new module in the
> >> `containers` library: `Data.Containers.ListUtils`. This can also
> >> potentially in the future add efficient list intersection, etc. as
> >> documented on the above reference link.
> >>
> >> The rationale for the new module is that it can provide a meaningful
> >> home for such functions which operate on lists, but require other data
> >> structures to be implemented efficiently...
> >>
> >> Discussion period: 2 weeks.
> >>
> >> --Gershom
> >> _______________________________________________
> >> Libraries mailing list
> >> Libraries at haskell.org
> >> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/libraries/attachments/20171018/41e70709/attachment-0001.html>


More information about the Libraries mailing list