[trac@galois.com: Re: [GHC] #1218: Add sortNub and sortNubBy to
Data.List]
John Meacham
john at repetae.net
Wed Mar 21 17:06:39 EDT 2007
On Mon, Mar 19, 2007 at 11:22:47AM +1100, Duncan Coutts wrote:
> On Mon, 2007-03-19 at 11:02 +1100, Donald Bruce Stewart wrote:
> > I propose we *do not* change the api to add the special case:
> >
> > sortNub = sort . nub
> > = map head . group . sort
> >
> > and instead add a rewrite rule to Data.List to provide this
> > optimisation.
> >
> > {-# RULES
> > "sort/nub" sort . nub = map head . group . sort
> > #-}
>
> Though of course we'd want to use the more efficient implementation than
> that. One can write a sort that directly eliminates duplicates, and I
> think you'd want a rule for nub . sort as well as sort . nub.
>
> And while we're at it we can add a rule to rewrite all the places people
> have used map head . group . sort (or the goupBy f . sortBy f
> equivalents) to this equals-eliminating sort implementation.
note, this is most likely not what you want at all:
> > {-# RULES
> > "sort/nub" sort . nub = map head . group . sort
> > #-}
desugars to
> > {-# RULES
> > "sort/nub" (.) sort nub = map head . group . sort
> > #-}
meaning you are actually adding a RULE to (.), not sort or nub. this
hasa couple of unfortunate consequences.
* people will have to use 'sort . nub' rather than something like sort $ nub xs
* (.) will no longer be inlined until much later in the optimization
passes, which perhaps could have disasterous consequences
that said, I do not believe this is an appropriate use of rules to begin
with, it changes the asymptotic complexity, so should be available as an
explicit option, and is just a generally useful routine. not that rules
can't also exist, but attached to the right thing.
(sort (nub xs)) = nubSort xs
(nub (sort xs)) = nubSort xs
now, I would also like to see (and it might turn out to be more useful),
fastNub, which is a nub that uses Ord and sets to be just as fast, but
works on infinite lists. accidentally evaluating a whole list in memory
is the quickest way to a space leak and 'sort' is the most common
culprit. this is why 'nub' can often be _faster_ than sortNub when you
only partially evaluate a list, with ordNub, you get the benefits of
both!
John
--
John Meacham - ⑆repetae.net⑆john⑈
More information about the Libraries
mailing list