[Haskell-cafe] ordNub

Mon Aug 19 13:13:35 CEST 2013

> Richard A. O'Keefe <ok <at> cs.otago.ac.nz> writes:
> 
> There are at least four different things that "an Ord version" might
> mean:
> 
>  - first sort a list, then eliminate duplicates
>  - sort a list eliminating duplicates stably as you go
>    (think 'merge sort', using 'union' instead of 'merge')
>  - build a balanced tree set as you go
>  - having a list that is already sorted, use that to
>    eliminated duplicates cheaply.
> 
> These things have different costs.  For example, ...
> 
> What I want is more often ordNubBy than ordNub, though.
> 

(ordNubBy you can get via a suitable Ord instance for the element type?)

The bigger problem is that you might not have a suitable Ord instance. 
After all, sets are defined by equality/equivalence relation, not 
necessarily by Ord.

There are many other things you might want to do with a set/collection 
than just remove duplicates.

I notice that Data.Set.map = fromList . (map stuff) . toList
That is, build two lists (to be GC'd), as well as the set result. 

So does the performane cost of from/to List outrun the benefit of 
Data.Set.union? Depends how much you're mapping vs inserting and checking 
membership.