Proposal: add ordNub somewhere in containers
Herbert Valerio Riedel
hvriedel at gmail.com
Wed Oct 18 12:35:26 UTC 2017
On 2017-10-16 at 18:17:19 -0400, Gershom B wrote:
[...]
> 1) I propose two new functions,
>
> `ordNub` and `intNub`
>
> with the standard implementation (from https://github.com/nh2/haskell-ordnub):
>
> import qualified Data.Set as Set
>
> ordNub :: (Ord a) => [a] -> [a]
> ordNub l = go Set.empty l
> where
> go _ [] = []
> go s (x:xs) = if x `Set.member` s then go s xs
> else x : go (Set.insert x s) xs
>
> and the same implementation, but specialized to `Int` and using `IntSet`s.
I'm generally +1 for an 'ordNub'; I don't mind the particular color of
the shed.
While at it, I'd like to use the occasion to point out two minor things
with that specific implementation:
a) One thing I've been missing from `container's API are "UPSERT"-ish
operations; In the case of `Set`, the implementation above performs
a lookup, and then has to start the insertion from scratch.
If we had something like,
Set.tryInsert :: Ord a => a -> Set a -> (Set a, Bool)
or
Set.tryInsert :: Ord a => a -> Set a -> Maybe (Set a)
`ordNub` would benefit.
b) This is more of an observation, as there's not much we can do here
with little effort:
iirc, `nub` has a worst-case space complexity (if the input list is
made up of N distinct values, e.g. (`nub [1..n] :: [Int])) of 3N
words for the [a]-spine to keep track of the already seen values.
Using `Set` however trades time-complexity for memory-complexity,
and needs 5N words for the 'Set a'-spine.
c.f. http://blog.johantibell.com/2011/06/memory-footprints-of-some-common-data.html
-- hvr
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 818 bytes
Desc: not available
URL: <http://mail.haskell.org/pipermail/libraries/attachments/20171018/b20cd08a/attachment.sig>
More information about the Libraries
mailing list