Faster `elem` "Hack" for Data.Set?
Viktor Dukhovni
ietf-dane at dukhovni.org
Sun Jan 30 00:25:16 UTC 2022
In a recent "Folding the unfoldable" "gist":
https://oleg.fi/gists/posts/2022-01-25-folding-unfoldable.html
Oleg describes a way to make unboxed Vectors Foldable:
There is another way to make Foldable work, with a
data Hack a b where
Hack :: U.Vector a -> Hack a a
This is a two type-parameter wrapper, but the types are always the
same! (I wish that could be a newtype). The Foldable instance is
simply:
instance U.Unbox a => Foldable (Hack a) where
foldr f z (Hack v) = U.foldr f z v
foldl' f z (Hack v) = U.foldl' f z v
In the associated Reddit thread
https://www.reddit.com/r/haskell/comments/sd6gel/comment/hudmsis/?utm_source=share&utm_medium=web2x&context=3
it was observed that a similar approach can be used to give Data.Set a
performant `elem` method (code copied below my signature).
It is somewhat tempting to consider whether the "Hack" ought to be
built-in directly into the real Data.Set:
https://www.reddit.com/r/haskell/comments/sd6gel/comment/huil070/?utm_source=share&utm_medium=web2x&context=3
type Set a = SetImpl a a
data SetImpl a b where
Bin :: {-# UNPACK #-} !Size -> !a -> !(SetImpl a) -> !(SetImpl a) -> SetImpl a a
Tip :: SetImpl a a
type Size = Int
instance Ord a => Foldable (SetImpl a) where
...
elem _ Tip = False
elem x (Bin _ y l r) = case compare x y of
EQ -> True
LT -> elem x l
GT -> elem x r
...
This representation does not seem to carry any obvious runtime overhead,
and gives Set a performant `elem` method. The only change from the
status quo would be that folds would not be available for empty or
singleton sets with non-Ord elements (currently possible with Set).
How much of a loss would it be to constrain the element type of empty
and singleton sets used in folds? Are there other issue that make the
above impractical?
--
Viktor.
type OSet a = OrdSet a a
data OrdSet a b where
OSet :: Set.Set a -> OrdSet a a
instance Ord a => Foldable (OrdSet a) where
fold (OSet xs) = fold xs
foldMap f (OSet xs) = foldMap f xs
foldMap' f (OSet xs) = foldMap' f xs
foldr f z (OSet xs) = foldr f z xs
foldr' f z (OSet xs) = foldr' f z xs
foldl f z (OSet xs) = foldl f z xs
foldl' f z (OSet xs) = foldl' f z xs
foldr1 f (OSet xs) = foldr1 f xs
foldl1 f (OSet xs) = foldl1 f xs
toList (OSet xs) = toList xs
null (OSet xs) = null xs
length (OSet xs) = length xs
-- The point of the exercise is `elem`
elem e (OSet xs) = Set.member e xs
maximum (OSet xs) = maximum xs
minimum (OSet xs) = minimum xs
sum (OSet xs) = sum xs
product (OSet xs) = product xs
instance Eq a => Eq (OSet a) where
(OSet xs) == (OSet ys) = xs == ys
instance (Eq a, Ord a) => Ord (OSet a) where
compare (OSet xs) (OSet ys) = compare xs ys
empty :: Ord a => OSet a
empty = OSet Set.empty
singleton :: Ord a => a -> OSet a
singleton = OSet . Set.singleton
fromList :: Ord a => [a] -> OSet a
fromList = OSet . Set.fromList
(\\) :: Ord a => OSet a -> OSet a -> OSet a
(OSet xs) \\ (OSet ys) = OSet (xs Set.\\ ys)
alterF :: (Ord a, Functor f) => (Bool -> f Bool) -> a -> OSet a -> f (OSet a)
alterF f x (OSet xs) = fmap OSet $ Set.alterF f x xs
...
More information about the Libraries
mailing list