Faster `elem` "Hack" for Data.Set?
David Feuer
david.feuer at gmail.com
Sun Jan 30 00:59:51 UTC 2022
The tighter `Foldable` instance probably wouldn't hurt too much in
practice. I have some other concerns, however. One is that
`containers` has a long history of attempting to remain compatible
with potential future Haskell 98-like Haskell implementations (likely
with a few days of work to patch up mistakes when the time comes).
Using a GADT to define Set would break that quite thoroughly. My
second concern is more practical:
type Set a = SetImpl a a
is all sorts of problematic. It forces anyone who wants to use what's
always been the `Set` type constructor (Set :: Type -> Type) to dig
into the implementation and use `SetImpl :: Type -> Type -> Type`
instead. That's quite a different beast. I don't know all the things
that will break, but they'll surely include
https://hackage.haskell.org/package/constrained-monads-0.5.0.0/docs/Control-Monad-Constrained.html
. I think the real answer is to remove `elem` from `Foldable` and put
it somewhere more appropriate.
On Sat, Jan 29, 2022 at 7:26 PM Viktor Dukhovni <ietf-dane at dukhovni.org> wrote:
>
>
> In a recent "Folding the unfoldable" "gist":
>
> https://oleg.fi/gists/posts/2022-01-25-folding-unfoldable.html
>
> Oleg describes a way to make unboxed Vectors Foldable:
>
> There is another way to make Foldable work, with a
>
> data Hack a b where
> Hack :: U.Vector a -> Hack a a
>
> This is a two type-parameter wrapper, but the types are always the
> same! (I wish that could be a newtype). The Foldable instance is
> simply:
>
> instance U.Unbox a => Foldable (Hack a) where
> foldr f z (Hack v) = U.foldr f z v
> foldl' f z (Hack v) = U.foldl' f z v
>
> In the associated Reddit thread
>
> https://www.reddit.com/r/haskell/comments/sd6gel/comment/hudmsis/?utm_source=share&utm_medium=web2x&context=3
>
> it was observed that a similar approach can be used to give Data.Set a
> performant `elem` method (code copied below my signature).
>
> It is somewhat tempting to consider whether the "Hack" ought to be
> built-in directly into the real Data.Set:
>
> https://www.reddit.com/r/haskell/comments/sd6gel/comment/huil070/?utm_source=share&utm_medium=web2x&context=3
>
> type Set a = SetImpl a a
> data SetImpl a b where
> Bin :: {-# UNPACK #-} !Size -> !a -> !(SetImpl a) -> !(SetImpl a) -> SetImpl a a
> Tip :: SetImpl a a
> type Size = Int
>
> instance Ord a => Foldable (SetImpl a) where
> ...
> elem _ Tip = False
> elem x (Bin _ y l r) = case compare x y of
> EQ -> True
> LT -> elem x l
> GT -> elem x r
> ...
>
> This representation does not seem to carry any obvious runtime overhead,
> and gives Set a performant `elem` method. The only change from the
> status quo would be that folds would not be available for empty or
> singleton sets with non-Ord elements (currently possible with Set).
>
> How much of a loss would it be to constrain the element type of empty
> and singleton sets used in folds? Are there other issue that make the
> above impractical?
>
> --
> Viktor.
>
> type OSet a = OrdSet a a
> data OrdSet a b where
> OSet :: Set.Set a -> OrdSet a a
>
> instance Ord a => Foldable (OrdSet a) where
> fold (OSet xs) = fold xs
> foldMap f (OSet xs) = foldMap f xs
> foldMap' f (OSet xs) = foldMap' f xs
> foldr f z (OSet xs) = foldr f z xs
> foldr' f z (OSet xs) = foldr' f z xs
> foldl f z (OSet xs) = foldl f z xs
> foldl' f z (OSet xs) = foldl' f z xs
> foldr1 f (OSet xs) = foldr1 f xs
> foldl1 f (OSet xs) = foldl1 f xs
> toList (OSet xs) = toList xs
> null (OSet xs) = null xs
> length (OSet xs) = length xs
> -- The point of the exercise is `elem`
> elem e (OSet xs) = Set.member e xs
> maximum (OSet xs) = maximum xs
> minimum (OSet xs) = minimum xs
> sum (OSet xs) = sum xs
> product (OSet xs) = product xs
>
> instance Eq a => Eq (OSet a) where
> (OSet xs) == (OSet ys) = xs == ys
>
> instance (Eq a, Ord a) => Ord (OSet a) where
> compare (OSet xs) (OSet ys) = compare xs ys
>
> empty :: Ord a => OSet a
> empty = OSet Set.empty
>
> singleton :: Ord a => a -> OSet a
> singleton = OSet . Set.singleton
>
> fromList :: Ord a => [a] -> OSet a
> fromList = OSet . Set.fromList
>
> (\\) :: Ord a => OSet a -> OSet a -> OSet a
> (OSet xs) \\ (OSet ys) = OSet (xs Set.\\ ys)
>
> alterF :: (Ord a, Functor f) => (Bool -> f Bool) -> a -> OSet a -> f (OSet a)
> alterF f x (OSet xs) = fmap OSet $ Set.alterF f x xs
>
> ...
> _______________________________________________
> Libraries mailing list
> Libraries at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
More information about the Libraries
mailing list