why is Data.Set not a Monad?

Sun May 6 23:56:35 EDT 2007

Hi Dan,

Thank you for your interesting report. I'm sorry that I did not figure
some of those things out myself.

I guess I'd thought that the primary use of Set would be to replace
lists for applications where order doesn't matter, or maybe where an
efficient 'union' is needed; but many of these (e.g. parsing) commonly
use a monadic interface, so it seems strange for Set not to have one.

But I can't suggest a solution better than the ones you have proposed.

Frederik

On Sun, May 06, 2007 at 04:00:09PM -0400, Dan Doel wrote:
> On Sunday 06 May 2007, Frederik Eaton wrote:
> > Hello,
> >
> > Anyone know why Data.Set is not a Monad? Or Data.Map?
> >
> > It seems Data.Map is a Functor, but not Data.Set...
> >
> > I am confused. Am I not importing the right modules? Thanks,
> 
> These questions all have slightly different answers, I think. Somewhat out of 
> order:
> 
> 1) Data.Map isn't a monad because it isn't one. Consider:
> 
>   return :: a -> Map k a
> 
> What key will return choose to inject a value into a map? The only option that 
> leaps to mind is:
> 
>   return a = singleton undefined a
> 
> But that's hardly useful.
> 
> 2) Data.Set is a monad, but you can't convince Haskell's type system of that 
> the way things are currently structured. This is because set operations 
> require elements to be instances of the Ord typeclass, and the Monad 
> typeclass signature doesn't allow for that. In the current typeclass, you 
> have:
> 
>   return :: a -> m a
> 
> While Set requires:
> 
>   return :: Ord a => a -> m a -- return = singleton
> 
> It is possible to structure things so that Set can be given a Monad instance, 
> but it may require some extensions. Here's one such way. Consider the 
> following typeclass:
> 
>   class Monad m a where
>     return :: a -> m a
>     (>>=) :: (Monad m b) => m a -> (a -> m b) -> m b
> 
> Here, 'm' is the monad type constructor, and 'a' will be the types it works 
> on. An instance is needed for each such allowable a. In Set's case, with 
> undecidable instances, this is easy:
> 
>   instance (Ord a) => Monad' Set a where
>     return a = singleton a
>     ...
> 
> The (Ord a) context is provided for return. However, bind is still a problem, 
> because the obvious definition:
> 
>   m >>= f = fold (union . f) empty m
> 	:: (Ord b) => Set a -> (a -> Set b) -> Set b
> 
> Has an (Ord b) context that we can't provide. One way (not the only one, I'm 
> sure) to solve this is to rebuild Set as a GADT, so that the (Ord b) context 
> is packaged with the set. This can be simulated by wrapping the existing set 
> (suppose the current set is imported qualified):
> 
>   data Set a where
>     Empty :: Set a
>     Wrap :: Ord a => Set a -> Set a
> 
>   singleton :: Ord a => a -> Set a
>   singleton a = Wrap (Set.singleton a)
> 
>   union :: Set a -> Set a -> Set a
>   union Empty t = t
>   union s Empty = s
>   union (Wrap s) (Wrap t) = Wrap (Set.union s t)
> 
>   fold :: (a -> b -> b) -> b -> Set a -> b
>   fold _ z Empty = z
>   fold f z (Wrap s) = Set.fold f z s
> 
> Now, since the GADT union doesn't require an Ord context, we can write:
> 
>   instance Ord a => Monad Set a where
>     return a = singleton a -- The same as before
>     s >>= f = fold (union . f) Empty s
> 
> So, we're finally at a 'valid' Monad instance for Set, and it only took us 
> multi-parameter type classes, undecidable instances, and GADTs. :) Existing 
> monads can be declared members of the revised class like so:
> 
>   instance Monad [] a where
>     return a = [a]
>     l >>= f = foldr ((++) . f) [] l
> 
> Simply not restricting the parameter 'a' leaves you with the case we currently 
> have.
> 
> Other approaches have been suggested, I think, but this is one. I'm not sure 
> it's an advisable road to take, as it's rather complicated, but it's an 
> option.
> 
> 3) As for Functors, it's easy to define an fmap operation for Map k v. You 
> take your function of type (v -> u) and apply it to each element, storing the 
> result at the same key. However, consider the type of the obvious fmap 
> implementation for Set a:
> 
>   fmap f s = fold (insert . f) empty s :: Ord b => (a -> b) -> Set a -> Set b
> 
> This requires an (Ord b) context that is absent from the Functor method's 
> signature, just as we ran into problems with the signatures in the current 
> Monad class. In fact, Set's 'map' function requires Ord contexts for a and b 
> both.
> 
> The problem here is that you can't use the same tricks as above to provide the 
> (Ord b) context for fmap via a GADT (insert still requires a provided 
> context). I suppose you could, of course, take *both* a and b as parameters 
> to the class, so that you could place constraints on both. The same thing 
> would work for Monad above, off the top of my head, and you could avoid the 
> GADT. However, I suspect if you start down a road of taking a parameter for 
> each distinct type variable in your method signatures, and having to declare 
> (whether explicitly or implicitly) n^m instances for a class, rather than 1, 
> things are going to get hairy.
> 
> Data.Set does provide a 'mapMonotonic' function of type:
> 
>   (a -> b) -> Set a -> Set b
> 
> which is the right type, but it appears to assume that the function respects 
> the ordering, such that:
> 
>   a1 `compare` a2 == f a1 `compare` f a2
> 
> or something like that. You could, therefore, pass in a function that doesn't 
> follow that, and up with a Set that isn't a set. Thus, it's unsuitable for 
> use as fmap.
> 
> Anyhow, I hope that made some sense at least, and answered some of your 
> questions. I'll attach a file that has a slightly more fleshed out 
> implementation of the GADT set wrapper, along with a monad instance, in case 
> you want to play with it. Some things are renamed a bit compared to the 
> above, since, for example, there already is a Monad typeclass.
> 
> Cheers.
> 
> -- Dan

> {-# OPTIONS -fglasgow-exts -fallow-undecidable-instances #-}
> 
> module GADTSet where
> 
> import qualified Data.Set as Set
> 
> data Set v where
>     Empty :: Set v
>     Wrap  :: Ord v => Set.Set v -> Set v
> 
> instance Show v => Show (Set v) where
>     show Empty = "Empty"
>     show (Wrap s) = show s
> 
> empty :: Set v
> empty = Empty
> 
> singleton :: Ord v => v -> Set v
> singleton v = Wrap $ Set.singleton v
> 
> insert :: Ord v => v -> Set v -> Set v
> insert v Empty    = Wrap $ Set.singleton v
> insert v (Wrap s) = Wrap $ Set.insert v s
> 
> union :: Set v -> Set v -> Set v
> union Empty t = t
> union s Empty = s
> union (Wrap s) (Wrap t) = Wrap $ Set.union s t
> 
> member :: v -> Set v -> Bool
> member _ Empty = False
> member v (Wrap s) = Set.member v s
> 
> fromList :: Ord v => [v] -> Set v
> fromList l = Wrap $ Set.fromList l
> 
> fold :: (a -> b -> b) -> b -> Set a -> b
> fold _ z Empty = z
> fold f z (Wrap s) = Set.fold f z s
> 
> class ExtMonad m a where
>     ret :: a -> m a
>     (>>>=) :: ExtMonad m b => m a -> (a -> m b) -> m b
> 
> instance Ord v => ExtMonad Set v where
>     ret = singleton
>     s >>>= f = fold (union . f) Empty s
> 
> instance ExtMonad [] a where
>     ret a = [a]
>     l >>>= f = foldr ((++) . f) [] l

> _______________________________________________
> Libraries mailing list
> Libraries at haskell.org
> http://www.haskell.org/mailman/listinfo/libraries

-- 
http://ofb.net/~frederik/