Deriving Functor

Thu Mar 8 15:00:51 EST 2007

Hello,

I would like to propose to add a way to automatically derive instances 
of Functor. From looking at existing code, it seems that almost all 
Functor instances I see are derivable using the algorithm presented 
here, resulting in less boilerplate code. This proposal is compatible 
with Haskell98 (and therefore also with Haskell').

Let's start with an example. The following declaration:

 > data Tree a = Leaf | Node (Tree a) a (Tree a)
 >      deriving Functor

would generate the following Functor instance:

 > instance Functor Tree where
 >      fmap f (Leaf      ) = Leaf
 >      fmap f (Node l a r) = Node (fmap f l) (f a) (fmap f r)

To be able to derive Functor in a general way, more classes are needed 
to support functors over other parameters:
 > class Functor2 f where  fmap2  :: (a -> b) -> f a x   -> f b x
 > class Functor3 f where  fmap3  :: (a -> b) -> f a x y -> f b x y
 > -- etc.
Provided instances would be:
 > instance Functor  ((,) a)  -- currently in Control.Monad.Instances
 > instance Functor2  (,)
 > instance Functor  ((,,) a b)
 > instance Functor2 ((,,) a)
 > instance Functor3  (,,)
 > instance Functor  ((,,,) a b c)
 > instance Functor2 ((,,,) a b)
 > instance Functor3 ((,,,) a)
 > instance Functor4  (,,,)
 > -- etc.

Also, a contravariant functors can come up:
 > class CoFunctor  f where  cofmap  :: (a -> b) -> f b   -> f a
 > class CoFunctor2 f where  cofmap2 :: (a -> b) -> f b x -> f a x
 > -- etc.

Now, to derive functor for a data type
 > data D a = C1 u v w | C2 x y z | ...
The instance would be:
 > instance Functor D where
 >    fmap f d = case d of
 >      C1 q r s -> C1 (fmap_<a,u> f q) (fmap_<a,v> f r) (fmap_<a,w> f s)
 >      C2 t u v -> C1 (fmap_<a,x> f t) (fmap_<a,y> f u) (fmap_<a,z> f v)
 >      ...
With the appropriate context. Here fmap_<a,b> is the deriving scheme to 
derive a functor over type b, parameterized by the type variable a:
 > fmap_<a, a>      f = f
 > fmap_<a, b>      f = id -- b does not contain a
 > fmap_<a, T x>    f = fmap  (fmap_<a,x> f)
 > fmap_<a, T x y>  f = fmap2 (fmap_<a,x> f) . fmap (fmap_<a,y> f)
 >   --etc.
 > fmap_<a, x -> y> f = \u -> fmap_<a, y> f . u . cofmap_<a, x> f
 >
 > cofmap_<a, b>      f = id -- b does not contain a
 > cofmap_<a, T x>    f = cofmap  (fmap_<a,x> f)
 > cofmap_<a, T x y>  f = cofmap2 (fmap_<a,x> f) . cofmap (fmap_<a,y> f)
 >   --etc.
 > cofmap_<a, x -> y> f = \u -> cofmap_<a, y> f . u . fmap_<a, x> f

Before type checking to determine the required instances, the 
transformations
    fmapN   id  --> id
    cofmapN id  --> id
must be applied. Otherwise unnecessary instances will be required, see 
the State example below.

Here are some examples of the deriving scheme. The derived instances are 
exactly as you would expect:

 > data Tree a = Leaf | Node (Tree a) a (Tree a)

The instance is derived as:
 > fmap f d = case d of
 >       Leaf       -> Leaf
 >       Node a b c -> Node (fmap_<a,Tree a> f   a) (fmap_<a,a> f b)
 >                          (fmap_<a,Tree a> f   c)
 >                   = Node (fmap (fmap_<a,a> f) a) (f            b)
 >                          (fmap (fmap_<a,a> f) c)
 >                   = Node (fmap f a) (f b) (fmap f c)

It also works for things like monad transformers:

 > newtype StateT s m a = StateT (s -> m (a, s))

 > fmap f (StateT a) = StateT b
 >  where b = fmap_<a, s -> m (a, s)> f a
 >          = fmap_<a, m (a, s)> f . a . cofmap_<a, s> f
 >          = fmap_<a, m (a, s)> f . a . id
 >          = fmap (fmap_<a, (a, s)> f) . a
 >          = fmap (fmap2 (fmap_<a, a> f) . fmap (fmap_<a, s> f)) . a
 >          = fmap (fmap2 f . fmap id) . a
 >          = fmap (fmap2 f) . a
 >          = \s -> fmap (\(a,s) -> (f a, s)) (a s)

Even for Cont:

 > newtype Cont r a = ContT ((a -> r) -> r)

 > fmap f (ContT a) = ContT b
 >  where b = fmap_<a, (a -> r) -> r> f a
 >          = fmap_<a, r> f . a . cofmap_<a, a -> r> f
 >          = id . a . cofmap_<a, a -> r> f
 >          = a . (\u -> cofmap_<a, r> f . u . fmap_<a,a> f)
 >          = a . (\u -> id . u . f)
 >          = a . (. f)

There are some (minor) problems with this approach. First of all the 
treatment of (->) is rather ad-hoc, consider:

 > newtype Arrow a b = a -> b    deriving (Functor, CoFunctor2)
 > data A a = A (T a -> ())      deriving Functor
 > data B a = B (Arrow (T a) ()) deriving Functor

In the first case the derived instance is:
 > instance CoFunctor T => Functor A where
 >     fmap f (A u) = A (u . cofmap f)
While for the second type the following is derived:
 > instance (Functor T, Functor2 Arrow) => Functor B where
 >     fmap f (B u) = fmap2 (fmap f)

Consider also:

 > newtype Problem a = Problem (T (U a)) deriving Functor

Now there are two possible functor instances, depending on the instances 
for T and U:

 > instance Functor Problem  where  fmap f = fmap (fmap f)
 > instance Functor Problem  where  fmap f = cofmap (cofmap f)

Currently the algorithm chooses the former, it will only use CoFunctor 
if (->) is present, and it tries to get rid of it as soon as possible. 
This also comes up when trying to derive an instance for this variation 
of Cont:

 > data C r a = C ((a -> r, a -> r) -> r) deriving Functor

Because it uses a type constructor in a contravariant position. The 
derivation goes as follows:

 > fmap f (C a) = C b
 >  where b = fmap_<a, (a -> r, a -> r) -> r>
 >          = fmap_<a, r> f . a . cofmap_<a, (a -> r, a -> r)> f
 >          = id . a . cofmap_<a, (a -> r, a -> r)> f
 >          = a . cofmap2 (fmap_<a, a->r> f) . cofmap (fmap_<a, a->r> f)
 >          = a . cofmap2 (fmap_<a, a->r> f)
 >              . cofmap (\u -> cofmap_<a, a> f . u . fmap_<a,r> f)
 >          = error, unable to realize: cofmap_<a,a>

The desired instance would be:
 >      ... = a . cofmap_<a, (a -> r, a -> r)> f
 >          = a . fmap2 (cofmap_<a, a -> r> f)
 >              . fmap (cofmap_<a, a -> r> f)
 >          = a . fmap2 (cofmap_<a, a -> r> f)
 >              . fmap (\u -> cofmap_<a,r> f . u . fmap_<a, a> f)
 >          = a . fmap2 (cofmap_<a, a -> r> f) . fmap (.f)
 >          = a . fmap2 (.f) . fmap (.f)
 >          = \(x,y) -> a . (x . f, y . f)

However, I highly doubt this problem will come up in practice. A 
'solution' would be to replace:
 > cofmap_<a, T x y>  f = cofmap2 (fmap_<a,x> f) . cofmap (fmap_<a,y> f)
with
 > cofmap_<a, T x y>  f = fmap2 (cofmap_<a,x> f) . fmap (cofmap_<a,y> f)
Thereby removing all uses of CoFunctor. Maybe that would be a better 
definition?

Finally, if Data.Foldable and Data.Traversable are added to the 
standard, they could be derived in a similair way.

Twan van Laarhoven