New extension in GHC: newtype deriving clause

Fri, 21 Dec 2001 17:04:05 +0100 (MET)

Simon PJ and I have come up with an extension to newtype deriving clauses
which is intended to make it easier to make types abstract.

Suppose you are about to write a type definition. You face a choice between
defining it as a type synonym, and making it a new, potentially abstract
type. Often a newtype may be the "right" choice, but that may require
substantial extra work. Specifically, if the representation type already has a
number of useful class instances, which you also want on the newtype, then a
type synonym gives you them for free -- but a newtype definition forces you to
write an instance declaration for each instance you want. These instance
declarations are boring to write: they just strip off and put back the newtype
constructor here and there, and call the representation instance. Especially
galling is that in the implementation, the newtype constructor doesn't even
appear -- so all your work is "for nothing", the dictionary for the
representation type would actually work perfectly well for the new type
also. The work of defining these instances may be enough to put you off making
the new type abstract.

The new extension allows you to derive these instances, just by adding the
classes to the deriving clause of the newtype declaration. It's implemented in
GHC already and will be part of the next release. I've appended the manual
entry describing the extension to this message: that contains examples and a
careful definition of when an instance can be derived.

I'll just remark that you can also use this extension to emulate Hugs'
restricted type synonyms. Suppose you want to implement an abstract type of
sets as lists, and define the set operations without needing to constantly add
and remove a newtype constructor. You can do so as follows:

1. Define a class containing the operations inside the abstraction barrier.
   This corresponds to giving the type signatures of these functions in the
   Hugs construct. 

      class SetLike s where
        empty :: s a
	single :: a -> s a
	union :: s a -> s a -> s a
	member :: Eq a => a -> s a -> Bool

   (This isn't supposed to be a general Collection class: we're just defining
   the types of function on sets-as-lists),

2. Define an instance for the representation type. Of course, there's no need
   to mess with any newtype constructor here.

      instance SetLike [] where
        empty = []
	single x = [x]
	union = (++)
	member = elem

3. Define a new Set type, and derive a class instance for it.

      newtype Set a = Set [a] deriving SetLike

That's it. Of course, the Set operations are now overloaded, which has its own
penalties, but nevertheless we managed to define operations on an abstract
type without explicitly worrying about the constructor.

Here's the manual entry:

Extended deriving clause for newtype
====================================

When you define an abstract type using newtype, you may want the new type to
inherit some instances from its representation. In Haskell 98, you can inherit
instances of Eq, Ord, Enum and Bounded by deriving them, but for any other
classes you have to write an explicit instance declaration. For example, if
you define

newtype Dollars = Dollars Int

and you want to use arithmetic on Dollars, you have to explicitly define an
instance of Num:

instance Num Dollars where
  Dollars a + Dollars b = Dollars (a+b)
  ...

All the instance does is apply and remove the newtype constructor. It is
particularly galling that, since the constructor doesn't appear at run-time,
this instance declaration defines a dictionary which is wholly equivalent to
the Int dictionary, only slower!

GHC permits such instances to be derived instead, so one can write

newtype Dollars = Dollars Int deriving (Eq,Show,Num)

and the implementation uses the same Num dictionary for Dollars as for
Int. Notionally, the compiler derives an instance declaration of the form

instance Num Int => Num Dollars

which just adds or removes the newtype constructor according to the type.

We can also derive instances of constructor classes in a similar way. For
example, suppose we have implemented state and failure monad transformers,
such that

instance Monad m => Monad (State s m)
instance Monad m => Monad (Failure m)

In Haskell 98, we can define a parsing monad by

type Parser tok m a = State [tok] (Failure m) a

which is automatically a monad thanks to the instance declarations above. With
the extension, we can make the parser type abstract, without needing to write
an instance of class Monad, via

newtype Parser tok m a = Parser (State [tok] (Failure m) a)
  deriving Monad

In this case the derived instance declaration is of the form

instance Monad (State [tok] (Failure m)) => Monad (Parser tok m)

Notice that, since Monad is a constructor class, the instance is a partial
application of the new type, not the entire left hand side. We can imagine
that the type declaration is ``eta-converted'' to generate the context of the
instance declaration.

We can even derive instances of multi-parameter classes, provided the newtype
is the last class parameter. In this case, a ``partial application'' of the
class appears in the deriving clause. For example, given the class

class StateMonad s m | m -> s where ...
instance Monad m => StateMonad s (State s m) where ...

then we can derive an instance of StateMonad for Parsers by

newtype Parser tok m a = Parser (State [tok] (Failure m) a)
  deriving (Monad, StateMonad [tok])

The derived instance is obtained by completing the application of the class to
the new type:

instance StateMonad [tok] (State [tok] (Failure m)) =>
           StateMonad [tok] (Parser tok m)

As a result of this extension, all derived instances in newtype declarations
are treated uniformly (and implemented just by reusing the dictionary for the
representation type), except Show and Read, which really behave differently
for the newtype and its representation.

Derived instance declarations are constructed as follows. Consider the
declaration (after expansion of any type synonyms)

newtype T v1...vn = T' (TC t1...tp vk+1...vn) deriving (c1...cm)

where TC is a type constructor, t1...tp are types, vk+1...vn are type
variables which do not occur in any of the ti, and the ci are partial
applications of classes of the form C t1'...tj'.  The derived instance
declarations are, for each ci,

instance ci (TC t1...tp vk+1...vl) => ci (T v1...vl)

where l is chosen so that T v1...vl is of the right kind for the last
parameter of class Ci.

As an example which does not work, consider

newtype NonMonad m s = NonMonad (State s m s) deriving Monad

Here we cannot derive the instance

instance Monad (State s m) => Monad (NonMonad m)

because the type variable s occurs in State s m, and so cannot be
``eta-converted'' away. It is a good thing that this deriving clause is
rejected, because NonMonad m is not, in fact, a monad --- for the same
reason. Try defining >>= with the correct type: you won't be able to.

Notice also that the order of class parameters becomes important, since we can
only derive instances for the last one. If the StateMonad class above were
instead defined as

class StateMonad m s | m -> s where ...

then we would not have been able to derive an instance for the Parser
type above. We hypothesise that multi-parameter classes usually have one
``main'' parameter for which deriving new instances is most interesting.