[Haskell] extensible records using associated types
Barney Hilken
b.hilken at ntlworld.com
Mon Jun 19 18:40:11 EDT 2006
I'm sorry if this example has already been posted, but I couldn't
find it by searching the archives. If it's new (and works properly),
it provides more evidence that ATs are a good thing.
You can use associated type synonyms to implement polymorphic
extensible records.
------------------------------------------------------------------------
---------
We will represent records as values of the form
N1 x1 (N2 x2 ... (Nn xn Empty)...)
with corresponding type
N1 a1 (N2 a2 ... (Nn an Empty)...)
where the field name N is defined by the declaration
data N a r = N a r
For convenience, we define the type of field name constructors:
> type Constructor n = forall a. forall r. a -> r -> n a r
so that N :: Constructor N for each field name N.
Next we define a class (:<:) whose purpose is to tell the compiler in
which order to store fields:
> class (m :: * -> * -> *) :<: (n :: * -> * -> *)
The idea is that if N and M are distinct field names which might
appear the same record, then there must be an instance of exactly one
of N :<: M and M :<: N.
The class Contains n r asserts that r is a record type with a field
labelled n:
> class Contains n r where
> type Project n r
> type Delete n r
> project :: Constructor n -> r -> Project n r
> delete :: Constructor n -> r -> Delete n r
There are two associated types: Project n r is the type of the n
field, and Delete n r is the type of the rest of the record. The
functions project and delete give the corresponding values.
The class Lacks n r asserts that r is a record type without a field
labelled n:
> class Lacks n r where
> type Extend n a r
> extend :: Constructor n -> a -> r -> Extend n a r
There is one associated type: Extend n a r is the type of the record
obtained by adding a field of type a labelled n. Note that according
to the AT paper, the parameter a should come last, but this order is
much more convenient. The function extend adds the new field to the
record.
The class Disjoint r s asserts that r and s are record types with no
fields in common:
> class Disjoint r s where
> type Union r s
> union :: r -> s -> Union r s
There is one associated type: Union r s is the type of the merged
record, and union is the merge function.
Next we define the empty record:
> data Empty = Empty
It has no fields, so has no instances of Contains, but lots of
instances of Lacks:
> instance Lacks n Empty where
> type Extend n a Empty = n a Empty
> extend nn x Empty = nn x Empty
It is also disjoint from everything:
> instance Disjoint Empty r where
> type Union Empty r = r
> union Empty t = t
Each field name must be defined as a data type together with
instances of the classes Contains, Lacks and Disjoint as follows:
> data N a r = N a r
The type N a r certainly contains a field named N:
> instance Contains N (N a r) where
> type Project N (N a r) = a
> type Delete N (N a r) = r
> project N (N x t) = x
> delete N (N x t) = t
It also Contains all the fields of r, though you only need to check
names which are below N:
> instance m :<: N, Contains m r => Contains m (N a r) where
> type Project m (N a r) = Project m r
> type Delete m (N a r) = N a (Delete m r)
> project mm (N x t) = project mm t
> delete mm (N x t) = N x (delete mm t)
Similarly, it Lacks all the fields which r Lacks, except for N
itself. This is where we really need the ordering :<: to ensure that
m is not equal to N. There are two cases, m :<: N:
> instance m :<: N, Lacks m r => Lacks m (N a r) where
> type Extend m b (N a r) = N a (Extend m b r)
> extend mm y (N x t) = N x (extend mm y t)
and N :<: m:
> instance N :<: m => Lacks m (N a r) where
> type Extend m b (N a r) = m b (N a r)
> extend mm y (N x t) = mm y (N x t)
To ensure that N a r is disjoint from s, it is enough that r is
disjoint from s and N is not in their union:
> instance Disjoint r s, Lacks N (Union r s) => Disjoint (N a r) s
where
> type Union (N a r) s = Extend N a (Union r s)
> union (N x t) u = extend N x (union t u)
Finally, we must maintain the ordering :<:. It has to be a global
linear order, so if we know the last field name was M, we can define:
> instance M :<: N
> instance m :<: M => m :<: N
Usage.
------
The record {N1 = x1, ... Nn = xn} should be constructed as (extend N1
x1 $ ... $ extend Nn xn) Empty.
Extension, projection, deletion and union are all as polymorphic as
you could want, and you can even use field names as ordinary values.
For example, you can define update by:
> update :: Contains n r, Lacks n (Delete n r) => Constructor n -> a
-> r -> Extend n a (Delete n r)
> update n y t = extend n y (delete n t)
Problems.
---------
1. Pattern matching.
Pattern matching on complete records is possible (though you must
get the fields in the right order!) but there is no equivalent to the
trex .. notation for 'any other fields'.
2. Maintaining the :<: order.
Within one module this could be generated with a template, but
between modules this is more of a problem. Of course, we could use a
built-in class based on alphabetical order, but this would need
compiler support.
3. Controlling the use of constructors.
If you use N to construct records and types directly, instead of
Extend N and extend N, you get fields in the wrong order and
everything breaks. The type system can't stop you doing this.
4. Testing.
I don't actually know if this works, because phrac won't compile
it. We really need the second-order polymorphism of field
constructors, and phrac doesn't have forall. (There is also the
trivial point of the order of arguments to Extend).
Barney.
More information about the Haskell
mailing list