[Haskell-cafe] Developing Programs and Proofs Spontaneously using GADT

Sat Aug 4 11:25:21 EDT 2007

I am curious about the possibility of developing Haskell programs
spontaneously with proofs about their properties and have the
type checker verify the proofs for us, in a way one would do in
a dependently typed language.

In the exercise below, I tried to redo part of the merge-sort
example in Altenkirch, McBride, and McKinna's introduction to
Epigram [1]: deal the input list into a binary tree, and fold
the tree by the function merging two sorted lists into one.
The property I am going to show is merely that the length of
the input list is preserved.

Given that dependent types and GADTs are such popular topics,
I believe the same must have been done before, and there may be
better ways to do it. If so, please give me some comments or
references. Any comments are welcomed.

 > {-# OPTIONS_GHC -fglasgow-exts #-}

To begin with, we define the usual type-level representation
of natural numbers and lists indexed by their lengths.

 > data Z = Z       deriving Show
 > data S a = S a   deriving Show

 > data List a n where
 >   Nil :: List a Z
 >   Cons :: a -> List a n -> List a (S n)

1. Append

To warm up, let us see the familiar "append" example.
Unfortunately, unlike Omega, Haskell does not provide type
functions. I am not sure which is the best way to
emulate type functions. One possibility is to introduce
the following GADT:

 > data Plus m n k where    --- m + n = k
 >   PlusZ :: Plus Z n n
 >   PlusS :: Plus m n k -> Plus (S m) n (S k)

such that Plus m n k represents a proof that m + n = k.

Not having type functions, one of the possible ways to do
append is to have the function, given two lists of lengths
m and n, return a list of length k and a proof that m + n = k.
Thus, the type of append would be:

   append :: List a m -> List a n
               -> exists k. (List a k, Plus m n k)

In Haskell, the existential quantifier is mimicked by forall.
We define:

 > data DepSum a p = forall i . DepSum (a i) (p i)

The term "dependent sum" is borrowed from the Omega tutorial
of Sheard, Hook, and Linger [2] (why is it called a sum,
not a product?).

The function append can thus be defined as:

 > append :: List a m -> List a n -> DepSum (List a) (Plus m n)
 > append Nil ys = DepSum ys PlusZ
 > append (Cons x xs) ys =
 >    case (append xs ys) of
 >      DepSum zs p -> DepSum (Cons x zs) (PlusS p)

Another possibility is to provide append a proof that m + n = k.
The type and definition of of append would be:

< append :: Plus m n k -> List a m -> List a n -> List a k
< append PlusZ Nil ys = ys
< append (PlusS pf) (Cons x xs) ys = Cons x (append pf xs ys)

I thought the second append would be more difficult to
use: to append two lists, I have to provide a proof about
their lengths! It turns out that this append actually composes
easier with other parts of the program. We will come to this
later.

2. Some Lemmas

Here are some lemmas represented as functions on terms. The
function, for example, converts a proof of m + (1+n) = k
to a proof of (1+m) + n = k.

 > incAssocL :: Plus m (S n) k -> Plus (S m) n k
 > incAssocL PlusZ = PlusS PlusZ
 > incAssocL (PlusS p) = PlusS (incAssocL p)

 > incAssocR :: Plus (S m) n k -> Plus m (S n) k
 > incAssocR (PlusS p) = plusMono p

 > plusMono :: Plus m n k -> Plus m (S n) (S k)
 > plusMono PlusZ = PlusZ
 > plusMono (PlusS p) = PlusS (plusMono p)

For example, the following function revcat performs list
reversal by an accumulating parameter. The invariant we
maintain is m + n = k. To prove that the invariant holds,
we have to use incAssocL.

 > revcat :: List a m -> List a n -> DepSum (List a) (Plus m n)
 > revcat Nil ys = DepSum ys PlusZ
 > revcat (Cons x xs) ys =
 >     case revcat xs (Cons x ys) of
 >       DepSum zs p -> DepSum zs (incAssocL p)

3. Merge

Apart from the proof manipulations, the function merge
is not very different from what one would expect:

 > merge :: Ord a => List a m -> List a n -> DepSum (List a) (Plus m n)
 > merge Nil ys = DepSum ys PlusZ
 > merge (Cons x xs) Nil = append (Cons x xs) Nil
 > merge (Cons x xs) (Cons y ys)
 >   | x <= y = case merge xs (Cons y ys) of
 >                 DepSum zs p -> DepSum (Cons x zs) (PlusS p)
 >   | otherwise = case merge (Cons x xs) ys of
 >                   DepSum zs p -> DepSum (Cons y zs) (plusMono p)

The lemma plusMono is used to convert a proof of
m + n = k to a proof of m + (1+n) = 1+k.

4. Sized Trees

We also index binary trees by their sizes:

 > data Tree a n where
 >   Nul :: Tree a Z
 >   Tip :: a -> Tree a (S Z)
 >   Bin :: Tree a n1 -> Tree a n ->
 >             (Plus p n n1, Plus n1 n k) -> Tree a k

The two trees given to the constructor Bin have sizes n1 and n
respectively. The resulting tree, of size k, comes with a proof
that n1 + n = k. Furthermore, we want to maintain an invariant
that n1 either equals n, or is bigger than n by one. This is
represented by the proof Plus p n n1. In the definition of
insertT later, p is either PlusZ or PlusS PlusZ.

5. Lists to Trees

The function insertT inserts an element into a tree:

 > insertT :: a -> Tree a n -> Tree a (S n)
 > insertT x Nul = Tip x
 > insertT x (Tip y) = Bin (Tip x) (Tip y) (PlusZ, PlusS PlusZ)
 > insertT x (Bin t u (PlusZ, p)) =
 >      Bin (insertT x t) u (PlusS PlusZ, PlusS p)
 > insertT x (Bin t u (PlusS PlusZ, p)) =
 >      Bin t (insertT x u) (PlusZ, PlusS (incAssocR p))

Note that whenever we construct a tree using Bin, the
first proof, corresponding to the difference in size of
the two subtrees, is either PlusZ or PlusS PlusZ.

The counterpart of foldr on indexed list is defined by:

 > foldrd :: (forall k . (a -> b k -> b (S k))) -> b Z
 >               -> List a n -> b n
 > foldrd f e Nil = e
 > foldrd f e (Cons x xs) = f x (foldrd f e xs)

The result is also an indexed type (b n).

The function deal :: List a n -> Tree a n, building a
tree out of a list, can be defined as a fold:

 > deal :: List a n -> Tree a n
 > deal = foldrd insertT Nul

6. Trees to Lists, and Merge Sort

The next step is to fold through the tree by the function
merge. The first two clauses are simple:

 > mergeT :: Ord a => Tree a n -> List a n
 > mergeT Nul = Nil
 > mergeT (Tip x) = Cons x Nil

For the third clause, one would wish that we could write
something as simple as:

   mergeT (Bin t u (_,p1)) =
      case merge (mergeT t) (mergeT u) of
        DepSum xs p -> xs

However, this does not type check. Assume that t has size
n1, and u has size n. The DepSum returned by merge consists
of a list of size i, and a proof p of type Plus m n i, for
some i. The proof p1, on the other hand, is of type P m n k
for some k. Haskell does not know that Plus m n is actually
a function and cannot conclude that i=k.

To explicitly state the equality, we assume that there is
a function plusFn which, given a proof of m + n = i and
a proof of m + n = k, yields a function converting an i
in any context to a k. That is:

   plusFn :: Plus m n i -> Plus m n k
               -> (forall f . f i -> f k)

The last clause of mergeT can be written as:

 > mergeT (Bin t u (_,p1)) =
 >    case merge (mergeT t) (mergeT u) of
 >      DepSum xs p -> plusFn p p1 xs

How do I define plusFn? I would like to employ the techniques
related to equality types [3,4,5], but currently I have not
yet figured out how. I've merely produced a version of
plusFn specialised to List a:

 > plusFn :: Plus m n h -> Plus m n k -> List a h -> List a k
 > plusFn PlusZ PlusZ xs = xs
 > plusFn (PlusS p1) (PlusS p2) (Cons x xs) =
 >         Cons x (plusFn p1 p2 xs)

Needless to say this is not satisfactory.

Now that we have both deal and mergeT, merge sort is simply
their composition:

 > msort :: Ord a => List a n -> List a n
 > msort = mergeT . deal

The function mergeT can be defined using a fold on trees
as well. Such a fold might probably look like this:

 > foldTd :: (forall m n k . Plus m n k -> b m -> b n -> b k)
 >           -> (a -> b (S Z)) -> b Z
 >           -> Tree a n -> b n
 > foldTd f g e Nul = e
 > foldTd f g e (Tip x) = g x
 > foldTd f g e (Bin t u (_,p)) =
 >    f p (foldTd f g e t) (foldTd f g e u)

   mergeT :: Ord a => Tree a n -> List a n
   mergeT = foldTd merge' (\x -> Cons x Nil) Nil
    where merge' p1 xs ys =
            case merge xs ys of
              DepSum xs p -> plusFn p p1 xs

I am not sure whether this is a "reasonable" type for
foldTd.

7. Passing in the Proof as an Argument

Previously I thought the second definition of append would be
more difficult to use, because we will have to construct a
proof about the lengths before calling append. In the context
above, however, it may actually be more appropriate to use
this style of definitions.

An alternative definition of merge taking a proof as an
argument can be defined by:

< merge :: Ord a => Plus m n k -> List a m ->
<                       List a n -> List a k
< merge PlusZ Nil ys = ys
< merge pf (Cons x xs) Nil = append pf (Cons x xs) Nil
< merge (PlusS p) (Cons x xs) (Cons y ys)
<   | x <= y    = Cons x (merge p xs (Cons y ys))
<   | otherwise = Cons y (merge (incAssocL p) (Cons x xs) ys)

A definition of mergeT using this definition of merge follows
immediately because we can simply use the proof coming with
the tree:

< mergeT :: Ord a => Tree a n -> List a n
< mergeT Nul = Nil
< mergeT (Tip x) = Cons x Nil
< mergeT (Bin t u (_,p1)) =
<   merge p1 (mergeT t) (mergeT u)

I don't know which approach can be called more "natural".

References

[1] Thorsten Altenkirch, Conor McBride, and James McKinna.
     Why Dependent Types Matter.
     http://www.e-pig.org/downloads/ydtm.pdf

[2] Tim Sheard, James Hook, and Nathan Linger.
     GADTs + Extensible Kinds = Dependent Programming.
     http://web.cecs.pdx.edu/~sheard/papers/GADT+ExtKinds.ps

[3] James Cheney, Ralf Hinze. A lightweight implementation
     of generics and dynamics.

[4] Stephanie Weirich, Type-safe cast: (functional pearl),
     ICFP 2000.

[5] Arthur I. Baars , S. Doaitse Swierstra, Typing dynamic
     typing, ACM SIGPLAN Notices, v.37 n.9, p.157-166,
     September 2002

Appendix. Auxiliary Functions

 > instance Show a => Show (List a n) where
 >   showsPrec _ Nil = ("[]"++)
 >   showsPrec _ (Cons x xs) = shows x . (':':) . shows xs

 > instance Show (Plus m n k) where
 >   showsPrec _ PlusZ = ("pZ"++)
 >   showsPrec p (PlusS pf) = showParen (p>=10) (("pS " ++) .
 >                              showsPrec 10 pf)

 > instance Show a => Show (Tree a n) where
 >   showsPrec _ Nul = ("Nul"++)
 >   showsPrec p (Tip x) = showParen (p >= 10) (("Tip " ++) . shows x)
 >   showsPrec p (Bin t u pf) =
 >     showParen (p>=10)
 >      (("Bin "++) . showsPrec 10 t . (' ':) . showsPrec 10 u .
 >       (' ':) . showsPrec 10 pf)