[Haskell] Fixed-length vectors in Haskell, Part 1: Using GADTs

Sat May 7 21:36:33 EDT 2005

I wrote this up a few days ago and thought I'd share it with the list.
It's yet another implementation of fixed-length vectors (that is, lists
which reflect their length in their type). It's a nice demonstration of
GADTs and an unexpected use of a technique Ralf Hinze describes in
"Generics for the Masses".

--

This is the first of two modules implementing fixed-length vectors in
Haskell. This one uses Generalized Algebraic Datatypes (GADTs), a recent
extension currently only available in GHC 6.4. The second will implement
the same functions using no extensions to Haskell 98.

This message is literate Haskell. To use, save it as "Vector_GADT.lhs".

> {-#OPTIONS -fglasgow-exts #-}
> module Vector_GADT where

(Note: The Arrow Transformer Library[1] includes a class for
"Sequences"[2], which lie between functors and monads in terms of power.
If you have the arrows package installed, change the lines beginning
with "*>" to ">" to use the instance declared here.)

    [1] <http://haskell.org/arrows/download.html>
    [2] <http://haskell.org/arrows/arrows/Control.Sequence.html>

*> import Control.Sequence

We want to create a type for vectors where the length of the vector is
reflected in its type. We start by defining type-level naturals:

> data Zero
> data Succ n

These don't need data constructors because they will only appear as
phantom types.

Some aliases for convenience:

> type One   = Succ Zero
> type Two   = Succ One
> type Three = Succ Two
> type Four  = Succ Three

Next, the vectors themselves:

> data Vec n a where
>     Nil  ::                 Vec Zero a
>     Cons :: a -> Vec n a -> Vec (Succ n) a

The first argument to Vec represents the length of the vector. Nil
constructs empty vectors (length zero), and Cons adds an element to
a vector of length n, creating one of length n+1.

It's possible to declare head and tail as total functions by indicating
with their types that they require vectors with non-zero lengths:

> vhead :: Vec (Succ n) a -> a
> vhead (Cons x xs) = x
>
> vtail :: Vec (Succ n) a -> Vec n a
> vtail (Cons x xs) = xs

Similarly, we can declare a version of zipWith which requires its vector
arguments to have the same length:

> vzipWith :: (a -> b -> c) -> Vec n a -> Vec n b -> Vec n c
> vzipWith f Nil         Nil         = Nil
> vzipWith f (Cons x xs) (Cons y ys) = Cons (f x y) (vzipWith f xs ys)

Some other useful functions:

> vfoldr :: (a -> b -> b) -> b -> Vec n a -> b
> vfoldr f z Nil         = z
> vfoldr f z (Cons x xs) = f x (vfoldr f z xs)
>
> toList :: Vec n a -> [a]
> toList = vfoldr (:) []
>
> inner :: Num a => Vec n a -> Vec n a -> a
> inner x y = vfoldr (+) 0 (vzipWith (*) x y)

Note that we can't type "vfoldr Cons Nil", because the result type of
Cons does not match the second argument.

Here's an instance for Functor.

> instance Functor (Vec n) where
>     fmap f Nil         = Nil
>     fmap f (Cons x xs) = Cons (f x) (fmap f xs)

We would like a function "vec :: a -> Vec n a", such that "vec x"
returns a vector of length n with every element equal to x. The basic
algorithm is simple:

    vec{0}   x = Nil
    vec{n+1} x = Cons x (vec{n} x)

The usual way to declare a function like this involves a type class,
which lets us define functions with different behaviors for different
types:

    class Nat n where vec :: a -> Vec n a

    instance          Nat Zero     where vec x = Nil
    instance Nat n => Nat (Succ n) where vec x = Cons x (vec x)

This gives vec the type "Nat n => a -> Vec n a", which is good because
it points out that the type argument to Vec must be a natural, formed
from Zero or Succ applied to another natural.

However, this solution is unsatisfactory, because it requires the Nat
class to include every possible function which might depend on a
type-level argument.

The solution is a technique Ralf Hinze describes in "Generics for the
Masses"[3]: create another class for functions which depend on the type
parameter.

    [3] <http://www.informatik.uni-bonn.de/~ralf/publications.html#P21>

Thus:

> class Nat n where
>     natCase :: NatCase g => g n
>
> class NatCase g where
>     caseZero ::          g Zero
>     caseSucc :: Nat n => g (Succ n)
>
> instance          Nat Zero     where natCase = caseZero
> instance Nat n => Nat (Succ n) where natCase = caseSucc

If you're unfamiliar with Mr Hinze's technique, this may seem pretty
opaque. The idea is to give each function that depends on a natural
type parameter a distinct type, using newtype, and then give the
member functions for caseZero and caseSucc as the cases where the
parameter is Zero and Succ n, respectively.

For example, here is the code for vec:

> newtype MkVec a n = MkVec { runMkVec :: a -> Vec n a }
>
> vec :: Nat n => a -> Vec n a
> vec = runMkVec natCase
>
> instance NatCase (MkVec a) where
>     caseZero = MkVec (\x -> Nil)
>     caseSucc = MkVec (\x -> Cons x (vec x))

Here's another function, which converts a list to a vector, provided
the list is long enough:

> newtype FromList a n = FromList
>     { runFromList :: [a] -> Maybe (Vec n a) }
>
> fromList :: Nat n => [a] -> Maybe (Vec n a)
> fromList = runFromList natCase
>
> instance NatCase (FromList a) where
>     caseZero = FromList (\l -> Just Nil)
>     caseSucc = FromList (\l -> case l of
>                    x:xs -> fmap (Cons x) (fromList xs)
>                    []   -> Nothing
>                )

WIth this technique we can create as many functions which have
type-specific behavior for naturals as we need, without having to create
new classes or member functions.

With vec, we can make Vec n an instance of Sequence, from the Arrow
Transformer Library:

*> instance Nat n => Sequence (Vec n) where
*>     lift0 = vec
*>     lift2 = vzipWith

In fact, we can go further and make Vec n a monad. Since we already have
fmap, we just need an equivalent to join; that is, a function that takes
a vector of vectors, and returns a vector. One possibility is to take
the nth element of the nth vector, like so:

> diag :: Vec n (Vec n a) -> Vec n a
> diag Nil         = Nil
> diag (Cons x xs) = Cons (vhead x) (diag (fmap vtail xs))
>
> instance Nat n => Monad (Vec n) where
>     return  = vec
>     m >>= k = diag (fmap k m)

This might make the Sequence instance seem superfluous, since 'lift0'
and 'lift2' correspond to 'return' and 'liftM2', respectively, but
'lift2' is able to take advantage of the vector's structure in a way
that 'liftM2' cannot. Thus, lift2 is O(n), whereas liftM2 has three
calls to diag, which is O(n^2). ('fmap' is better than 'liftM' for
similar reasons.)

More interestingly, we can make Vec n a an instance of Num:

> instance Show a => Show (Vec n a) where show = show . toList
> 
> instance Eq a => Eq (Vec n a) where
>     Nil       == Nil       = True
>     Cons x xs == Cons y ys = x == y && xs == ys
>
>     -- alternately, x == y = vfoldr (&&) True (vzipWith (==) x y)
>
> instance (Nat n, Num a) => Num (Vec n a) where
>     (+) = vzipWith (+)
>     (*) = vzipWith (*)
>     (-) = vzipWith (-)
>     negate      = fmap negate
>     abs         = fmap abs
>     signum      = fmap signum
>     fromInteger = vec . fromInteger

This gives us element-wise arithmetic on equal-sized vectors and
scalar arithmetic.

    *Vector_GADT> let v = Cons 1 (Cons 2 (Cons 3 (Cons 4 Nil)))
    *Vector_GADT> v
    [1,2,3,4]
    *Vector_GADT> 3 + v
    [4,5,6,7]
    *Vector_GADT> 3 * v
    [3,6,9,12]
    *Vector_GADT> v + v
    [2,4,6,8]
    *Vector_GADT> v * v
    [1,4,9,16]

Using fixed-length vectors, we can create matrices which are guaranteed
to be rectangular.

> type Matrix m n a = Vec m (Vec n a)
>
> mat :: (Nat m, Nat n) => a -> Matrix m n a
> mat x = vec (vec x)
>
> fromLists :: (Nat m, Nat n) => [[a]] -> Maybe (Matrix m n a)
> fromLists xs = mapM fromList xs >>= fromList
>
> row :: Vec n a -> Matrix One n a
> row v = Cons v Nil
>
> col :: Vec n a -> Matrix n One a
> col = fmap (\x -> Cons x Nil)

Because matrices are just vectors of vectors, the old instances for
Eq, Show, and Num still apply.

    *Vector_GADT>
        let Just (m::Matrix Two Two Int) = fromLists [[1,2],[3,4]]
    *Vector_GADT> m
    [[1,2],[3,4]]
    *Vector_GADT> m + 1
    [[2,3],[4,5]]
    *Vector_GADT> m * m + m == m * (m + 1)
    True

We can define functions to transpose and multiply matrices that
guarantee the dimensions will work out.

> transpose :: Nat n => Matrix m n a -> Matrix n m a
> transpose Nil         = vec Nil
> transpose (Cons x xs) = vzipWith Cons x (transpose xs)
>
> mult :: (Nat n, Num a) => Matrix k m a -> Matrix m n a -> Matrix k n a
> mult x y = fmap (\r -> fmap (inner r) y') x
>     where y' = transpose y

    *Vector_GADT> row v `mult` col v
    [[30]]
    *Vector_GADT> col v `mult` row v
    [[1,2,3,4],[2,4,6,8],[3,6,9,12],[4,8,12,16]]

We can even define a zero-argument function that creates an identity
matrix of the appropriate size.

> newtype IdMat a n = IdMat { runIdMat :: Matrix n n a }
>
> idMat :: (Nat n, Num a) => Matrix n n a
> idMat = runIdMat natCase
>
> instance Num a => NatCase (IdMat a) where
>     caseZero = IdMat Nil
>     caseSucc = IdMat (Cons (Cons 1 (vec 0)) (fmap (Cons 0) idMat))

    *Vector_GADT> idMat :: Matrix Four Four Int
    [[1,0,0,0],[0,1,0,0],[0,0,1,0],[0,0,0,1]]

    *Vector_GADT> idMat :: Matrix Two Three Int

    <interactive>:1:0:
        Couldn't match `One' against `Zero'
          Expected type: Matrix Two Three Int
          Inferred type: Matrix Two Two a
        In the expression: idMat :: Matrix Two Three Int
        In the definition of `it': it = idMat :: Matrix Two Three Int
-- 
David Menendez <zednenem at psualum.com> <http://www.eyrie.org/~zednenem/>