Pure functional TypeRep [Was: Existentials...]

oleg@pobox.com oleg@pobox.com
Tue, 29 Jul 2003 23:05:09 -0700 (PDT)


This message shows how to map _types_ to integers at compile time --
and then extend this facility to run-time so it can be used with
existentially-quantified types. It is statically guaranteed that
different types receive different integer labels. Unlike TyCons of
Dynamics, our mapping does NOT rely on unsafePerformIO. We use the
typechecker itself to compare types. It seems our typemap can also be
used for _safe_ casts, in particular, for casting away the existential
blemish.

This message was inspired by Amr Sabry's problem on
existentials. Unlike Hal Daume's solution posted here earlier, we do
not use Dynamics nor GMap -- therefore, we completely avoid
unsafePerformIO and unsafeCoerce. Our solution is purely functional
and assuredly type-safe (to the extent the typechecker can be
trusted). Extensions used:
  -fglasgow-exts -fallow-undecidable-instances -fallow-overlapping-instances

First, the compile-time mapping from types to integers.

Polytypic list:

> data Nil t r  = Nil
> data Cons t r = Cons t r
>
> class PList ntype vtype cdrtype where
>     cdr::   ntype vtype cdrtype -> cdrtype
>     empty:: ntype vtype cdrtype -> Bool
>     value:: ntype vtype cdrtype -> vtype
>    
> instance PList Nil vtype cdrtype where
>     empty = const True
>
> instance (PList n v r) => PList Cons v' (n v r) where
>      empty = const False
>      value (Cons v r) = v
>      cdr   (Cons v r) = r

and the finite map from types to integers

> class TypeSeq t s where
>     type_index:: t -> s -> Int
>
> instance (PList Cons t r) => TypeSeq t (Cons t r) where
>     type_index _ _ = 0
>    
> instance (PList Cons t' r', TypeSeq t r') => TypeSeq t (Cons t' r') where
>     type_index v s = 1 + (type_index v $ cdr s)

Let us build some finite map

> init_typeseq = Cons (undefined::Char) $
>                Cons (undefined::Int) $
> 	         Cons (undefined::Bool) $
> 	         Cons (undefined::String) $
> 	         Cons (undefined::Maybe Char) $
> 	         Nil

A typeseq maps a type to an integer -- which is the type's position in
the polytypic list. We are satisfied with the first occurrence of the
type in the list. It is plain that a typeseq maps different types to
different integers. Let us see how it works:

-- *Main> type_index True  init_typeseq
-- 2
-- *Main> type_index 'a'  init_typeseq
-- 0
-- *Main> type_index "a"  init_typeseq
-- 3
-- *Main> type_index (1::Int)  init_typeseq
-- 1
-- *Main> type_index (Just 'a')  init_typeseq
-- 4
-- *Main> type_index (Just True)  init_typeseq

-- <interactive>:1:
--     No instance for (TypeSeq (Maybe Bool) (Nil t r))
--     arising from use of `type_index' at <interactive>:1
--     In the definition of `it': type_index (Just True) init_typeseq

We call this mapping compile-time because the translation from a type
to the corresponding encoding (1+1+...1) is done at compile time, when
the appropriate instance of the TypeSeq class is chosen. A
sufficiently smart compiler can just as well fold the constant
expression (1+1...+1) down to a single integer. Also, errors in the
translation are reported at compile-time.

However, to apply the mapping to existential types, we need to extend
the former to run-time.

> class TypeRep t where
>     tr_index:: t -> Int
>
> -- All of the following are almost the same.
> -- The Template Haskell can really help us here
> instance TypeRep Char where
>     tr_index x = type_index x init_typeseq
>
> instance TypeRep Int where
>     tr_index x = type_index x init_typeseq
>
> instance TypeRep Bool where
>     tr_index x = type_index x init_typeseq
>
> instance TypeRep String where
>     tr_index x = type_index x init_typeseq
>
> instance TypeRep (Maybe Char) where
>     tr_index x = type_index x init_typeseq

Now, Amr Sabry's Problem

> data F a b = 
>               forall c. (TypeRep c) => PushF (a -> c) (F c b) 
>             | Bottom (a -> b)
>
> f1 :: Char -> Bool
> f1 'a' = True
> f1 _ = False
>
> f2 :: Bool -> String
> f2 True = "true"
> f2 False = "false"
>
> f3 :: String -> Int
> f3 = length
>
> fs = PushF f1 (PushF f2 (PushF f3 (Bottom id)))

First, we make fs an instance of class Show. It is jolly convenient.

> data HF = forall a b. (TypeRep a,TypeRep b) => HF (F a b)
> show_fn_type (g::(a->b)) 
> 	= "(" ++ (show (tr_index (undefined::a))) ++ 
> 	  "->"++(show (tr_index (undefined::b))) ++ ")"
>	
> instance (TypeRep a, TypeRep b) => Show (F a b) where
>     show = show . hsf_to_lst . HF
>       where
>        hsf_to_lst (HF (Bottom g)) = [show_fn_type g]
> 	 hsf_to_lst (HF (PushF g next)) = (show_fn_type g):(hsf_to_lst$HF next)

Now, Amr Sabry's question:

] Is it possible to write a function 
]   f :: F a b -> T c -> F c b
] where (T c) is some type for values of type 'c' or values representing
] the type 'c' or whatever is appropriate. Thus if given the
] representation of Bool, the function should return:
]  PushF f2 (PushF f3 (Bottom id))
] and if given the representation of String the function should return
]  PushF f3 (Bottom id)
] and so on. 

First, we prepend to the given F a b structure the identity
composition. We see the meaning of it below.

> f fs v = f' (PushF id fs) v

> f':: (TypeRep c) => (F a b) -> c -> F a b
> f' here@(PushF g next) (v::tv) = 
>     if tr_index v == tr_index (g undefined)
>     then here
>     else case next of
>           PushF g1 next' -> f' (PushF (g1.g) next') v
>           Bottom g1      -> f' (Bottom (g1.g)) v

The function f' takes a data structure F a b and a value of type c and
returns another data structure F a b, but of a different structure
	PushF g next
here 'next' has the type F c b -- which is the answer to Amr Sambry's
question. The function g::a->c is a composition of all previously
occurring functions. This is a neat "side-effect" of the function f':
we partially compose the given F a b structure up to the given type.

Let us see how it works:

     *Main> fs
     ["(0->2)","(2->3)","(3->1)","(1->1)"]
     *Main> f fs 'a'
     ["(0->0)","(0->2)","(2->3)","(3->1)","(1->1)"]
     *Main> f fs True
     ["(0->2)","(2->3)","(3->1)","(1->1)"]
     *Main> f fs "a" 
     ["(0->3)","(3->1)","(1->1)"]
     *Main> f fs (1::Int)
     ["(0->1)","(1->1)"]

Because f is capable of producing any partial composition, it can do
the full composition just as easily:

> flatten:: (TypeRep a, TypeRep b) => F a b -> (a -> b)
> flatten fs :: (a->b)
>     = case f fs (undefined::b) of
>         PushF g (Bottom g1) -> g1.g

And it indeed works:

*Main> flatten fs 'a'
4 -- the length of the word true
*Main> flatten fs 'b'
5 -- the length of the word false

Again, I can't seem to write any Haskell code that doesn't explicitly
use undefined.