Safe and sound STRef [Was Implementing RefMonads in Haskell without ST,IO]

Thu, 5 Jun 2003 02:06:48 -0500

Oleg,

This is a very neat solution to providing coercion-free references in a
local environment.

I'm trying to generalize this to some sort of Monad-like typeclass, where
there is a nice mapping from Monad's to this new and more powerful
typeclass, so that one can combine typed references, IO, etc., into a single
framework.

It seems to me that your approach couldn't be generalized in this way in
Haskell, because the type of the resulting reference-using "computation"
depends on the precise set of heap operations performed there.  So, for
example, you couldn't do something like:

    ..
    a <- newRef Int 123
    b <- if (some conditional) then (newRef Int) else (a)
    ..

Because the heap types propagated out of the conditional's two branches
differ.

The only way I can see generalizing your technique to support this sort of
thing requires a type system supporting both dependent types and subtyping.
Think the reference monad as looking somewhat like the state monad; instead
of a single piece of state, it propagates a dependent-typed pair of
(heapTypeFunc,heapValueFunc) similar in spirit to your PList construct, with
the type guaranteeing that any heap operation returns an output heapTypeFunc
that's a contravariant extension of its input heapTypeFunc.

Conjecture: Implementing type-safe (coercion-free), composable computations
over typed references isn't possible in Haskell.  By composable I mean that
some operator similar in spirit to >>= on Monads can be implemented and that
computations with differing effects can occur in if-branches.

But then again, I had not thought the problem you solved using PList to be
solveable in Haskell, and am very eager to be proven wrong!

-Tim

----- Original Message -----
Back in September 2001, Koen Claessen wrote:
] Here is a little experiment I did a while ago. I was trying to isolate
] the capability of the ST monad to deal with different types at the
] same time....  I conjecture this functionality cannot be implemented
] in Haskell 98, nor in any of the known safe extensions of Haskell.

Recently, Tim Sweeney wrote
] Is it possible to actually implement a working instance of RefMonad in
] Haskell, without making use of a built-in monad like IO or ST?

The following code shows a safe and sound implementation of a
polymorphic heap with references and updates. The heap is capable of
storing of polymorphic, functional and IO values. All operations are
*statically* checked. An attempt to alter a heap reference with a
value of a mismatched type leads to a _compile-time_ error. Everything
is implemented in Haskell98 + multiparameter classes with functional
dependencies + overlapping instances. I suspect that the latter isn't
strictly needed, but it's almost midnight. Most importantly, no IO or
ST monad, no unsafePerformIO or unsafeCoerce, no existential types and
no Dynamics are needed. It seems that the polymorphic updateable heap
can be implemented safely. Although the code looks like a monadic
code, the Monad class doesn't seem to be polymorphic enough to wrap
our heap. Perhaps arrows will help.

I'd like to remark first that the ST monad with polymorphic STRef can
be implemented in Haskell98 + safe extensions _provided_ all the
values question are in the class Show/Read. When we store values,
we store their external representation. When we retrieve a value, we
read it. Similarly for values in the Binary class. All such values are
_safely_ coercible. The following code however does not make this
assumption. In our heap below, we can even store polymorphic
functions and IO values!

First, the tags for values in our heap

> data Zero
> data Succ a
>
> class HeapTag a where
>     tag_value:: a -> Int
>
> instance HeapTag Zero where
>     tag_value _ = 0
>			-- I just can't avoid the undefined arithmetics
> instance (HeapTag t) => HeapTag (Succ t) where
>     tag_value _ = 1 + tag_value (undefined::t)

The heap reference is the combination of the tag and the desired
type. As we will see, the value of the second argument of the HeapRef
doesn't actually matter -- only its type does.

> data HeapRef t a = HeapRef t a

Our heap is implemented as a polymorphic associative list

> data Nil t v r  = Nil
> data Cons t v r = Cons t v r
>
> class PList ntype ttype vtype cdrtype where
>     cdr::   ntype ttype vtype cdrtype -> cdrtype
>     empty:: ntype ttype vtype cdrtype -> Bool
>     value:: ntype ttype vtype cdrtype -> vtype
>     tag::   ntype ttype vtype cdrtype -> ttype
>
> instance PList Nil ttype vtype cdrtype where
>     empty = const True
>
> instance (PList n t v r,HeapTag tag) => PList Cons tag vtype (n t v r)
where
>      empty = const False
>      value (Cons t v r) = v
>      tag   (Cons t v r) = t
>      cdr   (Cons t v r) = r

The following is the statically typed polymorphic heap itself:

> class Heap t v h | t h -> v where
>     fetch:: (HeapRef t v) -> h -> v
>     alter:: (HeapRef t v) -> v -> h -> h
>
> instance (HeapTag t,PList Cons t v r) => Heap t v (Cons t v r) where
>     fetch _ p = value p
>     alter _ vnew (Cons t v r) = (Cons t vnew r)
>
> instance (HeapTag t,Heap t v r,PList Cons t' v' r) =>
> 	   Heap t v (Cons t' v' r) where
>     fetch ref p = fetch ref $ cdr p
>     alter ref vnew (Cons t v r) = Cons t v $ alter ref vnew r

Let's make our heap instances of a class Show, so we have something to show.

> instance (PList Nil ttype vtype cdrtype) => Show (Nil ttype vtype cdrtype)
>    where
>     show _ = "[]"

> instance (Show vtype, HeapTag ttype, PList Cons ttype vtype cdrtype,
>           Show cdrtype) =>
>  	  Show (Cons ttype vtype cdrtype)
>   where
>     show x = (show $ (tag_value $ tag x, value x)) ++ " : " ++ (show $ cdr
x)

Let's take a few simple examples

> tinc (x::t) = undefined::(Succ t)
> tag_one = (undefined::(Succ Zero))
> acons = Cons

> lst1 = acons tag_one 'a' $ Nil
> lst2 = acons (tinc tag_one) 'a' $ acons tag_one True $ Nil
> test1 = fetch (HeapRef tag_one 'z') lst1

The result is 'a'. We see that the value of the second argument of
HeapRef doesn't actually matter. But its type sure does: if we
uncomment the following line

> -- test1' = fetch (HeapRef tag_one (1::Int)) lst1

we get an error:

/tmp/o1.lhs:127:
    Couldn't match `Char' against `Int'
        Expected type: Char
        Inferred type: Int
    When using functional dependencies to combine
      Heap t v (Cons t v r),
        arising from the instance declaration at /tmp/o1.lhs:91
      Heap (Succ Zero) Int (Cons (Succ Zero) Char (Nil t v r)),
        arising from use of `fetch' at /tmp/o1.lhs:127
    When generalising the type(s) for `test1''

Indeed, the stored value is of type Char and we try to read it as an
Int.

More tests:

> test21 = fetch (HeapRef tag_one False) lst2
> -- the latter test again makes sure that fetching is type safe
> --test22' = fetch (HeapRef (tinc tag_one) False) lst2
> test22 = fetch (HeapRef (tinc tag_one) (undefined::Char)) lst2

Testing alternation:

> testa1 = alter (HeapRef tag_one (undefined::Bool)) False lst2
> --gives: (2,'a') : (1,False) : []
> testa2 = alter (HeapRef (tinc tag_one) (undefined::Char)) 'y' lst2
> --gives: (2,'y') : (1,True) : []

Now we can bundle our heap with an allocation pointer

> -- p is the heap, t is the next tag to allocate in p
> data GHeap t p = GHeap t p
>
> instance (HeapTag t,Show p) => Show (GHeap t p) where
>     show (GHeap t p) = "Global heap: alloc ptr " ++ (show$tag_value t) ++
>                        "\n" ++ (show p)

> init_gh = GHeap (undefined::Zero) Nil
> fetch_gh ref (GHeap _ p) = fetch ref p
> alloc_gh x (GHeap t p) = (HeapRef t x,GHeap (tinc t) (Cons t x p))
> alter_gh ref newval (GHeap t p) = GHeap t $ alter ref newval p

And finally the big test. The test shows storing and altering regular
values, polymorphic function values and even IO values.

> test3 = do
>	let heap = init_gh
>       let (xref,heap1) = alloc_gh 'a' heap
> 	let (yref,heap2) = alloc_gh [(1::Int),2,3] heap1
> 	let (zref,heap3) = alloc_gh (Just False) heap2
> 	putStrLn "\nAfter allocations"
> 	print heap3
>
> 	putStr "x is "; print $ fetch_gh xref heap3
> 	putStr "y is "; print $ fetch_gh yref heap3
> 	putStr "z is "; print $ fetch_gh zref heap3
>
> 	let heap31 = alter_gh xref 'z' heap3
> 	let heap32 = alter_gh yref [] heap31
> 	let heap33 = alter_gh zref (Just True) heap32
> 	putStrLn "\nAfter updates"
> 	print heap33
> 	putStr "x is "; print $ fetch_gh xref heap33
> 	putStr "y is "; print $ fetch_gh yref heap33
> 	putStr "z is "; print $ fetch_gh zref heap33
>
> 	putStrLn "\nPolymorphic and IO values"
> 	let (gref,heap4) = alloc_gh (\x->x+1) heap33
> 	let (href,heap5) = alloc_gh id heap4
> 	let (mref,heap7) = alloc_gh (putStrLn "Monad!") heap5
> 	putStr "g 1 is "; print $ (fetch_gh gref heap4) (1::Int)
> 	putStr "h True is "; print $ (fetch_gh href heap5) True
> 	putStr "h 'a' is "; print $ (fetch_gh href heap5) 'a'
> 	putStr "m is "; (fetch_gh mref heap7)
>
> 	let heap71 = alter_gh mref (putStrLn "New Monad") heap7
> 	let heap72 = alter_gh gref (\x->x+5) heap71
> 	putStrLn "\nAfter updates to polymorphic and IO values"
> 	putStr "g 1 is "; print $ (fetch_gh gref heap72) (1::Int)
> 	putStr "m is "; (fetch_gh mref heap72)
> 	return ()

I had many occasions to see that an attempt to retrieve the value of a
wrong type or change the value with that of a wrong type result in a
compiler error! Although we are "altering" polymorphic values, the type
safeness seems to be preserved

> test4 = do let heap = init_gh
>            let (xref,heap1) = alloc_gh [] heap
>            let (yref,heap2) = alloc_gh [(1::Int)] heap
>            let (zref,heap3) = alloc_gh [True] heap
> 	     let heap11 = alter_gh xref [(1::Int)] heap1
> 	     --let z = fetch_gh zref heap11
> 	     let z = fetch_gh zref heap1
> 	     print z

If we uncomment the commented line, we will get a type error!

The lines like
*> 	   let (yref,heap2) = alloc_gh [(1::Int),2,3] heap1
*> 	   let (zref,heap3) = alloc_gh (Just False) heap2

almost suggest a monad. Alas, the Monad class doesn't seem to be
polymorphic enough. The type of the >>= is
	forall m b a. (Monad m) => m a -> (a -> m b) -> m b
although 'b' at the end doesn't have to be the same as 'a' at the
beginning, 'm' must be the same in the argument of >>= and in its
result. However, in our example, heap2 has a type different from that
of heap1.

ghci -fglasgow-exts -fallow-undecidable-instances
     -fallow-overlapping-instances /tmp/o1.lhs