[Haskell] intent-typing

Tue Nov 16 05:59:15 EST 2010

Hi,

Marcus Sundman wrote:
> Hi, how would one go about implementing (or using if it's supported
> out-of-the-box) intent-typing* for haskell?

A basic technique is to use newtype declarations to declare separate 
types for separate intents.

   module StringSafety
            ( SafeString ()
            , UnsafeString ()
            , quote
            , considerUnsafe
            ) where

     newtype SafeString = SafeString String
     newtype UnsafeString = UnsafeString String

     considerUnsafe :: String -> UnsafeString
     considerUnsafe s = UnsafeString s

     quote :: UnsafeString -> SafeString
     quote (UnsafeString s) = SafeString s' where
       s' = ... s ...

This module does not export the SafeString and UnsafeString 
constructors, so we can be sure that no other code in the program can 
invent SafeStrings which are not really safe. Every string can be safely 
treated as unsafe, however, so we export a function considerUnsafe which 
does so.

Now, if we type our interface to the outside world as

   getInput      ::  ... -> UnsafeString
   sendOutput    ::  SafeString -> ...

we can be sure that a return value from getInput needs to pass through 
quote on its way to sendOutput, because quote is the only way to produce 
a SafeString.

This guarantuees safety. It has, however, a practical problem: We can't 
use the usual String functions on UnsafeString or SafeString values. For 
instance, we can't concatenate two UnsafeStrings using (++).

A naive solution would be to provide separate (++) functions for unsafe 
and safe strings:

   append_safe :: SafeString -> SafeString -> SafeString
   append_safe (SafeString x) (SafeString y)
     = SafeString (x ++ y)

   append_unsafe :: SafeString -> SafeString -> SafeString
   append_unsafe (UnsafeString x) (UnsafeString y)
     = UnsafeString (x ++ y)

Note that at least append_safe needs to be implemented in and exported 
from the StringSafety module. That is a good thing, because this 
function needs to be carefully checked for safety. The programmer needs 
to prove (or unit-test, or at least think about) the following theorem:

   If a and b are safe strings, so is a ++ b.

After this fact has been established, other modules are free to use 
append_safe however they like without possibly compromising safety.

Now, the above approach should work, but is still rather impractical: We 
need to copy the definitions of all String functions for unsafe and safe 
strings. However, since the bodies of all these copies are actually 
identical, so we can use parametric polymorphism to abstract over the 
difference between UnsafeString and SafeString. One way to achieve this 
is to use phantom types.

With phantom types, we declare only a single newtype for both safe and 
unsafe strings, but we annotate that type with an additional flag to 
distinguish safe from unsafe uses.

   module StringSafety
            ( AnnotatedString ()
            , Safe ()
            , Unsafe ()
            , quote
            , considerUnsafe
            , append
            ) where

     data Safe = Safe
     data Unsafe = Unsafe

     newtype AnnotatedString safety = AnnotatedString String

     considerUnsafe :: String -> AnnotatedString Unsafe
     considerUnsafe s = AnnotatedString s

     quote :: AnnotatedString Unsafe -> AnnotatedString Safe
     quote (AnnotatedString s) = AnnotatedString s' where
       s' = ... s ...

     append
       :: AnnotatedString a
       -> AnnotatedString a
       -> AnnotatedString a

     append (AnnotatedString x) (AnnotatedString y)
       = AnnotatedString (x ++ y)

Note that AnnotatedString does not really use its type parameter safety: 
That's why it is called a phantom type. The data constructor 
AnnotatedString can be freely used to convert between safe and unsafe 
strings, so we better not export it from the module. Inside the module, 
uses of the data constructor gives rise to proof obligations as above. 
So the programmer needs to reason that the following is true to justify 
the implementation and export of append:

   If x and y have the same safety level,
   then (x ++ y) has again that same safety level.

So now, we still have to write a wrapper around each string operation, 
but at least we need to write only one such wrapper for all intents, not 
a separate wrapper for each intent.

There is an inconvenience left: We can't concatenate safe and unsafe 
strings, because both arguments to append need to have exactly the same 
type. To fix this, we first have to figure out what the result of sucha 
concatenation would be: It would be an unsafe string, because at least 
one of the inputs is unsafe. We need to teach this kind of reasoning to 
the compiler, for instance, using type families:

   type family Join a b
   type instance Join Safe Safe = Safe
   type instance Join Safe Unsafe = Unsafe
   type instance Join Unsafe Safe = Unsafe
   type instance Join Unsafe Unsafe = Unsafe

The idea is that (Join a b) is the safety level of the result of an 
operation which consumes data of safety level a and b. I called the type 
family Join because the safety levels form a lattice, with unsafe levels 
higher then safe levels, and the type family computes the join operation 
for that lattice. So we are encoding a simple static analysis into the 
type system here!

Using this lattice structure of safety level, we can give a more 
polymorphic type to append:

   append
     :: AnnotatedString a
     -> AnnotatedString b
     -> AnnotatedString (Join a b)

The implementation can remain unchanged. Now, the programmer needs to 
reason that the following statement is true to justify this type of append:

   (x ++ y) is at least as safe as the least safe of x and y.

So to conclude, statically checked intent-typing can be expressed in 
Haskell using newtype wrappers with phantom types, and possibly some 
type-level computation on these phantom types to teach the compiler 
domain knowledge about the intents.

   Tillmann

PS. Crossposted to the haskell cafe, please consider dropping 
haskell at haskell.org when answering.