[Haskell-cafe] Type system speculation
Andrew Coppin
andrewcoppin at btinternet.com
Wed Dec 9 15:38:14 EST 2009
People in the Haskell community get awfully excited about Haskell's type
system.
When I was first learning Haskell, I found this rather odd. After all, a
"type" is just a flat name that tells the compiler how many bits to
allocate and which operations to allow, right?
As I read further, I quickly discovered that in Haskell, the type system
is something much more powerful. In a normal language, the type system
prevents you from accidentally trying to multiply a string by a binary
tree of integers, or pass a function an signed integer when it's
expecting an unsigned integer. But in Haskell, the combination of
parametric polymorphism, typeclasses and automatic type inference
combine to form a potent mixture. Even without any exotic language
extensions, in Haskell you can use the type system to *prove stuff about
your programs*! And not necessary stuff that's anything to do with the
usual notion of a "data type". You can prove that lists are non-empty or
even of a specific size. You can prove that optional values get checked
before they are used. And you can prove that user input is sanitised
before being passed to sensitive functions. (Or rather, you can make
your program simply not compile if the required properties are not
guaranteed.)
Let's take an example. Suppose you have some functions that obtain input
from a user, and you have some other functions that do something with
this data. What you can do is something like this:
newtype Data x y = Data x
data Checked
data Unchecked
check_string :: Data String Unchecked -> Data String Checked
Now make it so that everything that obtains data from the user returns
Data String Unchecked, and everything that uses a string to do something
potentially hazardous demands Data String Checked. (And any functions
that don't care whether the string is checked or not can just use Data
String x.)
Only one problem: Any functions that expect a plain String now won't
work. Particularly, the entirity of Data.List will refuse to accept this
stuff. Which is kind of galling, considering that a Data String Checked
*is* a String!
What we're really trying to do here is attach additional information to
a value - information which exists only in the type checker's head, but
has no effect on runtime behaviour (other than determining whether we
*get* to runtime). As far as I can tell, Haskell does not provide any
way to take an existing type and attach additional information to it in
such a way that code which cares about the new information can make use
of it, but existing code which doesn't care continues to work. Any
comments on this one?
On a rather unrelated note, I found myself thinking about the Type
Families extension (or whatever the hell it's called). I found that I
can do this:
class Collection c where
type Element c :: *
empty :: c
first :: c -> Maybe (Element c)
...
This works for collections that can contain *anything*:
instance Collection [x] where
type Element [x] = x
...
It works for collections that can only contain *one* type:
instance Collection ByteString where
type Element ByteString = Word8
...
And it works for collections that can handle more than one type, but
less than *every* type:
instance Ord x => Collection (Set x) where
type Element (Set x) = x
...
Something quite interesting is happening here: this "Element" thing is
almost like a *function* which takes one type and returns another. A
function that operates on types themselves.
Now suppose I want to write a function that converts one kind of
collection into another:
convert :: (Collection c1, Collection c2) => c1 -> c2
Erm, no. That doesn't work AT ALL. What we want to say is that the
element types are constrained (specifically, they're identical).
I searched for quite some time looking for a way to say this. In the
end, I gave up and let the compiler deduce the type signature
automatically. Apparently the correct signature is
convert :: (Collection c1, Collection c2, Element c1 ~ Element c2) =>
c1 -> c2
Apparently the bizare syntax "Element c1 ~ Element c2" means that these
two types are supposed to be equal. (Why not just an equals sign?)
This got me thinking... Maybe what we'd really like to say is something like
convert :: (Collection c1 {Element = x}, Collection c2 {Element = x})
=> c1 -> c2
In other words, make the assosiated type like a named field of the
class, using syntax similar to the named fields that values can have.
And then, if we can do it to classes, why not types?
data Map {keys :: *, values :: *} = ...
insert :: (Ord k) => k -> v -> Map {keys = k, values = v} -> Map {keys
= k, values = v}
Now, it seems reasonable that if one of the type fields is
unconstrained, you do not need to mention it:
keys :: (Ord k) => Map {keys = k} -> [k]
values :: Map {values = v} -> [v]
If we can have some mechanism for adding new type fields to an existing
type, we are one step away from the problem at the beginning. You could
take the existing String type, and add (by whatever the mechanism is) a
new "safety" field to it. All of Data.List ignores this field, but any
new code you write can choose to use it if required.
At least, that's my vague idea. Anybody like it?
(My first thought is that "Map {keys = k, values = v}" is going to start
getting tedious to type very, very quickly. Perhaps we should do
something like values :: (m :: Map {values = v}) => m -> [v], or even
allow values :: (m :: Map {}) => m -> [values m] as an alternative? IDK.)
So there you have it. Multiple paragraphs of semi-directed rambling
about type systems. All written by somebody who doesn't even know what a
Skolem variable is. :-}
More information about the Haskell-Cafe
mailing list