[Haskell-cafe] A question about constraints

Fri Oct 3 02:11:52 EDT 2008

On Thu, Oct 2, 2008 at 7:51 AM, jean-christophe mincke
<jeanchristophe.mincke at gmail.com> wrote:
> Hello,
>
> Thank you for your comments.
>
> Would not it be feasible to add constraints at type definition, something
> like, in a somewhat free style syntax
>
> data String2 = String2 (s::String)  with length s <= 5
>
> and with a polymorphic type
>
> data List5 a= List5 (l::[a]) with length l <= 5

Well, yeah, it is possible that to a language.  However, it's a
question of how far you take it.  What do you want that to do?  Is it
a runtime check on the constructor?  Is it a compile-time guarantee?
If it's runtime, how lazy is it -- i.e. when does it check?  If it's
compile-time, how do you enforce it?

Basically it dumps the contents of pandora's box all over the design
space, so it's easier to leave it out and just let module abstraction
take care of the hard questions.  There are definitely ways to
simulate it:

You can simulate the runtime checks as follows:

  subtype :: (a -> Bool) -> (a -> b) -> a -> b
  subtype p f x = if p x then f x else error "Subtype constraint failed"

  string2 :: String -> String2
  string2 = subtype (\s -> length s < 5) String2
  -- And don't expose String2 from the module, so string2 is the only
way to make them

But to give an example of why this is not a straightforward thing to
answer, here's a different way which might also be correct, depending
on what you want:

  string2 :: String -> String2
  string2 s = partial 5 s
    where
    partial 0 _ = error "Subtype constraint failed"
    partial n [] = []
    partial n (x:xs) = x:partial (n-1) xs

Which lazily checks the constraint; i.e. only errors if a value is
demanded beyond the index 5 and exists.

Implementing Compile-time checks is typically much harder, and demands
a bit more cleverness, since the way you do it is different for each
type of constraint you have.  For this example, you could create an
algebra of lengthed lists:

  import Prelude hiding (++)

  -- Type level numbers; Z = 0, S n = n + 1.
  data Z
  data S n

  -- Lists of length exactly n
  data Listn n a where
    Nil :: Listn Z a
    Cons :: a -> Listn n a -> Listn (S n) a

  -- Type-level addition of numbers
  type family Plus m n :: *
  type instance Plus Z n = n
  type instance Plus (S m) n = S (Plus m n)

  -- Typed append; appends the lists, adds the lengths.
  -- The compiler verifies that the implementation actually does this!
  (++) :: Listn m a -> Listn n a -> Listn (Plus m n) a
  Nil ++ ys = ys
  Cons x xs ++ ys = Cons x (xs ++ ys)

  -- Type-level less than or equal; represents the type of *proofs* that m <= n.
  data m :<= n where
    LeN :: n :<= n
    LeS :: m :<= n -> m :<= S n

  -- A List5 a is a function which takes a proof than n <= 5, and
returns a Listn n a
  type List5 a = forall n. n :<= S (S (S (S (S Z)))) -> Listn n a

So that was a bit of work, wasn't it?  But this solution is quite
expressive; the compiler will not even compile your code if you try to
construct a list with more than 5 elements.  Furthermore, you can
actually write most things you'd expect to because of Haskell's great
GADT typechecking features :-)

The downside is that you have to redefine all the standard list
operations, because, well, they have different types now.

The above is approaching what dependently typed languages do.  My
favorite is Coq (a lot of haskell folks like Agda 2), and in it you
can express directly the constraint you want without all this type
boilerplate:

  Definition List5 a := { l : List a | length l <= 5 }.

But internally it is doing something very similar to the above Haskell
program.  This also has the property that it is impossible to write a
well-typed program that constructs a List5 of length > 5.  In
addition, you can use all the standard list functions, however you
have to prove what they do to the constraint:

  Theorem append_length : forall m n a (xs ys : List a), (length xs <=
m) -> (length ys <= n) -> (length (xs ++ ys) <= m+n).
    (* prove prove prove.... *)
  Qed.

And then you can use that theorem to prove that the parts of your
program that use List5 are well-typed.

Whew, so now that we're done with that, in summary, it depends on your
situation how you want to do it, and there's really no "easy answer".
I hope you got something out of seeing these techniques.

Adding notation like that you suggest is a challenge for the language,
since a runtime constraint changes the strictness properties of the
object, and a compile-time constraint written that concisely actually
does require full dependent types.  Best to leave it to the users.

Luke