[Haskell-cafe] Enforcing data structures invariant in the type system

Tue Mar 31 13:41:28 UTC 2015

I thought I'd weigh in here, as my dissertation is all about how to encoding invariants into Haskell's type system better. Here are some reactions to this thread:

- In my opinion, Haskell is very capable of expressing invariants in its type system today -- you don't need to go to Idris or Agda to get there. In certain cases, this encoding can be quite natural and unobtrusive, and sometimes easier to work with than in full-on dependently typed languages. The problem is that it's very hard for a beginner in this space to tell which invariants are easy to encode and which are hard. The red-black tree example is a great one for an "easy-to-encode" invariant. The "Hasochism" paper [1] shows another. On the other hand, my two (successful) attempts at "proving" a sorting algorithm correct are very painful (insertion sort [2] and merge sort [3]). I'm sure one component in the difference in level of elegance is that Conor and Sam are better dependently-typed programmers than I am!

[1]: https://personal.cis.strath.ac.uk/conor.mcbride/pub/hasochism.pdf
[2]: https://github.com/goldfirere/singletons/blob/master/tests/compile-and-dump/InsertionSort/InsertionSortImp.hs
[3]: https://github.com/goldfirere/nyc-hug-oct2014/blob/master/OrdList.hs

- The `singletons` package [4], can be helpful in doing dependently typed programming in Haskell, and is cited in references previously mentioned in this thread. But I wouldn't rely on it too much -- once you need lots of singletons (more than, say, for numbers and Bools), your code will get very bogged down. It's either time to come up with a different way of encoding for your invariant, give up on compile-time verification, or switch to Idris.

[4]: http://hackage.haskell.org/package/singletons

- The error messages are verbose. Haskell really needs a more interactive mode for dependently typed programming! I'll answer one oft-answered question here: A "skolem" is just a *bound* type variable. This is different from other type variables GHC uses during type inference, which are free, *unification* type variables. The point of a unification variable is as a placeholder until GHC figures out the right type. A skolem variable is one that is actually bound in the source code. Here's a tiny example:

> data Exists where
>   MkE :: forall a. a -> Exists
> 
> foo :: Exists -> Bool
> foo (MkE x) = not x

This reports an error about a skolem(*) variable a not being Bool. Of course, we can't know, in the body of `foo`, that the type of `x` is `Bool` -- `x` has type `a`, for some unknown `a`. In this case `a` is a skolem.

(*): Sadly, the error message (in 7.8.3) just says "rigid" without "skolem". But the choice of words in the error message is a little capricious, and other similar cases can say "skolem".

- Liquid Haskell goes a very different way than do all of the approaches above. It uses a tool, called an SMT solver, outside of GHC to verify annotations in code. Currently, Liquid Haskell can verify conditions that other approaches can only dream of, including in Agda and Idris. (In particular, LH is very good around integers, something that is otherwise a challenge to reason about in types.) But, LH has two key limitations: 
  1) It reasons only about refinement types, which are restrictions on existing types. Examples are "odd integers", "lists of Bools with either 2 or 3 elements", "integers equal to 42", and "an integer `y` that is bigger than that other integer `x`". Often, this is exactly what you want. But sometimes it's not.
  2) Its power is limited by the solving power of the attached SMT solver. SMT solvers can reason about a lot, but -- of course -- they can't do everything. If the SMT solver can't figure out your code, it's very hard to know what to do to make things work again. However, this rarely comes up in practice.

I, personally, look forward to a future (years away) where we can have full dependent types that also use an SMT solver. This will make the easy things easy, but the hard things possible.

Richard

On Mar 30, 2015, at 3:01 PM, Marcin Mrotek <marcin.jan.mrotek at gmail.com> wrote:

> Hello,
> 
>> I appreciate this suggestion!
>> Agda has been on the todo list for a while. Which do you suggest to start
>> from, choosing between agda and idris? Are there any major differences?
> 
> I admit I only have experience in using Agda, I've name-checked Idris
> after only reading the docs (the last time I tried to install the
> compiler, cabal didn't let mi do it because of package conflits; now I
> tried again and it's doing just fine). Idris is newer and, apparently,
> more oriented towards practical programming than theorem proving,
> that's all I know.
> 
>> I’ve read a paper about liquid haskell some time ago and I find it amazing.
>> What I’d like to ask now is: is liquid haskell strictly more expressive than the
>> haskell type system or is “only” a more convenient way of writing the constraints
>> in a way that can be efficiently solved?
> 
> It depends on what you mean by "Haskell's type system". It's more
> capable than Haskell's type system without extensions, and less than
> full-on dependent typing (delibrately so, because dependent type
> equivalence is undecidable). I don't know where exactly does
> Haskell-with-GHC-extensions type system lie. I think it's the same as,
> let's say, Agda's, only requiring more kludges when moving back and
> forth between type- and value-level, but I'm not sure. I'm just a
> "self"-taught programmer, not a computer scientist ;)
> 
> Best regards,
> Marcin Mrotek
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
>