[Haskell-cafe] Why isn't "Program Derivation" a first class citizen?

Austin Seipp aseipp at pobox.com
Wed Feb 13 09:40:30 CET 2013


On Tue, Feb 12, 2013 at 4:47 PM, Nehal Patel <nehal.alum at gmail.com> wrote:
>
> A few months ago I took the Haskell plunge, and all goes well...
> ... snip ...
> And so my question is: in 2013, why isn't this process a full-fledged part of the language? I imagine I just don't know what I'm talking about, so correct me if I'm wrong, but this is how I understand the workflow used in practice with program derivation: 1) state the problem pedantically in code, 2) work out a bunch of proofs with pen and paper, 3) and then translate that back into code, perhaps leaving you with function_v1, function_v2, function_v3, etc -- that seems fine for 1998, but why is it still being done this way?

There is no one-stop shop for this sort of stuff. There are plenty of
ways to show equivalence properties for certain classes of programs.

I am unsure why you believe people tend to use pen and paper -
machines are actually pretty good at this stuff! For example, I have
been using Cryptol lately, which is a functional language from Galois
for cryptography. One feature it has is equivalence checking: you can
ask if two functions are provably equivalent. This is used in the
examples to prove that a naive Rijndael implementation is equivalent
to an optimized AES implementation.

You might be asking why you can't do this for Haskell. Well, you can -
kind of. There is a library called SBV which is very similar in spirit
to Cryptol but expressed as a Haskell DSL (and not specifically for
crypto). All SBV programs are executable as Haskell programs, but we
can also compile them to C, and prove properties about them using an
SMT solver. This is 'free' and requires no programmer intervention
beyond stating the property. So we can specify faster implementations
alongside properties showing they're equivalent to the reference. SMT
solvers are cheap to invoke and potentially very useful for certain
problems.
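
To make that concrete, here is a minimal sketch (the two functions
are made up for illustration, but `prove` and `.==` are SBV's actual
API):

    import Data.Bits (shiftL)
    import Data.SBV

    -- Two ways to double a byte: a reference and an 'optimized' one.
    slowDouble, fastDouble :: SWord8 -> SWord8
    slowDouble x = x * 2
    fastDouble x = x `shiftL` 1

    -- Ask an SMT solver whether they agree on all 256 inputs.
    -- Prints Q.E.D. on success, or a concrete counterexample.
    main :: IO ()
    main = print =<< prove (\x -> slowDouble x .== fastDouble x)

The property is stated once, as ordinary Haskell, and the solver does
the rest.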

So there are a lot of good approaches to tackling these problems in
certain domains, some more automated than others. What you can't do
in Haskell itself is provide constructive 'proofs' the way you do in
Agda: the language just isn't meant for it, and isn't really
expressive enough for it, even when GHC features are heavily abused.

> What I'm asking about might sound related to theorem provers -- but if so, I feel like what I'm thinking about is not so much the very difficult problem of automated proofs or even proof assistants, but the somewhat simpler task of proof verification. Concretely, here's a possibility of how I imagine the workflow could be:

I believe you are actually precisely talking about theorem provers, or
at least this is the impression I get. The reason for that is the next
part:

> ++ in code, pedantically set up the problem.
> ++ in code, state a theorem, and prove it -- this would require a revision to the language (Haskell 201x) and perhaps look like Isabelle's Isar -- a -structured- proof syntax that is easy for humans to construct and understand -- in particular it would be possible to easily reuse previous theorems and leave out various steps. At compile time, the compiler would check that the proof was correct, or perhaps complain that it can't see how I went from step 3 to step 4, in which case I might have to add in another step (3b) to help the compiler verify the proof.
> ++ afterwards, I would use my new theorems to create semantically identical variants of my original functions (again, this process would be integrated into the language)

Because people who write things in theorem provers reuse things! Many
large developments in Agda and Coq build on external libraries of
reusable proofs - Agda's standard library in the small, or CoRN for
Coq in the large. As for 'leaving things out' and filling them in
when needed - both of these provers have had implicit arguments for
quite a while. But you can't just state axioms and let the compiler
figure out everything in between. That's not how those tools work.

I find it very difficult to see the difference between what you are
proposing and what people are doing now - it's often possible to give
a proof of something regardless of the underlying object. You may
prove that some law holds over a given structure, for example, and
then show that some concrete 'thing' is classified by that structure.
An example is a monoid, which has an associative binary operation.
The /fact/ that the operation is associative is a law about a
monoid's behavior.

And if you think about it, we have monoids in Haskell, and we expect
those laws to hold. In fact, that's pretty much the purpose of a type
class in Haskell: to say that a type abides by a set of laws and
properties regardless of its concrete structure. It's also a bit like
an interface in Java: things which implement the interface must abide
by *some* rules that make it so.
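
Concretely, here is the Monoid class from base (minus mconcat). Note
that the laws live only in a comment - the compiler never checks
them, which is exactly the point:

    class Monoid a where
      mempty  :: a            -- the identity element
      mappend :: a -> a -> a  -- the associative operation

      -- Laws every instance is *expected* to satisfy, unchecked:
      --   mappend mempty x        == x
      --   mappend x mempty        == x
      --   mappend x (mappend y z) == mappend (mappend x y) z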

My point is, the intuition we have about what it means for programs to
be 'equivalent' or have 'structure' is very deep mathematically and
computationally. We use these properties to reason about our programs,
and their behavior and relation to other programs, and perhaps for the
sake of efficiency (but mostly, just for sanity). So naturally there
is a ridiculously huge amount of work in this field of program
correctness and derivation!

But Haskell isn't a language where you provide proofs; you can only
provide evidence. And monoids aren't very interesting on their own
anyway. So: how do we classify and codify program behavior in terms
of 'equivalence'? And are there reasons to do so beyond equivalence
checking?
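
'Evidence' can be made concrete cheaply: a QuickCheck property states
a law, and surviving a few hundred random inputs is evidence - not
proof - that the law holds. A small sketch, checking the monoid laws
at one particular instance:

    import Data.Monoid (mappend, mempty)
    import Test.QuickCheck (quickCheck)

    -- Associativity of mappend, tested at the [Int] instance.
    prop_assoc :: [Int] -> [Int] -> [Int] -> Bool
    prop_assoc x y z =
        mappend x (mappend y z) == mappend (mappend x y) z

    -- Left and right identity of mempty.
    prop_identity :: [Int] -> Bool
    prop_identity x = mappend mempty x == x && mappend x mempty == x

    main :: IO ()
    main = quickCheck prop_assoc >> quickCheck prop_identity

That's the evidence/proof divide in a nutshell: QuickCheck can
falsify a law, but it can never establish it for all inputs.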

I think at heart your question is asking: is there a rigorous notion
in which we can describe properties of Haskell programs, prove them,
and then prove other things for Haskell programs as we know them
today?

I think the answer right now is: it's limited, and probably not what
you want. There are a lot of approaches. Haskell code can be
translated into Isabelle (via Haskabelle), for example, so you can
prove things there. Zeno is an automated proof system for Haskell 98.
There's been research into Haskell "contracts" for compile-time
contract enforcement. You can use SBV to prove things about symbolic,
bit-precise programs. You can ignore the fact that Haskell has
`undefined` and `seq` and friends, and reason about and prove things
in a 'Total Haskell' subset of the language[1]. You can also write
fairly rich types in Haskell to say things, but at a certain point it
becomes a bit crazy (Cryptol does things Haskell/SBV can't really do
easily or at all, for example).
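
For a taste of the 'rich types' route, here is the classic
length-indexed vector - a sketch using GADTs and DataKinds (GHC 7.4
or later), not any particular library's API - where the type of
append states a small theorem about lengths:

    {-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeFamilies #-}

    data Nat = Z | S Nat

    -- A list whose length is part of its type.
    data Vec (n :: Nat) a where
      Nil  :: Vec 'Z a
      Cons :: a -> Vec n a -> Vec ('S n) a

    -- Type-level addition.
    type family Add (m :: Nat) (n :: Nat) :: Nat
    type instance Add 'Z     n = n
    type instance Add ('S m) n = 'S (Add m n)

    -- The type checker verifies, at compile time, that appending
    -- an m-vector to an n-vector yields an (m + n)-vector.
    append :: Vec m a -> Vec n a -> Vec (Add m n) a
    append Nil         ys = ys
    append (Cons x xs) ys = Cons x (append xs ys)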

In practice, none of these approaches is meant for grand-scale
verification, but they can certainly take you a good distance on
specific problems.

You may want to really give a theorem prover a run if you haven't
already. Some of these include Agda, Coq, Isabelle and more recently
Idris.

Finally, I think Jan Stolarek offers a very interesting perspective:
some compilers and libraries *already* make your programs faster
through proofs! Typical list deforestation via the 'foldr/build' rule
is an instance of a more general result about the behavior of
catamorphisms; I wrote about this myself a while ago[2]. GHC uses the
rule to eliminate intermediate data structures - so maybe your slow
reference implementation isn't that slow!
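
For reference, the machinery behind that rule is tiny - roughly as it
appears in GHC.Base:

    {-# LANGUAGE RankNTypes #-}

    -- build abstracts a list over its own constructors.
    build :: (forall b. (a -> b -> b) -> b -> b) -> [a]
    build g = g (:) []

    -- Consuming a build with a foldr just passes foldr's arguments
    -- straight to g, so the intermediate list is never constructed:
    {-# RULES
    "foldr/build" forall k z (g :: forall b. (a -> b -> b) -> b -> b).
                  foldr k z (build g) = g k z
      #-}

GHC's list functions like map and filter are set up to fuse through
this rule, so a pipeline like foldr (+) 0 (map (*2) [1..n]) can
compile down to a single loop with no intermediate list at all.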

[1] http://www.cse.chalmers.se/~nad/publications/danielsson-et-al-popl2006.html

[2] http://neocontra.blogspot.com/2013/02/controlling-inlining-phases-in-ghc.html

--
Regards,
Austin


