[Haskell-cafe] A backhanded compliment and a dilemma

Jeremy O'Donoghue jeremy.odonoghue at gmail.com
Thu Oct 20 09:48:11 UTC 2016


I suspect that there are few on this list who actually participate in
significant standards bodies, so it is perhaps understandable that the
replies so far tend towards "everyone uses language $FOO even though it
sucks" and "people who choose the implementation language for a standard
don't code it".

I participate in standards development across multiple bodies, and (at
least arguably) in an area where formal methods would be particularly
helpful: standards focussed around device security
<https://www.globalplatform.org/specificationsdevice.asp>. I hope my
perspective will help.

The first thing to understand is that most standards bodies have
participation from a wide range of individuals and organisations. There is
no standardised recruiting process or essential body of knowledge, although
in practice the contributors to most standards at the working group level
have a generally acknowledged level of expertise in the subject under
standardisation.

A huge part of the expectation of a member of a working group is that they
are competent to review and comment intelligently on proposed changes and
additions to the standard in question. This implies that the working group
needs to use language (in the widest sense) that is accessible to all of
its members. To be specific to my own standards participation, I work on a
set of standards around the "Trusted Execution Environment". This is
essentially a small security-oriented runtime designed to support
cryptographic operations. It offers a standard API to developers of
applications which use the environment.

The system is described mainly in terms of the behaviour of its APIs, along
with normative English-language text describing those things that are not
straightforwardly expressed by a function signature. I think this is a
fairly common approach in standardisation, and, as stated, it essentially
relies for its correctness on heroic efforts of text checking. The fact
that we occasionally find "obvious" errors in text written some years ago
is testament to the fact that this process doesn't work very well, and I
would suggest that this is actually well understood in the standards world.
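
To make that concrete, here is a sketch of the "signature plus normative
text" style transcribed into Haskell. The names and rules below are
invented for illustration; they are not taken from any actual
GlobalPlatform document:

    import Data.Word (Word32)

    -- Illustrative only: a result type of the kind such APIs return.
    data TEEResult = Success | ErrorBadParameters | ErrorOutOfMemory
      deriving (Eq, Show)

    -- The normative prose might read: "The function SHALL return
    -- ErrorBadParameters if maxKeySize is zero, and ErrorOutOfMemory
    -- if the request exceeds the available resources."  In the
    -- standard that sentence exists only as English; here it is
    -- executable, and therefore checkable.
    allocateOperation :: Word32 -> Word32 -> TEEResult
    allocateOperation maxKeySize available
      | maxKeySize == 0        = ErrorBadParameters
      | maxKeySize > available = ErrorOutOfMemory
      | otherwise              = Success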

However, and I say this with sadness, the tooling needed to adequately
support the development of machine-checked specifications by traditional
standards bodies is just not there. In the case of the Trusted Execution
Environment, which is the one I know, the closest work for which I am
aware of a formal proof is seL4 (you could certainly build a Trusted
Execution Environment on top of seL4, for example).

Here's the problem: I am, I believe, a rare example of someone who knows
(some) Haskell - a hobbyist for over 10 years - and understands the
implementation domain of the TEE well. Frankly, the seL4 proofs
<https://github.com/seL4/l4v> are beyond my ability to understand, and that
means it is beyond my ability to determine whether they really represent
an adequate formal proof of the correctness of seL4 (other than the fact
that I know who did the work, of course, and as such I actually do
trust it).

I would posit that there is no-one else in the groups within which I work
who has more than a superficial knowledge or understanding of Haskell,
proof assistants and other important concepts. In effect, this means that
the expert group within which I work would be unlikely to be competent to
create a machine-checked specification, because even if, say, I were to do
so, others would be unable to review it. Since it seems unlikely (even if
it were desirable) that standards-body work will be performed only by
people with a PhD in type theory, the reality is that the mechanisms that
exist today are inaccessible to us.

Now, I want to be clear that I believe the work done by e.g. the seL4
team and others is hugely important in getting us to a place where this
situation will change, since we are gradually building a set of "baseline"
proofs around common behaviours, but it is important to consider that:

   - The proofs themselves contain little documentation
   <https://github.com/seL4/l4v/blob/master/proof/invariant-abstract/ARM/ArchTcb_AI.thy>
   to explain how they fit together, which makes them very inaccessible for
   those wishing to learn from or reuse them. The linked proof is for
   operation of the ARM architecture TCB - this is something I understand
   quite well in "everyday life" terms, but it is completely opaque to me as a
   proof.
   - The sub-proofs are not structured in a way that fosters reuse. There
   are multiple proof tools, each with its own quirks and language, and
   porting proofs between them appears to be tricky.
   - Proof tools themselves are not user friendly.
   - The proofs are too far divorced from real implementation to make them
   helpful to implementers (who are usually the end customers of a
   specification).

For what it's worth, some of us (like myself) do create our own
Haskell/Ocaml/whatever models of behaviour for proposed new features, but
we use these to inform our checking process, rather than as input to the
standards process.
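
To give an idea of what such a model looks like, here is a minimal
sketch - a hypothetical feature, not one from any of our actual
standards - modelling a monotonic counter and using QuickCheck to
exercise the invariant the normative text would state in prose:

    import Test.QuickCheck

    -- Hypothetical behavioural model of a "monotonic counter"
    -- feature; purely illustrative.
    newtype Counter = Counter Integer deriving (Eq, Show)

    increment :: Counter -> Counter
    increment (Counter n) = Counter (n + 1)

    -- After any number of increments from zero, the value equals
    -- the number of steps taken (and so never decreases).
    prop_counts :: NonNegative Int -> Bool
    prop_counts (NonNegative steps) =
        iterate increment (Counter 0) !! steps
            == Counter (fromIntegral steps)

    main :: IO ()
    main = quickCheck prop_counts

Nothing in the standards process consumes code like this; its value is
that QuickCheck can exercise the proposed behaviour before the prose
is frozen.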

In practice, many standards bodies are addressing the inadequacies of the
"hand crafted, hand checked" specification model by attempting to create
(usually Open Source) reference implementations that serve as a normative
specification of system behaviour. For a number of reasons, I am not
convinced that this will be much more successful than the current approach,
but I do believe that it is indicative that the standards field is aware of
the issues it faces with increasingly complex specifications, and is
looking for solutions.

The challenge for the academic community is to demonstrate that there is a
better way which is reasonably accessible to practitioners. I would suggest
that a fairly good starting point of practical use (as opposed to academic
interest) is a well-documented, machine-checked specification for the C
programming language *with reasonably user-friendly tools to accompany it*
(this part could be the commercial spin-off), as C is, in practice, what
many programming languages and API specifications are built upon - at least
until Rust proves itself comprehensively superior for the task.
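
To gesture at what even a tiny fragment of such a specification might
look like, here is a sketch of my own (illustrative only; it rests on
the well-known C rule that unsigned arithmetic is reduced modulo 2^N):

    import Data.Bits (shiftL)

    -- An unsigned value of a given bit width.
    data UInt = UInt { width :: Int, value :: Integer }
      deriving (Eq, Show)

    -- C11 6.2.5: arithmetic on unsigned operands is reduced modulo
    -- 2^N.  One sentence of normative prose, now checkable.
    addU :: UInt -> UInt -> UInt
    addU (UInt w a) (UInt _ b) = UInt w ((a + b) `mod` (1 `shiftL` w))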

Best regards
Jeremy


On 20 October 2016 at 01:57, Richard A. O'Keefe <ok at cs.otago.ac.nz> wrote:

> TL;DR - Haskell mistaken for pseudo-code, a case study on machine-
> checked specification for use in standards can't or won't be read
> by the people who need it (who aren't the people in this mailing list).
>
> A couple of years ago, while trying to implement a programming language
> with >30 years of use and an actual ANSI standard, I encountered a gap
> in the standard where an aspect of arithmetic was referred to the
> Language Independent Arithmetic standard, which had in fact nothing
> whatsoever to say on the topic.  In consequence of this gap, existing
> implementations of this language implement that feature with different
> semantics.  Poking around, I found a smaller but similar hole in SQL,
> and similar issues in other languages.
>
> There was no existing specification that any of these could refer to.
> So I set out to write one.  Having seen other problems in standards
> caused by definitions that had not been adequately proof-read,
> I decided that I wanted a specification that had
>  - been type-checked
>  - had been tested reasonably thoroughly
>
> Since I'm writing in this mailing list, you can guess what I thought
> was a good way to do this: I wrote the specification in quite direct
> Haskell, putting effort into clarity at the expense of efficiency,
> and I used QuickCheck to test the specification.  I still don't know
> whether to be pleased that QuickCheck found mistakes -- demonstrating
> my point that specifications need to be checked thoroughly -- or
> ashamed that I'm still making such mistakes.
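>
> To give the flavour (a toy fragment in the same style, not the
> actual specification from my paper): floored integer division,
> specified from the mathematical definition, with QuickCheck asked
> to confirm that it agrees with Haskell's own floored `div`:
>
>     import Test.QuickCheck
>
>     -- Toy fragment; NOT the specification from the paper.
>     -- Floored division written from the mathematical definition
>     -- rather than borrowed from an implementation.
>     divFloor :: Integer -> Integer -> Integer
>     divFloor n d = floor (fromIntegral n / fromIntegral d :: Rational)
>
>     prop_matchesDiv :: Integer -> NonZero Integer -> Bool
>     prop_matchesDiv n (NonZero d) = divFloor n d == n `div` d
>
>     main :: IO ()
>     main = quickCheck prop_matchesDiv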
>
> My problem:  I can't get this published.
>
> The backhanded compliment:  the last reviewer excoriated me
> for having too much pseudocode in my paper.  (Despite the paper
> stating explicitly that ALL code in the paper was real code that
> had been executed.)  You got it:  Haskell doesn't look like a "real"
> programming language, but like something written for human
> comprehension during design.
>
> The dilemma: what I want to do is to tell people working
> on standards that we NEED to have machine-checked specifications
> and that we HAVE the technology to write such specifications and
> test them (oh and by the way here's this specification I wrote to
> fill that gap).  But people who read Haskell well enough to read
> my specification don't need to be persuaded of this, and in all
> honesty, could write the specification for themselves if it
> occurred to them.  Yet the people who do need to be told that there
> is a much better way to write standards than say the ECMAScript way
> don't read Haskell, and won't be interested in learning to do so
> until they've been persuaded...
>
> So where would _you_ send a case study on machine-checked specification?