[Haskell] Definitions of purity and Lazy IO

Thu Mar 5 23:11:21 EST 2009

As Amr Sabry aptly observed more than a decade ago discussions of
purity and referential transparency usually lead to confusion and
disagreement. His JFP paper provided the needed rigor and argued that
Haskell even _with regular file (stream) IO_ is pure. As was shown
yesterday, with Lazy IO, Haskell is not pure.

Before we discuss definitions, let us note the motivations. Suppose I
have a Haskell program. As every single (compiled) Haskell program it
has the main function, of the type IO a for some a. It must use IO, at
least to print the final result, the observation of the
program. Suppose the program contains the expression "e1 + e2" adding
two integer expressions. Can I replace this expression with "e2 + e1"
and be sure the observations of my program are unaffected?

If one says that ``you should assume that every IO operation returns a
random result''  then we have no ability to reason about any
real Haskell program. We can't be sure of soundness of any compiler
optimization. So, Haskell is far worse than C!? With C, Xavier Leroy
has managed to create a provably correct optimizing compiler, and
mechanically _prove_ the soundness of all optimizations. A C program,
just like a Haskell program, must use IO at least to print the
final result. We can never hope to build a provably correct compiler for
Haskell? We cannot prove observational equivalences of any real
Haskell program? Isn't that sad?

The formal question is thus: are the following two expressions f1 and f2,
of a *pure type* Int->Int->Int 

f1, f2:: Int -> Int -> Int
f1 = \e1 e2 -> case (e1,e2) of
		(1,_) -> e1 - e2
		(_,_) -> e1 - e2

f2 = \e1 e2 -> case (e1,e2) of
		(_,1) -> e1 - e2
		(_,_) -> e1 - e2

equal? Before one invokes an equational theory or says that both these
expressions are just integer subtraction, let me clarify the
question: are f1 and f2 at least weakly observationally equivalent?
That is, for any program context C[] such that C[f1] is well-typed, 
the program C[f2] must too be well-typed, and if one can observe the
result of C[f1] and of C[f2], the two observations must be identical. The
observation of a program may (and often does) involve side-effects and
IO (more on it below). The message posted yesterday exhibited a context C[] that
distinguishes f1 from f2. Thus, in presence of Lazy IO, any equational
theory that equates f1 and f2 cannot be sound. I don't think one can
design such a context C[] using the regular, eager file IO.

Amr Sabry in a paper ``What is a Purely Functional Language?''
J. Functional Programming, 8(1), 1-22, Jan. 1998. 
http://www.cs.indiana.edu/~sabry/papers/purelyFunctional.ps

proposed the definition of purity, see p 2 and Definition 4.7. The
definition essentially states that call-by-name, call-by-value and
call-by-need evaluation functions must be weakly equivalent. These
evaluation functions map programs to observables. The part of
evaluation dealing with observing the answers may have side effects!
Sec 5.1 rigorously shows that lambda-calculus with global state
and destructive mutation may be called pure functional, if
effects are regarded as values and the program is written in what we
now call a monadic style (whose idea was proposed by Reynolds back in
1981 when studying Idealized Algol). The end of Sec 5.1 remarks that
IO can be treated that way -- and in fact, Appendix D of Haskell
report 1.2 indeed had described such a semantics of IO:
	http://haskell.org/definition/haskell-report-1.2.ps.gz
(the appendix starts on p. 139). Thanks to Paul Hudak for pointing this
out three years ago.

Thus Haskell even with IO may be called pure functional. With Lazy IO,
it can no longer be called pure functional as different orders of
evaluation of arguments of a function lead to different program
observations.