[Haskell-cafe] On the purity of Haskell

Sun Jan 1 22:31:51 CET 2012

On Sun, Jan 1, 2012 at 3:26 PM, Ketil Malde <ketil at malde.org> wrote:
> Chris Smith <cdsmith at gmail.com> writes:
>
>>> I wonder: can writing to memory be called a “computational effect”? If
>>> yes, then every computation is impure.
>
> I wonder if not the important bit is that pure computations are unaffected by
> other computations (and not whether they can affect other computations).
> Many pure computations have side effects (increases temperature,
> modifies hardware registers and memory, etc), but their effect can only
> be observed in IO.
>
> (E.g. Debug.Trace.trace provides a non-IO interface by use of
> unsafePerformIO which is often considered cheating, but in
> this view it really is pure.)

The point of purity (and related concepts) is to have useful tools for
working with and reasoning about your code. Lambda calculi can be seen
as the prototypical functional languages, and the standard ones have
the following nice property:

  The only difference between reduction strategies is termination.

Non-strict strategies will terminate for more expressions than strict
ones, but that is the only difference.

This property is often taken to be the nub of what it means to be a
pure functional language. If the language is an extension of the
lambda calculus that preserves this property, then it is a pure
functional language. Haskell with the 'unsafe' stuff removed is such a
language by this definition, and most GHC additions are too, depending
on how you want to argue. It is even true with respect to the output
of programs, but not when you're using Debug.Trace, because:

    flip (+) ("foo" `trace` 1) ("bar" `trace` 1)

will print different things with different evaluation orders.

A similar property is referential transparency, which allows you to
factor your program however you want without changing its denotation.
So:

    (\x -> x + x) e

is the same as:

    e + e

This actually fails for strict evaluation strategies unless you relax
it to be 'same denotation up to bottoms' or some such. But not having
to worry about whether you're changing the definedness of your
programs by factoring/inlining is actually a useful property that
strict languages lack.

Also, the embedded IO language does not have this property.

    do x <- m ; f x x

is different from

    do x <- m ; y <- m ; f x y

and so on. This is why you shouldn't write your whole program with IO
functions; it lacks nice properties for working with your code. But
the embedded IO language lacking this property should not be confused
with Haskell lacking this property.

Anyhow, here's my point: these properties can be grounded in useful
features of the language. However, for the vast majority of people,
being able to factor their code differently and have it appear exactly
the same to someone with a memory sniffer isn't useful. And unless
you're doing serious crypto or something, there is no correct amount
of heat for a program to produce. So if we're wondering about whether
we should define purity or referential transparency to incorporate
these things, the answer is an easy, "no." We throw out the
possibility that 'e + e' may do more work than '(\x -> x + x) e' for
the same reason (indeed, we often want to factor it so that it
performs better, while still being confident that it is in some sense
the same program, despite the fact that performing better might by
some definitions mean that it isn't the same program).

But the stuff that shows up on stdout/stderr typically is something we
care about, so it's sensible to include that. If you don't care what
happens there, go ahead and use Debug.Trace. It's pure/referentially
transparent modulo stuff you don't care about.

-- Dan