learning to love laziness

John Hughes rjmh at cs.chalmers.se
Fri Sep 26 12:28:25 EDT 2003


On Fri, 26 Sep 2003, Derek Elkins wrote:

> On Thu, 25 Sep 2003 12:59:37 -0700
> Mark Tullsen <tullsen at galois.com> wrote:
>
> > Haskell has lazy/lifted products and not true products.
>
> Aren't lazy products true products?  What makes something a product is:
> fst (x,y) = x
> snd (x,y) = y
> for all x and y.  This holds with lazy products but not eager ones
> fst (3,_|_) = _|_ /= 3
>

Right, strict languages do not have true products either. A strict
language without side-effects, and in which no expression can loop
infinitely or cause an error (because these are side-effects of a sort),
COULD have true products.

The reason Haskell doesn't is that true categorical products have to
satisfy an additional law,

(fst x, snd x) = x

and Haskell's products don't, since

(_|_, _|_) /= _|_

The key question is: can we write a program which distinguishes these two
values? And we certainly can, for example

	seq (_|_,_|_) 0 =  0
	seq    _|_    0 = _|_

or

	case (_|_,_|_) of (_,_) -> 0	=  0
	case    _|_    of (_,_) -> 0	= _|_

But of course, *fst and snd* can't distinguish the two! So if we only
operated on products using fst and snd, then we could pretend that these
two values *were* equal, even though they're not in the implementation,
and thus say that Haskell had true products.

This leads some people to suggest that the behaviour of seq and pattern
matching be changed, so as not to distinguish these two values either.
A possible change would be to make both seq and case lazy for product
types, so that if x were a pair, then seq x y would NOT force the
evaluation of x, and case x of (a,b) -> e would behave as case x of ~(a,b)
-> e, i.e. x would be demanded only when one of a or b was needed.

This would be a good thing, in that Haskell would have a nicer algebra,
but a bad thing, in that it would conceal a difference which is really
present in the implementation, thus denying the programmer a degree of
control over what the implementation does. Namely, a programmer would no
longer be able to move the evaluation of a pair earlier than its
components are needed, by using a seq or a case (or in any other way).

Unfortunately, that control may sometimes be vitally important. Anyone who
has removed space leaks from Haskell programs knows the importance of
controlling evaluation order, so as, for example to drop pointers to
garbage earlier. Consider an expression such as

	if ...condition containing the last pointer
	      to a very large datastructure...
	then (a,b)
	else (c,d)

where a, b, c and d are perhaps expressions which *build* a very large
data-structure. Evaluating this expression early enables a lot of garbage
to be collected, but evaluating a component early allocates a lot of
memory before it is needed. Thus it's important to be able to force just
the pair structure, without forcing components, and this distinguishes _|_
from (_|_,_|_), however it is done.

Space debugging does, to quite a large extent, involve inserting seq in
judiciously chosen places. These places are hard to predict in advance
(before seeing the heap profile). If Haskell had true products, and it
turned out that the place needing a seq happened to have a pair type, then
that simple fix would not be applicable. The programmer would need to
refactor the program, replacing the pair by another type representing a
lifted product, before the space leak could be fixed. That might require
changing an arbitrarily large part of the program, which is clearly not
desirable.

So there are good arguments both for and against having "true products".
The debate has raged many times during the Haskell design process. In the
end, the argument that Haskell's designers should not make finding and
fixing space leaks any harder than it already is won, and I think that was
the right decision -- but it is a trade-off, and one must recognise that.

Incidentally, exactly the same problem arises for functions: Haskell does
not have "true functions" either, because _|_ and \x -> _|_ are not equal.

John Hughes


More information about the Haskell mailing list