[Haskell-cafe] How is laziness defined?

Mon Feb 5 01:14:52 EST 2007

> I would think that with 100% laziness, nothing would happen until the
> Haskell program needed to output data to, e.g. the console.

In many cases, that's exactly what it's like. 

> Quite obviously that's not it. So how is laziness defined in Haskell?

In fact, Haskell is not defined as lazy, it is defined as non-strict.
>From what I understand, non-strict semantics are (very informally) about
choosing an order of evaluation that, where possible, avoids
non-termination or error. Laziness is about evaluating only what's
needed. In any case, I think all of the mainstream Haskell compilers and
interpreters implement the non-strict semantics by means of lazy
evaluation, so unless you're working on a radical new Haskell
implementation, you probably don't need to worry about the difference.
There's a wiki stub about this, though it doesn't say much more than
what I've just said:

http://haskell.org/haskellwiki/Lazy_vs._non-strict

> I remember vaguely someone saying that pattern matching on a value
> forces it to be evaluated. Is that right? What else?

Yes, but you need to be more precise about when this pattern matching
occurs, what you mean by "it", and the extent to which "it" is
evaluated. See below about "weak head normal form".

> This is one of the things that just boggles my mind everytime I try to
> wrap it around this thing called Haskell ;)

If it's any comfort, you're not alone. It takes some discipline to
forget your previous notions of computation, and to start thinking
lazily.

Get familiar with the notion of the "thunk": a record which the
implementation stores on the heap in place of the result of some
computation which has not yet been performed. A thunk typically records
a reference to a function and some arguments, but remember that the
arguments (and indeed, the function) in all likelihood haven't been
evaluated yet, so they might well be thunks too. A thunk sits on the
heap until either the garbage collector decides it's no longer needed
(in which case the computation is never performed), or until the
implementation finds it needs the value (or part of it).

Also get familiar with "weak head normal form" (WHNF). I'm not 100% on
this, so hopefully someone will correct me if I'm wrong. Let's say the
implementation gets to a point where it needs to perform a pattern match
on a value. So far, the value is just an unevaluated thunk on the heap.
To perform the pattern match, the implementation needs to evaluate that
thunk to its top-level data constructor. It doesn't need to go any
further than that just yet (unless the pattern also includes a
sub-pattern inside the top-level data constructor). So a value that is
evaluated to its top-level data constructor is said to be in "weak head
normal form". The result of the pattern match (aside from allowing the
implementation to choose between alternative patterns based on the data
constructor) is typically that one or more variables are bound to values
inside the data constructor. Of course, those values are most likely
unevaluated thunks, at least until further pattern matching is
necessary.

Hopefully, that's clear enough to be of some use.