[Haskell-beginners] laziness and optimization

Brent Yorgey byorgey at seas.upenn.edu
Sat Mar 21 15:30:03 EDT 2009


On Sat, Mar 21, 2009 at 06:02:58AM -0700, Michael Mossey wrote:
>
> So I'm wondering to what extent the haskell compiler recognizes 
> computations it's done before. In a purely functional language this should 
> be pretty easy, right? If it sees the same expression, it knows it will 
> have the same value. That's my understanding, so far.

It really doesn't; this isn't as easy in a purely functional language
as you think.  For one thing, how to recognize which computations have
been done before?  The runtime would have to keep around some sort of
hash mapping expressions to values, and incur a lot of overhead
checking each thing to be evaluated against this map. Furthermore,
this may be very expensive memory-wise.  Consider the the case where
some expression generates a very long (say, million-element) list, but
it is immediately processed by some other code which counts how many
threes the list contains.  This can be done in constant memory, since
the garbage collector can come along behind and clean up the processed
parts of the list even while it is still being generated.  But if we
need to keep the list around just in case the expression that
generated it is used again, we will need to keep a million-element
list in memory!  So sometimes, memoizing expression values like this
is a pessimization, and it's very difficult to tell the difference.

This is a tricky problem; I've run into pretty much the exact same
problem (the need to cache computed sizes for recursively constructed
elements) in my 'diagrams' library.  Probably the easiest solution
(which I haven't actually yet implemented in my library, but plan to
do something along these lines!) is just to cache the size like you
would in an imperative language.  For example:
  
    data Thingy = Composite [Sized Thingy]
                | Primitive String

    data Sized a = S { size :: Int, thing :: a }

    thingySize :: Thingy -> Int
    thingySize (Primitive s) = length s
    thingySize (Composite sts) = sum (map size sts)

    mkSizedThingy :: Thingy -> Sized Thingy
    mkSizedThingy t = S (thingySize t) t

This has exactly the property you want: due to laziness, the size of a
SizedThingy will be computed *only* if it is ever needed; and it will
be computed at most once.

-Brent


More information about the Beginners mailing list