[Haskell-cafe] foldl and space problems

Bernard Pope bjpop at cs.mu.OZ.AU
Mon Jun 6 23:17:22 EDT 2005


On Mon, 2005-06-06 at 13:15 +0200, Gracjan Polak wrote:
> Hello,
> 
> My space problems continued...
> 
> I have foldl that produces list, some combining function and quite large 
> source list:
> 
> let xyz = foldl f state myBigList
> 
> This setting should lazyli consume myBigList when next elements of xyz 
> are demanded. Except that it seems that myBigList is held by state to 
> the end of computation :(
> 
> Question: is there any way to see what is holding my source list? I did 
> try to guess, but without results as of now:(

foldl suffers from a fairly notorious space leak when used under lazy
evaluation.

Here is foldl:

   foldl f acc [] = acc
   foldl f acc (x:xs)
      = foldl f (f acc x) xs

Here is a "symbolic" computation using it:

foo = foldl g init [a,b,c]
    = foldl g (g init a) [b,c]
    = foldl g (g (g init a) b) [c]
    = foldl g (g (g (g init a) b) c) []
    = (g (g (g init a) b) c)

Notice that the "accumulator" argument grows with size proportional to
the amount of list consumed.

I would guess that your program is suffering from this problem.

The solution?

One theoretical solution is to avoid lazy evaluation in the language
implementation. For instance an "optimistic" evaluator might avoid the
space leak. GHC has an experimental branch that supports this, but as
far as I know it has not seen an official release.

A more practical solution is to force the compiler to generate more
strict code. 

Data.List provides a function called foldl' which has the same type as
foldl, but has different strictness. In particular it forces the
accumulator argument to be "evaluated" before each recursive call to
foldl. 

Unfortunately foldl' is not always as strict as you want, because it
only forces the accumulator to be evaluated to what is called Weak Head
Normal Form. If your accumulated value (state) has a lazy data
constructor, such as the tuple constructor, you might find that the
space usage remains very high. Exercise for the reader: why is this so?

The solution in that case might be to add strictness flags to the
arguments of the state constructors, though this may have adverse
effects elsewhere in the program.

> How do I debug and/or reason about such situation?

Very good question. One solution is to practice your term re-writing
skills and try to reason about the size of the intermediate terms that
are generated.

You might also find GHood useful:

http://www.cs.kent.ac.uk/people/staff/cr3/toolbox/haskell/GHood/

Cheers,
Bernie.



More information about the Haskell-Cafe mailing list