[Haskell-cafe] Space leaks

Ben Lippmeier Ben.Lippmeier at anu.edu.au
Tue Oct 5 22:27:35 EDT 2004


In my experience, any time your program makes use of some state-like 
structure which gets updated over a number of iterations, you're going 
to be in trouble with space leaks.

There's a library function which summarises this nicely,
mapAccumL :: (state -> x -> (state, y)) -> state -> [x] -> (state, [y])

The GHC library docs describe mapAccumL as

"The mapAccumL function behaves like a combination of map and foldl; it 
applies a function to each element of a list, passing an accumulating 
parameter from left to right, and returning a final value of this 
accumulator together with the new list."

Now imagine that state is some large structure. The [x] list is a couple 
of hundred elements long, and you print out some -> but not all <- of 
the [y] elements. While the [y] list remains live, a whole collection of 
  half evaluated intermediate states also remain live -  enter the 
massive space leak.


Projects I have written which suffered massive space leaks include:

-> A parallel lazy evaluator,
http://cs.anu.edu.au/ample.

The state:    the machine state, threads, stacks, the heaps.
The [x] list: machine instructions.
The [y] list: profiling info.

If you don't print out all possible profiling info then all those 
intermediate states remain live and you've got a massive space leak.


-> An astroids game I wrote for X.
The state:    position, velocities of ship, astroids, missiles.
The [x] list: ship control data, key codes.
The [y] list: a list of graphics prims which get rendered to the screen.

You would think that because the list of prims gets consumed by the 
renderer it wouldn't leak.. but it did, about 200k / sec. Then I changed 
some seemingly irrelevant part of the code and the space leak went 
away.. and I have no idea why. yay.


-> My ICFP contest entry for this year,
Not really a mapAccumL type problem, but still a space leak.

At one stage I had a bug in the parser for my assembly language where it 
didn't handle comments properly. One of the entries I ran through my 
simulator had comments at the end of lines with Drop statements, but 
nowhere else.

The simulator ran fine until an ant found some food, carried it back to 
the hill, then attempted to drop it.. parser error. My Haskell program 
hadn't fully evaluated the thunks to read / parse / assemble that line 
of code until the ant had tried to use that part of the program.. I 
laughed, and laughed.. :).

...
I think the only way to totally slay bugs like these is to use some 
deepSeq'esque function on all your intermediate states, or any time when 
you reckon some evaluation should be 'finished'. Either that or explicly 
define intermediate structures to be fully strict.

I see the GHC people have got a seqExpr function in 
compiler/coreSyn/CoreSyn.lhs, which I imagine would be applied to the 
expression tree after every compiler stage.

Ben.




Graham Klyne wrote:
> I've been starting to take note of discussion about space leaks in 
> Haskell.  So far, it's not a topic that's bothered me, other than 
> obvious programming errors, and I've never found the need to force 
> strict evaluation of my Haskell programs.  I find myself wondering why 
> this is.
> 
> Your comment about arithmetic expressions is telling:  the kind of 
> program I have been writing (parsers, symbolic processing, etc.) 
> performs almost no arithmetic.  (So far, I've used lists instead of 
> arrays, so a usual source of arithmetic functions is missing.)
> 
> I've also worked with datasets that fit in memory, so failure to 
> "stream" data hasn't been a problem.  I suspect that's the more 
> pernicious case for space leaks, since the causes aren't always so obvious.
> 
> Are there any guidelines or warning signs to look out for that may be 
> indicative of potential space-leak problems?  So far, I can think of:
> - arithmetic results
> - multiple uses of a large data value
> 



More information about the Haskell-Cafe mailing list