[Haskell-cafe] Space leaks
Ben Lippmeier
Ben.Lippmeier at anu.edu.au
Tue Oct 5 22:27:35 EDT 2004
In my experience, any time your program makes use of some state-like
structure which gets updated over a number of iterations, you're going
to be in trouble with space leaks.
There's a library function which summarises this nicely,
mapAccumL :: (state -> x -> (state, y)) -> state -> [x] -> (state, [y])
The GHC library docs describe mapAccumL as
"The mapAccumL function behaves like a combination of map and foldl; it
applies a function to each element of a list, passing an accumulating
parameter from left to right, and returning a final value of this
accumulator together with the new list."
Now imagine that state is some large structure. The [x] list is a couple
of hundred elements long, and you print out some -> but not all <- of
the [y] elements. While the [y] list remains live, a whole collection of
half evaluated intermediate states also remain live - enter the
massive space leak.
Projects I have written which suffered massive space leaks include:
-> A parallel lazy evaluator,
http://cs.anu.edu.au/ample.
The state: the machine state, threads, stacks, the heaps.
The [x] list: machine instructions.
The [y] list: profiling info.
If you don't print out all possible profiling info then all those
intermediate states remain live and you've got a massive space leak.
-> An astroids game I wrote for X.
The state: position, velocities of ship, astroids, missiles.
The [x] list: ship control data, key codes.
The [y] list: a list of graphics prims which get rendered to the screen.
You would think that because the list of prims gets consumed by the
renderer it wouldn't leak.. but it did, about 200k / sec. Then I changed
some seemingly irrelevant part of the code and the space leak went
away.. and I have no idea why. yay.
-> My ICFP contest entry for this year,
Not really a mapAccumL type problem, but still a space leak.
At one stage I had a bug in the parser for my assembly language where it
didn't handle comments properly. One of the entries I ran through my
simulator had comments at the end of lines with Drop statements, but
nowhere else.
The simulator ran fine until an ant found some food, carried it back to
the hill, then attempted to drop it.. parser error. My Haskell program
hadn't fully evaluated the thunks to read / parse / assemble that line
of code until the ant had tried to use that part of the program.. I
laughed, and laughed.. :).
...
I think the only way to totally slay bugs like these is to use some
deepSeq'esque function on all your intermediate states, or any time when
you reckon some evaluation should be 'finished'. Either that or explicly
define intermediate structures to be fully strict.
I see the GHC people have got a seqExpr function in
compiler/coreSyn/CoreSyn.lhs, which I imagine would be applied to the
expression tree after every compiler stage.
Ben.
Graham Klyne wrote:
> I've been starting to take note of discussion about space leaks in
> Haskell. So far, it's not a topic that's bothered me, other than
> obvious programming errors, and I've never found the need to force
> strict evaluation of my Haskell programs. I find myself wondering why
> this is.
>
> Your comment about arithmetic expressions is telling: the kind of
> program I have been writing (parsers, symbolic processing, etc.)
> performs almost no arithmetic. (So far, I've used lists instead of
> arrays, so a usual source of arithmetic functions is missing.)
>
> I've also worked with datasets that fit in memory, so failure to
> "stream" data hasn't been a problem. I suspect that's the more
> pernicious case for space leaks, since the causes aren't always so obvious.
>
> Are there any guidelines or warning signs to look out for that may be
> indicative of potential space-leak problems? So far, I can think of:
> - arithmetic results
> - multiple uses of a large data value
>
More information about the Haskell-Cafe
mailing list