[Haskell-cafe] Space and time leaks
Ronald Guida
ronguida at mindspring.com
Wed Oct 3 23:47:27 EDT 2007
I need some help with space and time leaks.
I know of two types of space leak. The first type of leak occurs when
a function uses unnecessary stack or heap space.
GHCi> sum [1..10^6]
*** Exception: stack overflow
Apparently, the default definition for "sum" has a space leak.
I can define my own sum in terms of strict foldl ...
> sum' xs = foldl' (+) 0 xs
... and it doesn't overflow the stack.
GHCi> sum' [1..10^6]
500000500000
(0.27 secs, 112403416 bytes)
GHCi> sum' [1..10^7]
50000005000000
(2.73 secs, 1161223384 bytes)
GHCi> sum' [1..10^8]
5000000050000000
(27.83 secs, 11645261144 bytes)
I think there's still a space leak; I don't understand why GHCi using
10^8, 10^9, 10^10 bytes of memory for these calculations.
The other type of space leak I know of is when I have a chunk of data
in memory that I no longer need, such that the data is still being
referenced somewhere. Since, in theory, I might still access the
data, the garbage collector can't reclaim the memory. I'm not sure
how to construct an example though.
Regarding time leaks, I only know of one kind of leak. If I have a
calculation that accumulates data over time, and I don't ask for any
results until the end, then, due to laziness, that calculation might
accumulate a chain of unevaluated thunks. When I get the the end and
"demand" the final result, I have to wait for it while the RTS
evaluates a long chain of thunks.
The chain of unevaluated thunks is a space leak. The time leak occurs
because the capture process and the accumulate process are supposed to
be interleaved, such that I perform some computations after I capture
each piece of data. If I have to wait around at the end, then it
means the capture process and the accumulate process happened in
sequence.
As far as I know, every time leak has a companion space leak; it's not
possible to create a time leak without a space leak to go with it. Is
this really true?
Now for the hard questions.
1. How do I go about detecting space and time leaks?
2. Once I find a leak, how do I fix it?
3. Are there any programming techniques I can use to avoid leaks?
-- Ron
More information about the Haskell-Cafe
mailing list