[Haskell-cafe] puzzling memory leak? (GHC)
youngh at sbcglobal.net
Tue Oct 11 02:34:12 EDT 2005
On Oct 9, 2005, at 11:32 PM, Ben Lippmeier wrote:
> This sounds like a text-book space-leak caused by lazy evaluation.
> In a lazy language, function evaluation is driven by the need to
> perform IO. Because the first version of your program never prints
> the parsed structure, the parser doesn't completely parse it. This
> implies that the system needs to hang onto all the input data (as
> well as all the partially evaluated calls to the parser) incase it
> needs it later on.
I naively believed (or hoped) that, so long as I the programmer took
some care, the evaluator would be smart enough to discard unaccessed
computations and values and thus avoid a space leak. Hence the thing
I worried about was explicit references to data structures (say, the
head of very long lists) that prevent objects from being reclaimed.
Now it seems I need to also worry about the hidden aspects of the
internal implementation of lazy evaluation.
What intuition should be my guide about the "proper way" of employing
lazy evaluation? It's not yet clear to me what you mean by "the
system needs to hang onto all the input data ...". It seems
counterintuitive for the evaluator to hang onto partially evaluated
functions and input data when it can be proven (through static
analysis) that references to that data can no longer exist in the
remainder of the program execution; for example, consider
case head $ drop 500000 $ parseArts ... of
Right x -> hPutArts stderr x
in which the first 500,000 elements in the result of parseArts can't
ever be referenced after 'drop' executes, and yet the space leak
still happens (presumably while trying to skip over those 500,000
elements). Have I understood you correctly that this is the
unfortunate behavior of lazy evaluation?
> The default string representation is Haskell is pretty abysmal, and
> having it use hundreds of megs to represent, say a 10 meg file is
> not too surprising.
Each file I process is 70MB, so inefficient representation could be a
problem if input data is buffered.
> My advice is that if you don't want to fool around with it, just
> swallow hard, then change fixArts to do a hPutArts to /dev/null.
> Either that or
> 1) go read about the DeepSeq library.
> 2) add strictness annotations to your structure definition.
Thanks for the advice; I'll look into both.
More information about the Haskell-Cafe