[Haskell-cafe] How can we detect and fix memory leak due to lazyness?

Chris Kuklewicz haskell at list.mightyreason.com
Tue Aug 8 04:52:12 EDT 2006


Ahn, Ki Yung wrote:
> Recently, I'm facing the dark side of laziness
> -- the memory leak because of laziness.
> 
> Typical pattern that I encounter the problem is like this.
> 
> My code was working fine and I was happy.
> I just wanted to inspect some properties of my code so
> I made a slight chage go the code such as adding counter
> argument or attaching auxiliary data filed to original data for
> tracing how the data has been constructed.
> All of a sudden the program runs out of memory or overflows
> the stack.
> 
> One problem is that it comes up unexpectedly. Another even
> worse problem is that sometimes I get no idea for the exact
> location causing the leak!
> 
> It really panics facing such darkness of lazy evaluation.
> Just a small innocent looking fix for inspection or tracing
> blow things up, sometime with no clue for its reason.
> 
> When we implement a debugging or tracing option in the
> software and let the user toggle those features, how could
> we be sure that turning on those features won't crash the
> software written in Haskell?
> 
> Are there standardized approaches for detecting and fixing
> these kind of problems?
> 
> Haskell may be type safe but not safe at all from unexpanded
> diversion, which is not because of the programmers' mistake
> but just because of the laziness.
> 
> 
> I have posted an wiki article including one example of adding
> a counter to count the number of basic operations in sorting algorithm.
> 
> http://www.haskell.org/haskellwiki/Physical_equality
> 
> This was a rather simple situation and we figured out how to
> cure this by self equality check ( x==x ) forcing evaluation.
> 
> 
> 
> There are worse cases not being able to figure out the cure.
> I wrote a function for analyzing some property of a graph,
> which worked fine.
> 
> fixOnBy t p f x = if t x' `p` t x then x else fixOnBy t p f x' where x' 
> = f x
> 
> fixSize f x = fixOnBy Set.size (==) f x
> 
> sctAnal gs = null cgs || all (not . null) dcs
> where
>   gs' = fixSize compose $ Set.fromList [(x,y,cs) | To _ x y 
> cs<-Set.toList gs]
>   cgs = [z | z@(x,y,cs)<-Set.toList gs', x==y]
>   dcs = [ [c| c@(a,D,b)<-Set.toList cs , a==b] | (_,_,cs)<-cgs]
>   compose gs = trace ("## "++show (Set.size gs)) $ foldr Set.insert gs $ do
>     (x1,y1,cs1) <- Set.toList gs
>     (_,y2,cs2)  <-  takeWhileFst y1 $ Set.toList $ setGT
> (y1,Al""(-1),Set.empty) gs
>     return (x1,y2,cs1 `comp` cs2)
>   takeWhileFst y = takeWhile (\(y',_,_) -> y==y')
> 
> This function makes a transitive closure of the given set of relations
> by fixpoint iteration on the size of the set of weighted edges.
> 
> Sample output is like this.
> 
> *Main> main
> ## 170
> ## 400
> ## 1167
> ## 2249
> ## 2314
> False
> 
> 
> When I add an extra data field for tracing how the new relation was
> constructed, (e.g. tag [a,b,c] on a->c if it came from a->b and b->c)
> it suddenly overflows the stack even before printing out the trace.

I find that overflow a bit odd.  What is the ghc command line?  Are you using 
optimization flags?

> The following is the code that leaks memory.
> 
> sctAnal gs = null cgs || all (not . null) dcs
> where
>   gs' = fixSize compose $ Set.fromList [TT (x,y,cs) [] | To _ x y
> cs<-Set.toList gs]
>   cgs = [z | z@(TT (x,y,cs) _)<-Set.toList gs', x==y]
>   dcs = [[c| c@(a,D,b)<-Set.toList cs , a==b] | TT (_,_,cs) _<-cgs]
>   compose gs = trace ("## "++show (Set.size gs)) $ foldr checkInsert gs 
> $ do
>     TT (x1,y1,cs1) l1 <- Set.toList gs
>     TT (_,y2,cs2) l2 <- takeWhileTTfrom y1 . Set.toList $ setGT (TT
> (y1,Al""(-1),Set.empty) []) gs
>     return $ TT (x1,y2,cs1 `comp` cs2) (l1++y1:l2)
>   takeWhileTTfrom y = takeWhile (\(TT (y',_,_) _) -> y==y')
>   checkInsert x s
>                   | Set.member x s = s
>                   | otherwise      = Set.insert x s
> 
> data TT a b = TT a b deriving (Show)
> instance (Eq a, Eq b) => Eq (TT a b) where
>  (TT x lx) == (TT y ly) = lx==lx && ly==ly && x == y
> instance (Ord a, Ord b) => Ord (TT a b) where
>  (TT x lx) < (TT y ly) = lx==lx && ly==ly && x < y
> 

Tracing by eye:

sctAnal gc => null cgs => Set.toList gs' =>

let long = (Set.fromList [TT (x,y,cs) [] | To _ x y cs<-Set.toList gs])
in fixOnBy Set.size (==) compose (long) =>

if (Set.size (compose long)) == (Set.size long) then long else (compose long) =>

Set.size (compose long) => compose long =>

trace ("##"++show (Set.size long)) <rest> => Set.size long => long =>

Set.fromList [TT (x,y,cs) [] | To _ x y cs<-Set.toList gs] => Set.toList gs

Which does not look like it will blow stack space.  So I cannot see why the 
tracing function does not get to print the size.

I would try to simplify the string the trace function prints into a literal 
instead of a calculation on the size.  Then I would add many many more (trace 
"literal" $) functions to the code until I get some that print before it 
crashes.  But I suspect you have done most of that.


More information about the Haskell-Cafe mailing list