[Haskell] Leak in Data.Generics (syb-, GHC 6.10.2)

Paul L ninegua at gmail.com
Sat May 30 01:14:16 EDT 2009

I traced a memory leak to the use of Data.Generics from the syb-
package in GHC 6.10.2, but I'm not sure whether the problem lies in
the syb package or in this version of GHC. Since I don't have a small
program that demonstrates the leak, I'll just explain what I did, in
the hope that other users can help identify the problem.

I have the following line as part of a CGI program (sample1):

  output $ writeHtmlString' $ wikiLinksTransform $ readMarkdown' $ ...

where output is from Network.CGI, and writeHtmlString' and
readMarkdown' are variations of functions from Pandoc.
wikiLinksTransform is a generic traversal (modified from a similar
function in Gitit) using Data.Generics over the Pandoc data structure.
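
For readers unfamiliar with this style of traversal, the following is
a minimal sketch of what such a function can look like. It is not the
actual wikiLinksTransform from the program above: the Inline and Block
types are toy stand-ins for Pandoc's definitions, and the rule that an
empty link target marks a wiki link is an assumption made purely for
illustration. It needs the syb package for Data.Generics.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
module Main where

import Data.Generics (Data, Typeable, everywhere, mkT)

-- Toy stand-ins for Pandoc's document types (hypothetical, much
-- smaller than the real Text.Pandoc.Definition).
data Inline
  = Str String
  | Link [Inline] String   -- link text and target URL
  deriving (Show, Eq, Data, Typeable)

data Block
  = Para [Inline]
  | BlockQuote [Block]
  deriving (Show, Eq, Data, Typeable)

-- Rewrite a single link. Assumption for this sketch: an empty target
-- marks a wiki link, which is pointed at a page named after its text.
wikiLink :: Inline -> Inline
wikiLink (Link txt "") = Link txt (concat [s | Str s <- txt] ++ ".html")
wikiLink other         = other

-- Apply wikiLink to every Inline found anywhere inside the document,
-- however deeply nested; all other node types are left unchanged.
wikiLinksTransform :: [Block] -> [Block]
wikiLinksTransform = everywhere (mkT wikiLink)

main :: IO ()
main = print (wikiLinksTransform doc)
  where
    doc = [BlockQuote [Para [Str "See", Link [Str "Home"] ""]]]
```

The point of the sketch is that everywhere (mkT f) visits every node
of the structure generically, so the traversal code never has to
mention Para or BlockQuote explicitly.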

I compiled my program with GHC 6.10.2 using -prof, ran it with
+RTS -hc -RTS, and plotted the memory usage over time. The memory
attributed to Text.Pandoc.Definition.CAF kept increasing, which is
wrong because my program never holds on to the result of the line
above. After some lengthy debugging, I narrowed it down to that line
and removed wikiLinksTransform (sample2):

  output $ writeHtmlString' $ readMarkdown' $ ...

Then I profiled the memory again, and the leak was gone!

To further investigate the problem, I compiled my program with GHC
6.8.3, and this time even sample1 had no leak. Note that GHC 6.8.3's
Data.Generics was from base-3.0 package, not syb. So I'm not sure if
it's GHC's problem or syb's.

I'm almost certain that Gitit users will have run into similar
problems, but I'm too overloaded to come up with a shorter program
for a proper bug report. So I'm posting here in the hope that somebody
can follow up and eventually get this bug fixed.

Paul Liu

Yale Haskell Group
