[Haskell-cafe] puzzling memory leak? (GHC)

Young Hyun youngh at sbcglobal.net
Mon Oct 10 01:46:15 EDT 2005


I'm seeing a very puzzling behavior that would be considered a memory  
leak, or unwanted retention of objects, in other languages.  If I've  
made a newbie mistake, could someone point out what I am doing wrong?

The process size (resident size and not just virtual size) of my  
program just grows and grows, passing a few hundred MB's in just 30  
seconds, until I'm forced to kill the process.  I get the same  
problem with GHC 6.4 and 6.4.1 on Solaris 2.8 and MacOS X 10.4.2, and  
GHC 6.2.2 on Solaris 2.8.

Here's what the program does.  It reads a binary file containing a  
large number of varying-length structured records.  Some of these  
records are corrupt (e.g., truncated), and it is the job of my  
program to write out a new file with corrupted records filtered out.   
I decided to take advantage of Haskell's lazy evaluation to separate  
out the task of input parsing from the task of filtering as follows:

import qualified Data.Map as Map

main = do
        contents <- getContents
        fixArts $ parseArts contents [] (Map.singleton "offset" 0)

The invoked functions have the following signatures:

parseArts :: [Char] -> [Char] -> HeaderMap -> [Either ParseError Arts]
fixArts :: [Either ParseError Arts] -> IO ()

where

type HeaderMap = Map.Map String Word

data Arts = Arts { ar_version::Int, ar_flags::Word, ar_data_length::Int,
                    ar_timestamp::Word, ar_src_ip::Word,  
ar_dst_ip::Word,
                    ar_list_id::Word, ar_cycle_id::Word,
                    ar_rtt::Word, ar_rtt_mod::Word,
                    ar_hop_distance::Int, ar_dst_replied::Bool,
                    ar_halt_reason::Int, ar_halt_data::Int,  
ar_reply_ttl::Int,
                    ar_hops::[ArtsHop], ar_offset::Int, ar_bytes:: 
[Char] }

data ArtsHop = ArtsHop { ar_hop_num::Int, ar_hop_addr::Word,
                          ar_irtt::Word, ar_num_tries::Int }

data ParseError = ParseError { pe_message::String, pe_offset::Int,
                                pe_bytes::[Char] }

Here's the body of fixArts:

fixArts ((Left x):xs) =
     do hPutChar stderr '\n'
        hPutStrLn stderr $ "Parse error: " ++ pe_message x
        hPutHexDump stderr bytes offset
        fixArts xs
        where bytes = pe_bytes x
              offset = pe_offset x

{-- XXX normally: do putStr (ar_bytes x); fixArts xs --}
fixArts ((Right x):xs) = fixArts xs

fixArts [] = return ()

--------------------------------

The function hPutHexDump simply prints out a hex dump of a [Char]  
(and commenting out the call makes no difference).

Unless I'm badly mistaken, nothing about fixArts suggests that it  
would leak memory.  So, it would appear parseArts is somehow to blame  
for the memory leak, but this is where it gets weird.  If I just  
slightly change fixArts (see below), then I no longer get a memory  
leak (the process size stays at just over 3MB):

fixArts ((Right x):xs) = do hPutArts stderr x
                             fixArts xs

The function

hPutArts :: Handle -> Arts -> IO ()

simply prints out the fields of the Arts object.  So why should this  
change fix the memory leak if parseArts is truly at fault?

The function parseArts has the following basic form (though the  
actual implementation is much more involved):

parseArts (x:xs) ... = (createArts x) : parseArts xs

So, even without seeing all of the code, does anyone have any clues  
about what may be wrong?

  --Young



More information about the Haskell-Cafe mailing list