[Haskell-beginners] space leak processing multiple compressed files
Ian Knopke
ian.knopke at gmail.com
Tue Sep 4 12:00:48 CEST 2012
Hi everyone,
I have a collection of bzipped files. Each file has a different number
of items per line, with a separator between them. What I want to do is
count the items in each file. I'm trying to read the files lazily but
I seem to be running out of memory. I'm assuming I'm holding onto
resources longer than I need to. Does anyone have any advice on how to
improve this?
Here's the basic program, slightly sanitized:
import Data.List (foldl')
import qualified Data.ByteString.Lazy.Char8 as B
import qualified Codec.Compression.BZip as Z   -- from the bzlib package

main :: IO ()
main = do
    -- get a list of file names
    filelist <- getFileList "testsetdir"

    -- read and decompress each file (lazily)
    files <- mapM (\x -> do
                       thisfile <- B.readFile x
                       return (Z.decompress thisfile)) filelist

    display $ processEntries files
    putStrLn "finished"
-- processEntries is defined elsewhere, but basically does some string
-- processing per line, counts the number of resulting elements,
-- and sums them per file:
processEntries :: [B.ByteString] -> Int
processEntries xs = foldl' (\acc file -> acc + countItems file) 0 xs
  where
    -- count the separator-delimited items on every line of one file
    -- (',' stands in for the real separator here)
    countItems = sum . map (length . B.split ',') . B.lines
-- display a field that returns a number
display :: Int -> IO ()
display = putStrLn . show
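A likely cause of the leak is that the `mapM` builds a list holding every file's (lazily decompressed) contents, and nothing is forced until `processEntries` runs at the very end, so all files stay alive at once. A sketch of one alternative, not from the original post, is to fold over the file list and force each file's count before touching the next file (`Z.decompress` is omitted here to keep the sketch self-contained; the real program would apply it to `contents` first):

```haskell
import Control.Monad (foldM)
import qualified Data.ByteString.Lazy.Char8 as B

-- Count the lines of one file and force the total before returning,
-- so nothing keeps the file's contents alive afterwards.
-- (In the real program: length (B.lines (Z.decompress contents)).)
countFile :: FilePath -> IO Int
countFile path = do
    contents <- B.readFile path
    return $! length (B.lines contents)

-- Visit the files one at a time, forcing the running total at each
-- step instead of accumulating a list of lazy ByteStrings.
countAll :: [FilePath] -> IO Int
countAll = foldM (\acc p -> do
                      n <- countFile p
                      return $! acc + n) 0
```

Because `return $!` forces each per-file count (and the running sum) strictly, each file can be garbage-collected before the next one is opened, so peak memory stays roughly one file's worth.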