hGetContents and laziness in file io
Thomas Hallgren
hallgren@cse.ogi.edu
Mon, 23 Jul 2001 23:39:17 -0700
Hi,
My guess is that there is a space leak in your program. Both convert and
parseAll hold a reference (the variable ulf) to the contents of the
input file, and those references will probably not be released until the
functions return (unless you use a compiler that is clever enough to
drop references after their last use...). There might also be sources
of space leaks in the function parse that is called from parseAll.
If your program only processes one file each time you run it, you could
structure it like this:
main = interact parseAll'
  where
    parseAll' :: String -> String
    parseAll' = unlines . map convert' . parse'

    parse' :: String -> [Tree]
    convert' :: Tree -> String

    parse' s =
      case parseOneTree s of
        Good (tree,rest) -> tree:parse' rest
        Error err -> error err

    convert' tree = ...
Note that parse' is lazy: it returns the first tree before it tries to
parse the rest of the input.
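
To make the structure above concrete, here is a minimal, self-contained
sketch that compiles and runs. Everything the outline leaves open is
filled in with hypothetical stand-ins, so treat it as an illustration
rather than your actual parser: Tree is a one-constructor type, Result
mirrors the Good/Error constructors from your code, parseOneTree treats
each input line as one tree, and convert' just reverses the text. The
extra equation parse' "" = [] is also an assumption, added so that the
list ends at end of input.

```haskell
module Main (main) where

-- Hypothetical stand-ins: the outline defines neither Tree, the result
-- type, nor parseOneTree.
newtype Tree = Leaf String

data Result a = Good a | Error String

-- Hypothetical one-tree parser: each line of input is one tree, and a
-- blank line is reported as a parse error.
parseOneTree :: String -> Result (Tree, String)
parseOneTree s =
  case break (== '\n') s of
    ("", _)      -> Error "empty tree"
    (line, rest) -> Good (Leaf line, drop 1 rest)

-- Lazy: the first tree is returned before the rest of the input is
-- parsed, so trees already consumed can be garbage collected.
parse' :: String -> [Tree]
parse' "" = []  -- assumption: stop at end of input
parse' s =
  case parseOneTree s of
    Good (tree, rest) -> tree : parse' rest
    Error err         -> error err

-- Hypothetical conversion: reverse the text of the tree.
convert' :: Tree -> String
convert' (Leaf s) = reverse s

parseAll' :: String -> String
parseAll' = unlines . map convert' . parse'

-- interact feeds stdin lazily to parseAll' and writes the result to
-- stdout, so no variable holds on to the whole file contents.
main :: IO ()
main = interact parseAll'
```

If this is saved as, say, Convert.hs and compiled, it can be run as
./Convert < input > output. Since nothing outside parseAll' refers to
the input string, the part of it that has already been converted should
become garbage as the output is written.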
Anyway, space leaks can be hard to find and eliminate, but there are
tools that can help. The Haskell compiler Nhc98
(http://www.cs.york.ac.uk/fp/nhc98/) tries to generate space-efficient
code to begin with, but also provides heap profiling to help you find
out what kind of data is occupying all the space (constructor profile),
which functions produced the data (producer profile), which functions
have references to the data (retainer profile), ...
Hope this helps!
Thomas Hallgren
------------------------------------------------------------------------
Hal Daume wrote:
>... the file that I'm working with is ~20mb of trees. When I
>run my program on this, it is unable to reclaim space (unless i set the
>heap really high). ...
>
>convert inF outF = do inH <- openFile inF ReadMode
>                      ulf <- hGetContents inH
>                      outH <- openFile outF WriteMode
>                      parseAll outH ulf
>                      hClose inH
>                      hClose outH
>
>parseAll outH ulf =
>  case parse s of
>    Good (tree, rest) -> case convert tree of
>                           Good s'   -> do hPutStrLn outFile s'
>                           Error err -> do putStrLn err
>    Error err         -> do return ()
>
>
>PLEASE help!
>