[Haskell-cafe] Lazy IO and closing of file handles

Donald Bruce Stewart dons at cse.unsw.edu.au
Thu Mar 15 08:47:41 EDT 2007

> >Not necessarily so, since you are making assumptions about the
> >timeliness of garbage collection. I was similarly sceptical of Claus'
> >suggestion:
> >
> >Claus Reinke:
> >>in order to keep the overall structure, one could move readFile backwards
> >>and parseEmail forwards in the pipeline, until the two meet. then make sure
> >>that parseEmail completely constructs the internal representation of each
> >>email, thereby keeping no implicit references to the external
> >>representation.
> you are quite right to be skeptical!-) indeed, in the latest Handle 
> documentation, we still find the following excuse for GHC:
> http://www.haskell.org/ghc/docs/latest/html/libraries/base/System-IO.html#t%3AHandle
>    GHC note: a Handle will be automatically closed when the garbage 
>    collector detects that it has become unreferenced by the program. 
>    However, relying on this behaviour is not generally recommended: the 
>    garbage collector is unpredictable. If possible, use an
>    explicit hClose to close Handles when they are no longer required. GHC 
>    does not currently attempt to free up file descriptors when they have 
>    run out, it is your responsibility to ensure that this doesn't happen. 
> this issue has been discussed in the past, and i consider it a bug if the
> memory manager tells me to handle memory myself;-) so i do hope that this
> infelicity will be removed in the future (run out of file descriptors -> run
> a garbage collection and try again, before giving up entirely).
> in fact, my local version had two variants of processFile - the one i posted
> and one with explicit file handle handling (the code was restructured this
> way exactly to hide this implementation decision in a single function). i did
> test both variants on a directory with lots of copies of a few emails (>2000
> files), and both worked on my system, so i hoped -rather than checked- that
> the handle collection issue had finally been fixed, and made the mistake of
> removing the more complex variant before posting. thanks for pointing out
> that error - as the documentation above demonstrates, it isn't good to rely
> on assumptions, nor on tests.
> so here is the alternate variant of processFile (for which i imported
> System.IO):
> >processFile path = do
> >  f <- openFile path ReadMode
> >  text <- hGetContents f
> >  let email = parseEmail text
> >  email `seq` hClose f
> >  return email
> all this hassle to expose a file handle to call hClose on, just so that the
> GC does not have to..

Are we at the point where we should consider adding some documentation on
how to deal with this issue? And are the recommendations either to use
strict IO (should we have a package for System.IO.Strict??) or to force
strictness in the consumer of the data?
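
For illustration, here is a minimal sketch of the strict IO option. Note
that `seq` in the quoted processFile only evaluates the parse result to
weak head normal form, so whether the whole file has been read before
hClose depends on how parseEmail builds its result. The sketch below
instead forces the full contents while the handle is still open.
readFileStrict is a hypothetical helper name, not an existing library
function, and parseEmail is only a placeholder for the parser discussed in
this thread. It also assumes a base library that provides withFile.

```haskell
import Control.Exception (evaluate)
import System.IO (IOMode (ReadMode), hGetContents, withFile)

-- Hypothetical helper (an assumption, not part of base): read the whole
-- file before returning, so that withFile closes the handle
-- deterministically instead of leaving it to the garbage collector.
readFileStrict :: FilePath -> IO String
readFileStrict path =
  withFile path ReadMode $ \h -> do
    text <- hGetContents h
    -- Walking the full list forces every character to be read while
    -- the handle is still open.
    _ <- evaluate (length text)
    return text

-- Placeholder for the parser from the thread; here it just splits the
-- message into lines.
parseEmail :: String -> [String]
parseEmail = lines

-- Same shape as the quoted processFile, but with no handle exposed to
-- the caller.
processFile :: FilePath -> IO [String]
processFile path = fmap parseEmail (readFileStrict path)

main :: IO ()
main = do
  -- Write a small sample message so the example is self-contained.
  writeFile "example.eml" "Subject: test\n\nbody\n"
  email <- processFile "example.eml"
  mapM_ putStrLn email
```

The trade-off is that the whole file is held in memory at once, which
seems acceptable for individual emails but gives up the streaming
behaviour of lazy IO.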

-- Don
