[Haskell-cafe] Lazy IO and closing of file handles

Matthew Brecknell haskell at brecknell.org
Wed Mar 14 20:08:17 EDT 2007


Pete Kazmier:
> When using readFile to process a large number of files, I am exceeding
> the resource limits for the maximum number of open file descriptors on
> my system.  How can I enhance my program to deal with this situation
> without making significant changes?

AFAIU, file handles opened by readFile are closed in the following
circumstances:

1) When lazy evaluation of the returned contents reaches the end of the
file.

2) When the garbage collector runs the finaliser for the file handle.
Obviously, for this to happen, the handle must be unreachable.
Unfortunately, the unreachability of the handle doesn't guarantee
anything about the timeliness of the garbage collection. While the
garbage collector does respond to memory utilisation pressure, it
doesn't respond to file handle utilisation pressure.

Consequently, any program which uses readFile to read small portions of
many files is likely to exhibit the problem you are experiencing. I'm
not aware of an easy fix.
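
To make the failure mode concrete, here is a minimal sketch (the file
names are made up for illustration). Only the first line of each file
is ever demanded, so lazy evaluation never reaches end-of-file, and
every handle stays open until the garbage collector happens to run its
finaliser:

  import Control.Monad (forM)

  main :: IO ()
  main = do
    -- Hypothetical file names, purely for illustration.
    let files = [ "data/file" ++ show n ++ ".txt" | n <- [1 .. 2000 :: Int] ]
    -- readFile opens each handle immediately, but the contents are
    -- only consumed up to the first newline, so EOF is never reached
    -- and none of the handles is closed promptly.
    headers <- forM files $ \f -> do
      contents <- readFile f
      return (takeWhile (/= '\n') contents)
    mapM_ putStrLn headers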

You could use openFile, hGetContents and hClose, but then you have to be
careful to avoid another problem, as described in [1]. In [2], Oleg
describes the deeper problems with getContents and friends (including
readFile), and advocates explicitly sequenced I/O. I have a feeling
there have been even more discussions around this topic recently, but
they elude me at the moment.
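
For what it's worth, here is one hedged sketch of the explicit
approach (the helper name is my own invention): open the handle
yourself, force the portion of the contents you actually need, and
close the handle before moving on to the next file. Forcing before
hClose is what avoids the truncation problem described in [1]:

  import Control.Exception (evaluate)
  import System.IO (IOMode(ReadMode), hClose, hGetContents, openFile)

  -- Read just the first line of a file, closing the handle promptly.
  -- The name readFirstLine is made up for this example.
  readFirstLine :: FilePath -> IO String
  readFirstLine path = do
    h <- openFile path ReadMode
    contents <- hGetContents h
    let firstLine = takeWhile (/= '\n') contents
    -- Force the whole line while the handle is still open; hClose
    -- discards anything that hasn't been read yet.
    _ <- evaluate (length firstLine)
    hClose h
    return firstLine

Used with mapM over a long list of files, something like this keeps at
most one descriptor open at a time.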

Of course, we'll be most curious to hear which solution you choose.

[1] http://www.haskell.org/pipermail/haskell-cafe/2007-March/023189.html
[2] http://www.haskell.org/pipermail/haskell-cafe/2007-March/023073.html
