[Haskell-cafe] Lazy IO and closing of file handles

Bertram Felgenhauer bertram.felgenhauer at googlemail.com
Thu Mar 15 07:51:56 EDT 2007


On 3/14/07, Pete Kazmier <pete-expires-20070513 at kazmier.com> wrote:
> When using readFile to process a large number of files, I am exceeding
> the resource limits for the maximum number of open file descriptors on
> my system.  How can I enhance my program to deal with this situation
> without making significant changes?

I made it work with 20k files with only minor modifications.

> > type Subject = String
> > data Email   = Email {from :: From, subject :: Subject} deriving Show

It has been pointed out that parseEmail would work better if it were
strict; the easiest way to accomplish this seems to be to replace the
above line by

data Email   = Email {from :: !From, subject :: !Subject} deriving Show
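
As a rough illustration of what the strictness annotations change, here is
a minimal sketch. It assumes From is a synonym for String (its definition
is in the snipped part of the program); LazyEmail, StrictEmail, lazyFrom,
lazySubject and throwsWhenForced are hypothetical names used only for this
comparison. Note that a bang forces a field only to weak head normal form,
which for a String means the first cons cell, not the whole string.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Assumption: From is a String synonym, like Subject in the original.
type From    = String
type Subject = String

-- Lazy fields: building the record does not evaluate from/subject.
data LazyEmail = LazyEmail { lazyFrom :: From, lazySubject :: Subject }

-- Strict fields: evaluating the constructor forces both fields to WHNF.
data StrictEmail = StrictEmail { from :: !From, subject :: !Subject }

-- Hypothetical helper: True if forcing the value to WHNF throws.
throwsWhenForced :: a -> IO Bool
throwsWhenForced x = do
  r <- try (evaluate x)
  return $ case r of
    Left e  -> const True (e :: SomeException)
    Right _ -> False
```

For instance, throwsWhenForced (LazyEmail undefined undefined) should give
False, while throwsWhenForced (StrictEmail undefined undefined) should give
True, because evaluating the strict constructor forces both fields.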

[snip]

> > fileContentsOfDirectory :: FilePath -> IO [String]
> > fileContentsOfDirectory dir =
> >     setCurrentDirectory  dir >>
> >     getDirectoryContents dir >>=
> >     filterM doesFileExist    >>=  -- ignore directories
> >     mapM readFile

And here's another culprit - readFile opens the file as soon as the
action runs, before any of its output is used, so mapM readFile holds
every file open at once. So I imported  System.IO.Unsafe  and replaced
the last line above by

    mapM (unsafeInterleaveIO . readFile)

With these two changes the program seems to work fine.
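
In isolation, the second change might look like the sketch below.
lazyReadFiles is a hypothetical name, not part of the original program;
the point is only that unsafeInterleaveIO postpones each readFile (and
hence the opening of the file) until the corresponding string is first
demanded.

```haskell
import System.IO.Unsafe (unsafeInterleaveIO)

-- Hypothetical helper: return one lazily read string per path.
-- Each readFile is deferred until its result is first demanded,
-- so no file is opened while the list itself is being built.
lazyReadFiles :: [FilePath] -> IO [String]
lazyReadFiles = mapM (unsafeInterleaveIO . readFile)
```

One consequence of deferring the open is that errors such as a missing
file are also deferred: they surface when the string is forced, not when
lazyReadFiles runs.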

HTH,

Bertram

