[Haskell-cafe] Re: Lazy IO and closing of file handles
Donald Bruce Stewart
dons at cse.unsw.edu.au
Wed Mar 14 19:15:43 EDT 2007
> dons at cse.unsw.edu.au (Donald Bruce Stewart) writes:
> > pete-expires-20070513:
> >> When using readFile to process a large number of files, I am exceeding
> >> the resource limits for the maximum number of open file descriptors on
> >> my system. How can I enhance my program to deal with this situation
> >> without making significant changes?
> > Read in data strictly, and there are two obvious ways to do that:
> > -- Via strings:
> > readFileStrict f = do
> >     s <- readFile f
> >     length s `seq` return s
> > -- Via ByteStrings
> > readFileStrict = Data.ByteString.readFile
> > readFileStrictString = liftM Data.ByteString.Char8.unpack . Data.ByteString.readFile
> > If you're reading more than say, 100k of data, I'd use strict
> > ByteStrings without hesitation. More than 10M, and I'd use lazy
> > bytestrings.
> Correct me if I'm wrong, but isn't this exactly what I wanted to
> avoid? Reading the entire file into memory? In my previous email, I
> was trying to state that I wanted to lazily read the file because some
> of the files are quite large and there is no reason to read beyond the
> small set of headers. If I read the entire file into memory, this
> design goal is no longer met.
> Nevertheless, I was benchmarking with ByteStrings (both lazy and
> strict), and in both cases, the ByteString versions of readFile yield
> the same error regarding max open files. Incidentally, the lazy
> bytestring version of my program was by far the fastest and used the
> least amount of memory, but it still crapped out regarding max open
> files.
> So I'm back to square one. Any other ideas?
Hmm. Ok. So we need to have more hClose's happen somehow. Can you
process files one at a time?
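
If one at a time works for you, something along these lines might do it. This is only a rough sketch: I'm assuming the headers are the first n lines of each file, and hGetLines, readHeaders and readAllHeaders are names made up for illustration, not library functions. hGetLine is not lazy, so each file is read only as far as needed, and withFile closes the handle before the next file is opened.

```haskell
import Control.Monad (forM)
import System.IO

-- Hypothetical helper: read at most n lines from an open handle.
-- hGetLine is not lazy, so nothing refers to the handle afterwards.
hGetLines :: Int -> Handle -> IO [String]
hGetLines n h
  | n <= 0    = return []
  | otherwise = do
      eof <- hIsEOF h
      if eof
        then return []
        else do
          l  <- hGetLine h
          ls <- hGetLines (n - 1) h
          return (l : ls)

-- Open one file, read its first n lines, and close the handle
-- (withFile closes it even if an exception is thrown).
readHeaders :: Int -> FilePath -> IO [String]
readHeaders n path = withFile path ReadMode (hGetLines n)

-- Process the files one at a time, so at most one descriptor
-- is open at any moment.
readAllHeaders :: Int -> [FilePath] -> IO [(FilePath, [String])]
readAllHeaders n paths = forM paths $ \p -> do
  hs <- readHeaders n p
  return (p, hs)
```

The rest of each file is never read, so memory use stays small however large the files are.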