[Haskell-cafe] Re: Lazy IO and closing of file handles

Dougal Stanton ithika at gmail.com
Wed Mar 14 19:28:19 EDT 2007


Quoth Pete Kazmier, nevermore,
> the same error regarding max open files.  Incidentally, the lazy
> bytestring version of my program was by far the fastest and used the
> least amount of memory, but it still crapped out regarding max open
> files. 

I've tried the approach you appear to be using and it can be tricky
to predict how the laziness will interact with the list of actions.

For example, I tried to download a file to a temporary location, read a
bit of data out of it and then download the next one. To save myself some
thinking I used the same file name for each download: /tmp/feed.xml.
What happened was that all the downloads ran in rapid succession, each
over-writing the previous one, and none of the data was actually read
until the end. So I ended up parsing N identical copies of the final
file, instead of one of each.

You need to refactor how you map the functions so that fewer whole lists
are passed around. I'd guess that (1) is being executed in its entirety
before being passed to (2), but it's not until (2) that the file data is
actually used.

> main =
>     getArgs                      >>=
>     mapM fileContentsOfDirectory >>=                     -- (1)
>     mapM_ print . threadEmails . map parseEmail . concat -- (2)

This means there are a lot of files sitting open doing nothing. I've had
a lot of success by restructuring this as:

> main = 
>      getArgs >>=
>      mapM_ readAndPrint
>   where readAndPrint dir = fileContentsOfDirectory dir >>= print -- etc.

It may seem semantically identical, but it can make a difference to
when things actually happen.
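
If restructuring alone is not enough, the one-file-at-a-time idea can be
made explicit by forcing each file's contents before moving on. What
follows is only a minimal sketch under my own assumptions, not your
program: readFileStrict, summarise and processAll are hypothetical names,
and summarise merely stands in for the real parsing step. It uses plain
Strings rather than lazy bytestrings to keep to the base library.

```haskell
import Control.Exception (evaluate)
import System.IO (IOMode (ReadMode), hGetContents, withFile)

-- | Read a whole file and force it into memory before the handle is closed.
-- (Hypothetical helper, not part of the original program.)
readFileStrict :: FilePath -> IO String
readFileStrict path =
  withFile path ReadMode $ \h -> do
    contents <- hGetContents h
    -- Walking the full string forces the lazy read to finish here,
    -- while the handle is still open.
    _ <- evaluate (length contents)
    return contents

-- | Hypothetical stand-in for the real per-file work (parsing an email, say).
summarise :: String -> Int
summarise = length . lines

-- | Process one file at a time: open, read fully, close, then print.
processAll :: [FilePath] -> IO ()
processAll = mapM_ (\path -> readFileStrict path >>= print . summarise)
```

Something like "main = getArgs >>= processAll" (with getArgs from
System.Environment) would drive it. The cost is that each file is held
in memory in full while it is processed, so this trades some laziness
for a bounded number of open handles.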

-- 
Dougal Stanton

