[Haskell-cafe] Need some advice around lazy IO
Dan Doel
dan.doel at gmail.com
Sun Mar 17 18:15:17 CET 2013
One thing that typically isn't mentioned in these situations is that
you can add more laziness. I'm unsure if it would work from just your
snippet, but it might.
The core problem is that something like:
    mapM readFile names
will open all the files at once. Applying any processing to the file
contents is irrelevant unless the results of that processing are
evaluated sufficiently to allow the file to be closed.
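To make the "all at once" behaviour concrete, here is a minimal sketch
that swaps readFile for a counter. The names fakeOpen and openedUpFront
are made up for illustration, and the IORef just stands in for "number
of handles opened so far":

```haskell
import Data.IORef (IORef, newIORef, modifyIORef', readIORef)

-- Hypothetical stand-in for readFile: bumps a counter instead of
-- opening a real file handle.
fakeOpen :: IORef Int -> FilePath -> IO String
fakeOpen opened name = do
  modifyIORef' opened (+ 1)
  return ("contents of " ++ name)

-- Reports how many "files" have been opened by the time mapM returns,
-- before any of the results has been inspected.
openedUpFront :: [FilePath] -> IO Int
openedUpFront names = do
  opened    <- newIORef 0
  _contents <- mapM (fakeOpen opened) names
  readIORef opened
```

The count equals the length of the list, because mapM in IO runs every
action before it hands back the result list. With the real readFile,
that is one open handle per name before any processing has started.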
Now, most people will tell you that this means lazy I/O is evil, and
you should make it all strict. But, consider an analogous situation
where instead of opening a file handle, we do something that allocates
a lot of memory, and can only free it after processing. We'd run out
of memory allocating 3,000 * X, but X alone is fine. Then people
usually suggest delaying the allocation until you need it, i.e. lazy
evaluation.
Unfortunately, there's no combinator for this in the standard
libraries, but you can write one:
    import System.IO.Unsafe (unsafeInterleaveIO)

    mapMI :: (a -> IO b) -> [a] -> IO [b]
    mapMI _ [] = return []
    -- You can play with this case a bit. This will open a file for the
    -- head of the list, and then when each subsequent cons cell is
    -- inspected. You could probably interleave 'f x' as well.
    mapMI f (x:xs) = do
      y  <- f x
      ys <- unsafeInterleaveIO (mapMI f xs)
      return (y:ys)
Now, mapMI readFile opens the first handle right away, and each later
one only when you match on the corresponding cons cell of the list, so
if you process the list incrementally, it will open the file handles
one-by-one.
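As a rough check of that claim, the sketch below runs mapMI against a
counter instead of real files. fakeOpen and openCounts are hypothetical
helpers, not anything from the standard libraries, and the counter only
models "handles opened", not handles being closed again:

```haskell
import Control.Exception (evaluate)
import Data.IORef (IORef, newIORef, modifyIORef', readIORef)
import System.IO.Unsafe (unsafeInterleaveIO)

mapMI :: (a -> IO b) -> [a] -> IO [b]
mapMI _ []     = return []
mapMI f (x:xs) = do
  y  <- f x
  ys <- unsafeInterleaveIO (mapMI f xs)
  return (y : ys)

-- Hypothetical stand-in for readFile: bumps a counter instead of
-- opening a real file handle.
fakeOpen :: IORef Int -> FilePath -> IO String
fakeOpen opened name = do
  modifyIORef' opened (+ 1)
  return ("contents of " ++ name)

-- Samples the counter at three points: right after mapMI returns,
-- after the first two cons cells have been forced, and after the
-- whole spine of the list has been forced.
openCounts :: [FilePath] -> IO (Int, Int, Int)
openCounts names = do
  opened   <- newIORef 0
  contents <- mapMI (fakeOpen opened) names
  atStart  <- readIORef opened
  _        <- evaluate (length (take 2 contents))
  afterTwo <- readIORef opened
  _        <- evaluate (length contents)
  atEnd    <- readIORef opened
  return (atStart, afterTwo, atEnd)
```

For a three-element list this should give (1, 2, 3): one open up front,
then one more for each cons cell that gets inspected. With the real
readFile you would still need to consume each file's contents to the
end before moving on, since that is what lets the handle be closed.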
As an aside, you should never use hClose when doing lazy I/O. That's
kind of like solving the above "I've allocated too much memory"
problem with "just overwrite some expensive stuff with some other
cheap stuff to free up space."
-- Dan
On Sun, Mar 17, 2013 at 1:08 AM, C K Kashyap <ckkashyap at gmail.com> wrote:
> Hi,
>
> I am working on an automation that periodically fetches bug data from our
> bug tracking system and creates static HTML reports. Things worked fine when
> the bugs were in the order of 200 or so. Now I am trying to run it against
> 3000 bugs and suddenly I see things like - too many open handles, out of
> memory etc ...
>
> Here's the code snippet - http://hpaste.org/84197
>
> It's a small snippet and I've put in the comments stating how I run into
> "out of file handles" or simply file not getting read due to lazy IO.
>
> I realize that putting ($!) using a trial/error approach is going to be
> futile. I'd appreciate some pointers into the tools I could use to get some
> idea of which expressions are building up huge thunks.
>
>
> Regards,
> Kashyap