IO monad and lazy evaluation
Hal Daume III
hdaume@ISI.EDU
Wed, 21 May 2003 07:25:33 -0700 (PDT)
Hi Graham,
> strict functions in monad-chains". I'm hoping the Haskell community has
> some experience with this kind of issue to offer some more helpful advice,
> or even tools to detect unsafe combinations. Maybe a discussion of safe
> programming patterns would be a useful interim step?
I don't know what sort of advice would be most helpful, but I can share
a few observations:
When choosing between openFile/?/hClose and readFile, use the
openFile/?/hClose combination only if you expect to open a lot of files.
Rationale: readFile supposedly always closes the handle when you're
done, but it only does so once the whole contents have been consumed;
until then the handle just sits in a semi-closed state. This means that
if you're reading a lot of files, you can run out of handles.
If you need to use openFile/?/hClose because of file handle issues,
read the file (or what parts of it you need) strictly. I actually have
a function in my general library:
readFileCloseBy :: DeepSeq a => FilePath -> (String -> a) -> IO a
which opens the file, parses it using the supplied function, deepSeqs
it to make it strict and then closes the handle. By supplying id as
the function, you get a version of readFile which is strict and
always closes the Handle.
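
A function with this signature might look like the sketch below. It is
not the library code itself: it assumes the NFData class and force from
the deepseq package in place of the DeepSeq class mentioned above, and it
adds bracket (also an assumption) so the handle is closed even if the
parsing function throws.

```haskell
import Control.DeepSeq (NFData, force)
import Control.Exception (bracket, evaluate)
import System.IO (IOMode (ReadMode), hClose, hGetContents, openFile)

-- Open the file, apply the parser to its (lazy) contents, fully
-- evaluate the parsed result, and only then close the handle.
readFileCloseBy :: NFData a => FilePath -> (String -> a) -> IO a
readFileCloseBy path parse =
  bracket (openFile path ReadMode) hClose $ \h -> do
    contents <- hGetContents h
    -- 'force' evaluates the result deeply; 'evaluate' makes that
    -- happen here, inside the bracket, while the handle is still open.
    evaluate (force (parse contents))
```

With id as the parser this behaves like a strict readFile; with something
like (length . lines) only the small result is kept after the handle is
closed.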
> Thinking some more... I'm reminded of some discussions I had a few years
> ago about the timing of calls to Java finalizers, and problems this could
> cause for network I/O programs because using finalizers to close network
> sockets would lead to unexpected resource problems. The only reliable
This sounds very similar to the semi-closed handle issue in readFile.
> Anyway, my thoughts are leading me to the idea that the problem is a
> disconnect (lack of formal connection or interlock) between the actions of
> opening a file, reading its contents and closing it. For example, one
> could imagine a structure:
>
> hSafeGetContents :: Handle -> (String -> a) -> a
> hSafeGetContents handle function =
>     function $ hUnsafeGetContents handle
>
> Now the result string can be as lazy as you like, but I think one can
> guarantee that the handle won't be closed until the function has used as
> much of the content as it may need.
Alas, this is not true :). Let function=id and you'll see the
problem. You need to put a seq or a deepSeq in there somewhere, otherwise
just applying the function won't cause any of the file to be read.
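
One way to put the seq in is to force the input before closing, rather
than the result. The following is only a sketch: the name
hGetContentsForced is hypothetical, and it lives in IO, unlike the pure
signature quoted above. Taking the length walks the whole list, so the
whole file is read before hClose, and even function=id is then safe.

```haskell
import Control.Exception (evaluate)
import System.IO (Handle, hClose, hGetContents)

-- Read everything from the handle, close it, then apply the function.
-- (Hypothetical helper; it does not close the handle if reading throws.)
hGetContentsForced :: Handle -> (String -> a) -> IO a
hGetContentsForced h f = do
  s <- hGetContents h
  -- 'length' traverses the entire list, which pulls the whole file
  -- into memory while the handle is still open.
  _ <- evaluate (length s)
  hClose h
  return (f s)
```

The cost is that the entire file is held in memory, which is exactly what
you don't want in the "really_large_file" case quoted below.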
- Hal
> At 13:27 20/05/03 -0700, Hal Daume III wrote:
> >Yes. This is because hGetContents (and hence readFile, etc.) use lazy
> >IO. Just as in this case you might want hClose to force the file to be
> >read, in a case like:
> >
> > > do h <- openFile "really_large_file" ReadMode
> > >    c <- hGetContents h >>= return . head
> > >    hClose h
> > >    return c
> >
> >you probably don't want the close to read the whole file. I'd argue that
> >that problem is not with hClose, but with hGetContents. Really, a strict
> >version should be used in most situations. Something like:
> >
> > > hGetContentsStrict h = do
> > >   b <- hIsEOF h
> > >   if b then return [] else do
> > >     c <- hGetChar h
> > >     r <- hGetContentsStrict h
> > >     return (c:r)
> >
> >of course, you could be smarter with buffering, etc. Another way would be
> >to do something using seq/deepSeq.
> >
> > - Hal
> >
> >--
> > Hal Daume III | hdaume@isi.edu
> > "Arrest this man, he talks in maths." | www.isi.edu/~hdaume
> >
> >On Tue, 20 May 2003, Graham Klyne wrote:
> >
> > > There seems to be a difficult-to-justify interaction between
> > > lazy evaluation and monadic I/O:
> > >
> > > [[
> > > -- file: SpikeIOMonadCloseHandle.hs
> > > -- Does hClose force completion of lazy I/O?
> > >
> > > import IO
> > >
> > > showFile fnam =
> > > >   do { fh <- openFile fnam ReadMode
> > > >      ; fc <- hGetContents fh
> > > >      ; hClose fh
> > > >      ; putStr fc
> > > >      }
> > >
> > > test = showFile "SpikeIOMonadCloseHandle.hs"
> > > ]]
> > >
> > > If I load this into Hugs and run it, the output is a single blank line.
> > >
> > > If I reverse the order of hClose and putStr, the source code is displayed.
> > >
> > > > I think I can understand why this is happening, but it seems to me that
> > > > there's a violation of referential transparency here: I can't see any
> > > > reasonable justification for the value of 'fc' to vary depending on
> > > > whether it's actually used before or after some other I/O operation.
> > >
> > > I suppose I was expecting the call of hClose to force complete evaluation
> > > of any value that depends on the state prior to hClose. I've no idea if
> > > there's a reasonable way to implement that.
> > >
> > > My concern is that this weakens the claim for monads that they provide
> > > a seamless integration between pure functional and stateful code; cf.:
> > > [[
> > > > We believe that, on the contrary, there are very significant differences
> > > > between writing programs in C and writing in Haskell with monadic state
> > > > transformers and IO:
> > > > [...]
> > > > - Usually, most of the program is neither stateful nor directly
> > > > concerned with IO. The monadic approach allows the graceful coexistence
> > > > of a small amount of imperative code and the large purely functional
> > > > part of the program
> > > > [...]
> > > > - The usual coroutining behaviour of lazy evaluation, in which the
> > > > consumer of a data structure coroutines with its producer, extends to
> > > > stateful computation as well. As Hughes argues (Hughes 1989), the
> > > > ability to separate what is computed from how much of it is computed is
> > > > a powerful aid to writing modular programs
> > > ]]
> > > -- http://research.microsoft.com/Users/simonpj/Papers/state-lasc.ps.gz
> > >
> > > #g
> > >
> > >
> > > -------------------
> > > Graham Klyne
> > > <GK@NineByNine.org>
> > > PGP: 0FAA 69FF C083 000B A2E9 A131 01B9 1C7A DBCA CB5E
> > >
> > > _______________________________________________
> > > Haskell mailing list
> > > Haskell@haskell.org
> > > http://www.haskell.org/mailman/listinfo/haskell
> > >