IO monad and lazy evaluation
Graham Klyne
gk@ninebynine.org
Wed, 21 May 2003 09:50:59 +0100
Hal,
I agree with the surface points you make. It's easy enough to fix the
problem once you realize it's there. (In my own "real" program, I moved
the hClose, which meant I had to pass the handle out of the function which
opened it.)
The underlying thrust of my post was that I thought pure functional
languages, like Haskell, were supposed to help one avoid such traps by
ensuring the messy dependencies on ordering didn't arise in the first
place. As things stand, I don't think I could begin to articulate a reliable
set of rules for avoiding such problems, short of something like "use only
strict functions in monad-chains". I'm hoping the Haskell community has
some experience with this kind of issue to offer some more helpful advice,
or even tools to detect unsafe combinations. Maybe a discussion of safe
programming patterns would be a useful interim step?
(e.g. Ketil Z. Malde's suggestion of renaming the function to
hUnsafeGetContents may be a small step in the right direction?)
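As a rough illustration of the "use only strict functions" rule of thumb, here is a minimal sketch that forces the whole file to be read before the handle is closed. The name readFileStrict is hypothetical (it is not from this thread or from any library), and the sketch assumes a GHC-style System.IO and Control.Exception. It makes no attempt at exception safety.

```haskell
import System.IO
import Control.Exception (evaluate)

-- Hypothetical helper, not from the thread: read a file and force every
-- character before closing the handle, so that the order of hClose and
-- any later use of the string no longer matters.
readFileStrict :: FilePath -> IO String
readFileStrict path = do
  h <- openFile path ReadMode
  s <- hGetContents h          -- lazy: nothing has been read yet
  _ <- evaluate (length s)     -- walking the list forces the read to EOF
  hClose h                     -- safe now: the contents are already in memory
  return s
```

The cost is the one Hal points out: the entire file is held in memory, even if the caller only wants the first character.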
...
Thinking some more... I'm reminded of some discussions I had a few years
ago about the timing of calls to Java finalizers, and problems this could
cause for network I/O programs because using finalizers to close network
sockets would lead to unexpected resource problems. The only reliable
solution was to always close the sockets explicitly when done. With Java,
coming from C/C++, it was possible to get into a mindset that automatic
memory management also meant automatic management of all resources,
including all those that weren't directly visible to the programmer. Maybe
there's a similar trap for the unwary in Haskell?
Anyway, my thoughts are leading me to the idea that the problem is a
disconnect (lack of formal connection or interlock) between the actions of
opening a file, reading its contents and closing it. For example, one
could imagine a structure:
    hSafeGetContents :: Handle -> (String -> a) -> a
    hSafeGetContents handle function =
        function $ hUnsafeGetContents handle
Now the result string can be as lazy as you like, but I think one can
guarantee that the handle won't be closed until the function has used as
much of the content as it may need.
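The structure above is a sketch rather than compilable code: hGetContents returns IO String, so the result has to live in IO as well. One possible type-correct variant is shown below, under stated assumptions. The name withFileContents is hypothetical, and the sketch relies on bracket and evaluate from Control.Exception. The consumer runs while the handle is open, and the handle is closed only after the consumer's result has been evaluated.

```haskell
import System.IO
import Control.Exception (bracket, evaluate)

-- Hypothetical name, modelled on the hSafeGetContents idea above.
-- bracket guarantees hClose runs, even if the consumer throws.
withFileContents :: FilePath -> (String -> a) -> IO a
withFileContents path consume =
  bracket (openFile path ReadMode) hClose $ \h -> do
    s <- hGetContents h        -- still lazy
    evaluate (consume s)       -- force the result (to WHNF) before closing
```

With this shape, something like withFileContents "really_large_file" head only reads as much of the file as head demands. The caveat is that evaluate forces the result only to weak head normal form: if the consumer returns a lazy structure (for example the string itself, via id), the unevaluated part still depends on the handle and the original problem comes back. So the interlock is only as good as the strictness of the result type.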
#g
--
At 13:27 20/05/03 -0700, Hal Daume III wrote:
>Yes. This is because hGetContents (and hence readFile, etc.) use lazy
>IO. Just as in this case you might want hClose to force the file to be
>read, in a case like:
>
> > do h <- openFile "really_large_file" ReadMode
> >    c <- hGetContents h >>= return . head
> >    hClose h
> >    return c
>
>you probably don't want the close to read the whole file. I'd argue that
>that problem is not with hClose, but with hGetContents. Really, a strict
>version should be used in most situations. Something like:
>
> > hGetContentsStrict h = do
> >   b <- hIsEOF h
> >   if b then return [] else do
> >     c <- hGetChar h
> >     r <- hGetContentsStrict h
> >     return (c:r)
>
>of course, you could be smarter with buffering, etc. Another way would be
>to do something using seq/deepSeq.
>
> - Hal
>
>--
> Hal Daume III | hdaume@isi.edu
> "Arrest this man, he talks in maths." | www.isi.edu/~hdaume
>
>On Tue, 20 May 2003, Graham Klyne wrote:
>
> > There seems to be a difficult-to-justify interaction between
> > lazy evaluation and monadic I/O:
> >
> > [[
> > -- file: SpikeIOMonadCloseHandle.hs
> > -- Does hClose force completion of lazy I/O?
> >
> > import IO
> >
> > showFile fnam =
> >     do { fh <- openFile fnam ReadMode
> >        ; fc <- hGetContents fh
> >        ; hClose fh
> >        ; putStr fc
> >        }
> >
> > test = showFile "SpikeIOMonadCloseHandle.hs"
> > ]]
> >
> > If I load this into Hugs and run it, the output is a single blank line.
> >
> > If I reverse the order of hClose and putStr, the source code is displayed.
> >
> > I think I can understand why this is happening, but it seems to me that
> > there's a violation of referential transparency here: I can't see any
> > reasonable justification for the value of 'fc' to vary depending on whether
> > it's actually used before or after some other I/O operation.
> >
> > I suppose I was expecting the call of hClose to force complete evaluation
> > of any value that depends on the state prior to hClose. I've no idea if
> > there's a reasonable way to implement that.
> >
> > My concern is that this weakens the claim for monads that they provide
> > a seamless integration between pure functional and stateful code; cf.:
> > [[
> > We believe that, on the contrary, there are very significant differences
> > between writing programs in C and writing in Haskell with monadic state
> > transformers and IO:
> > [...]
> > - Usually, most of the program is neither stateful nor directly concerned
> >   with IO. The monadic approach allows the graceful coexistence of a small
> >   amount of imperative code and the large purely functional part of the
> >   program
> > [...]
> > - The usual coroutining behaviour of lazy evaluation, in which the consumer
> >   of a data structure coroutines with its producer, extends to stateful
> >   computation as well. As Hughes argues (Hughes 1989), the ability to
> >   separate what is computed from how much of it is computed is a powerful
> >   aid to writing modular programs
> > ]]
> > -- http://research.microsoft.com/Users/simonpj/Papers/state-lasc.ps.gz
> >
> > #g
> >
> >
> > -------------------
> > Graham Klyne
> > <GK@NineByNine.org>
> > PGP: 0FAA 69FF C083 000B A2E9 A131 01B9 1C7A DBCA CB5E
-------------------
Graham Klyne
<GK@NineByNine.org>
PGP: 0FAA 69FF C083 000B A2E9 A131 01B9 1C7A DBCA CB5E