[Haskell-cafe] getContents and lazy evaluation
Robert Dockins
robdockins at fastmail.fm
Fri Sep 1 17:36:08 EDT 2006
On Friday 01 September 2006 16:46, Duncan Coutts wrote:
> On Fri, 2006-09-01 at 16:28 -0400, Robert Dockins wrote:
> > On Friday 01 September 2006 15:19, Tamas K Papp wrote:
> > > Hi,
> > >
> > > I am a newbie, reading the Gentle Introduction. Chapter 7
> > > (Input/Output) says
> > >
> > > Pragmatically, it may seem that getContents must immediately read an
> > > entire file or channel, resulting in poor space and time performance
> > > under certain conditions. However, this is not the case. The key
> > > point is that getContents returns a "lazy" (i.e. non-strict) list of
> > > characters (recall that strings are just lists of characters in
> > > Haskell), whose elements are read "by demand" just like any other
> > > list. An implementation can be expected to implement this
> > > demand-driven behavior by reading one character at a time from the
> > > file as they are required by the computation.
> > >
> > > So what happens if I do
> > >
> > >   contents <- hGetContents handle
> > >   putStr (take 5 contents)    -- assume that the implementation
> > >                               -- only reads a few chars
> > >   -- delete the file in some way
> > >   putStr (take 500 contents)  -- but the file is not there now
> > >
> > > If an IO function is lazy, doesn't that break sequentiality? Sorry if
> > > the question is stupid.
> >
> > This is not a stupid question at all, and it highlights the main problem
> > with lazy IO. The solution is, in essence, "don't do that, because Bad
> > Things will happen". It's pretty unsatisfactory, but there it is. For
> > this reason, lazy IO is widely regarded as somewhat dangerous (or even as
> > an outright misfeature, by a few).
> >
> > If you are going to be doing simple pipe-style IO (i.e., read some data
> > sequentially, manipulate it, spit out the output), lazy IO is very
> > convenient, and it makes putting together quick scripts very easy.
> > However, if you're doing something more advanced, you'd probably do best
> > to stay away from lazy IO.
>
> Since working on Data.ByteString.Lazy I'm now even more of a pro-lazy-IO
> zealot than I was before ;-)
>
> In practise I expect that most programs that deal with file IO strictly
> do not handle the file disappearing under them very well either.
That's probably true, except for especially robust applications where such a
thing is a regular (or at least expected) event.
> At best
> they probably throw an exception and let something else clean up. The
> same can be done with lazy IO, though it requires using imprecise
> exceptions which some people grumble about. So I would contend that lazy
> IO is actually applicable in rather a wider range of circumstances than
> you might think. :-)
Perhaps I should be more clear. When I said "advanced" above I meant "any use
whereby you treat a file as random access, read/write storage, or do any kind
of directory manipulation (including deleting and/or renaming files)". Lazy
I/O (as it currently stands) doesn't play very nice with those use cases.
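For those use cases, one way to stay out of trouble is to force the whole string before touching the file system again. What follows is only a minimal sketch, assuming GHC with the standard `directory` package; the helper name `readFileStrict` and the scratch file name are made up for illustration.

```haskell
import Control.Exception (evaluate)
import System.Directory (removeFile)
import System.IO (IOMode (ReadMode), hGetContents, withFile)

-- Read a file and force every character before the handle is closed,
-- so nothing is left to be read "on demand" later.
readFileStrict :: FilePath -> IO String
readFileStrict path = withFile path ReadMode $ \h -> do
  s <- hGetContents h
  _ <- evaluate (length s)  -- walking the list pulls in the whole file now
  return s

main :: IO ()
main = do
  let path = "lazy-io-demo.txt"  -- hypothetical scratch file
  writeFile path (replicate 600 'x')
  contents <- readFileStrict path
  putStrLn (take 5 contents)             -- prints "xxxxx"
  removeFile path                        -- the file is gone from here on
  print (length (take 500 contents))     -- prints 500; the data is already in memory
```

The price is that the entire file sits in memory at once, which is exactly the cost lazy I/O was meant to avoid.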
I agree generally with the idea that lazy I/O is good. The problem is that it
is a "leaky abstraction"; details are exposed to the user that should ideally
be completely hidden. Unfortunately, the leaks aren't likely to get plugged
without pretty tight operating system support, which I suspect won't be
happening anytime soon.
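To make the leak concrete, here is a small sketch (assuming GHC; `forceAfterClose` and the file name are hypothetical) in which an ordinary-looking String turns out to depend on what happened to the handle afterwards. I deliberately give no expected output: depending on the implementation and version, forcing the string may yield a truncated result or raise an exception.

```haskell
import Control.Exception (SomeException, evaluate, try)
import System.IO (IOMode (ReadMode), hClose, hGetContents, openFile)

-- Obtain a lazy string, close the handle before anything is consumed,
-- and only then force the string. The outcome is reported, not assumed.
forceAfterClose :: FilePath -> IO (Either SomeException Int)
forceAfterClose path = do
  h <- openFile path ReadMode
  s <- hGetContents h       -- nothing needs to have been read yet
  hClose h                  -- the "pure" value s is affected by this action
  try (evaluate (length s))

main :: IO ()
main = do
  let path = "lazy-leak-demo.txt"  -- hypothetical scratch file
  writeFile path "hello world"
  r <- forceAfterClose path
  -- Either a short length or an exception, never a reliable 11.
  putStrLn (either (\e -> "exception: " ++ show e) (\n -> "length: " ++ show n) r)
```

Nothing in the type of `s` hints that its value depends on the later `hClose`, which is the detail that ideally would be hidden.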
> Note also, that with lazy IO we can write really short programs that are
> blindingly quick. Lazy IO allows us to save a copy through the Handle
> buffer.
> BTW in the above case the "bad thing that will happen" is that contents
> will be truncated. As I said, I think it's better to throw an exception,
> which is what Data.ByteString.Lazy.hGetContents does.
Well, AFAIK, the behavior is officially undefined, which is my real beef. I
agree that it _should_ throw an exception.
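If the read does throw, the exception surfaces wherever the lazy value is forced, not at the call to hGetContents, so the handler has to wrap the forcing site. A rough sketch using Data.ByteString.Lazy follows, assuming the `bytestring` package shipped with GHC; `countBytes` and the file name are illustrative, and whether a particular failure raises an exception or silently truncates is up to the library version in use.

```haskell
import Control.Exception (IOException, evaluate, try)
import qualified Data.ByteString.Lazy as BL
import System.IO (IOMode (ReadMode), withFile)

-- Count the bytes of a file through a lazy ByteString. The try wraps the
-- point where the chunks are actually demanded, so an IOException raised
-- during a delayed read comes back as a Left value.
countBytes :: FilePath -> IO (Either IOException Int)
countBytes path = withFile path ReadMode $ \h -> do
  bytes <- BL.hGetContents h
  try (evaluate (fromIntegral (BL.length bytes)))

main :: IO ()
main = do
  let path = "lazy-bs-demo.txt"  -- hypothetical scratch file
  writeFile path "hello"
  r <- countBytes path
  print r  -- prints "Right 5" when the read succeeds
```

Note that a failure to open the file at all is raised by withFile itself, outside the try, and is not captured in the Either.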
> Duncan
--
Rob Dockins
Talk softly and drive a Sherman tank.
Laugh hard, it's a long way to the bank.
-- TMBG