[Haskell-beginners] Question about IO, particularly hGetContents

Kyle Murphy orclev at gmail.com
Fri Apr 2 10:28:29 EDT 2010


I'm a bit of a beginner so I might be wrong, but I think do only forces
evaluation at the level of the do block, not recursively. Think of it this
way: you've got a series of function calls, which are represented as thunks,
and which in turn return more thunks. If you use do notation to execute that
series of functions, they all get run in order, but you're still left with
more unevaluated thunks, because that's what the functions returned. To use
your program as an example, it gets evaluated as follows:

> import System.IO
> main = do -- starts do block
>      -- thunk to open test.in for reading; the file is actually opened
>      -- here, not later at hGetContents or hClose
>      rhdl <- openFile "test.in" ReadMode
>      -- thunk to set up a lazy read from rhdl and assign that thunk to
>      -- content
>      content <- hGetContents rhdl

At this point content hasn't been evaluated, it's still a thunk, *but*
hGetContents has been evaluated and produced the thunk that's stored in
content.

>      hClose rhdl -- thunk to close rhdl

This thunk closes rhdl. It gets evaluated after hGetContents has handed back
its lazy thunk, but before anything has actually read the contents of rhdl.

>      putStrLn "Content: " -- thunk to write "Content: " to stdout
>      -- thunk to write content to stdout; this will force evaluation of
>      -- the thunk stored in content
>      putStrLn content

This last thunk attempts to evaluate the thunk stored in content, which in
turn tries to read from rhdl. It can't, because rhdl has already been closed,
so content comes out empty and you get the blank line.
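
One way to keep the explicit hClose is to force the whole string before
closing the handle. The following is only a sketch, assuming the file is small
enough to hold in memory. The helper name readThenClose is made up for
illustration, and main writes its own small test.in so that the example runs
on its own.

```haskell
import Control.Exception (evaluate)
import System.IO

-- Hypothetical helper (not from the original program): read a whole file
-- through a handle and close the handle explicitly afterwards.
readThenClose :: FilePath -> IO String
readThenClose path = do
    rhdl    <- openFile path ReadMode
    content <- hGetContents rhdl       -- lazy: nothing has been read yet
    _ <- evaluate (length content)     -- walks the whole list, so the file is read now
    hClose rhdl                        -- harmless: content is already in memory
    return content

main :: IO ()
main = do
    writeFile "test.in" "hello\n"      -- sample input, an assumption of this sketch
    content <- readThenClose "test.in"
    putStrLn "Content: "
    putStrLn content
```

Computing the length is just a cheap way to demand every character; any
function that traverses the whole string would do the same job.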

If you enable the bang patterns extension you could fix this I think by
doing:

> {-# LANGUAGE BangPatterns #-}
> import System.IO
> main = do
>      rhdl <- openFile "test.in" ReadMode
>      !content <- hGetContents rhdl -- bang forces content to WHNF only
>      putStrLn "Content: "
>      putStrLn content -- no hClose: the handle is closed at end of file

Note that the bang only forces content to weak head normal form, that is, its
first character, not the whole file. The reason this version works is that
the hClose is gone, so the handle stays open until putStrLn has demanded
everything (hGetContents closes it at end of file). In this case the bang
makes little difference, since content would be forced two lines down at the
putStrLn call anyway, but with much larger files, and perhaps no forcing for
quite some time, it gets hard to tell when the file is really read and you
could quite easily chew up a lot of memory. Lazy IO is both a blessing and a
curse, although the more I see and the more I think about it, the more I'm in
favor of simply not allowing lazy IO, as it's just way too easy to make
mistakes with it.
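
For anyone who would rather avoid lazy IO here altogether, a minimal sketch is
to read line by line with hGetLine inside withFile, which closes the handle
once the callback returns. readLinesStrict is a hypothetical helper written
for this example, not a library function, and main again writes its own
test.in so the sketch is self-contained.

```haskell
import System.IO

-- Hypothetical helper: read every remaining line from a handle.
-- hGetLine does its reading when it runs, so nothing is deferred.
readLinesStrict :: Handle -> IO [String]
readLinesStrict h = go []
  where
    go acc = do
        eof <- hIsEOF h
        if eof
            then return (reverse acc)  -- lines were accumulated in reverse
            else do
                line <- hGetLine h
                go (line : acc)

main :: IO ()
main = do
    writeFile "test.in" "first line\nsecond line\n"  -- sample input (assumption)
    ls <- withFile "test.in" ReadMode readLinesStrict
    putStrLn "Content: "
    mapM_ putStrLn ls
```

Because all the reading happens inside the withFile callback, there is no
window in which the handle is closed while data is still pending.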

-R. Kyle Murphy
--
Curiosity was framed, Ignorance killed the cat.


On Thu, Apr 1, 2010 at 22:36, Ken Overton <koverton at lab49.com> wrote:

> Thanks for the explanation and the warning about hClose.  I thought that
> using do{} blocks forced lazy evaluations?
>
> ________________________________________
> From: Isaac Dupree [ml at isaac.cedarswampstudios.org]
> Sent: Thursday, April 01, 2010 9:22 PM
> To: Ken Overton
> Cc: beginners at haskell.org
> Subject: Re: [Haskell-beginners] Question about IO, particularly
> hGetContents
>
> On 04/01/10 21:06, Ken Overton wrote:
> > Hello all, I was playing with some Haskell code that read in text files
> > and processed them and often found myself writing empty result-files.  I've
> > pared the problem down to the following small example.
> >
> > -- example program
> > import IO
> > main = do
> >      rhdl<- openFile "test.in" ReadMode
> >      content<- hGetContents rhdl
> >      putStrLn content  -- if I cmt out this line, content will be empty
> >      hClose rhdl
> >      putStrLn "Content: "
> >      putStrLn content      -- if the first 'putStrLn' call was commented,
> >                            -- this will print a blank line
> > -- end example
> >
> > For some reason, if I comment out the 'putStrLn content' between
> > hGetContents and hClose, the data from the hGetContents call is not stored.
> > Can somebody verify this behavior and (if so) explain why it's happening?
>
> Yes.  hGetContents retrieves the contents "lazily", only reading them
> when demanded by the program*.  When it's read all the contents, it
> automatically closes the file, so you shouldn't use hClose in
> combination with it (also it stops any future reading from happening,
> which is why your program broke).
>
> *implemented using the somewhat controversial "unsafeInterleaveIO". Some
> people believe that lazy IO like this is not a good idea, for reasons
> similar to the one you encountered, but in circumstances that are harder
> to fix.
>
> -Isaac