[Haskell] Files and laziness

Cale Gibbard cgibbard at gmail.com
Mon Aug 1 11:04:44 EDT 2005


Your problem is, as you pointed out, that readFile does lazy IO.
Although its semantics can be confusing at times, lazy IO is useful
when you are consuming a large file and don't want to allocate memory
for all of it before doing any processing. Laziness lets you read the
file as needed -- you may not even need all of it, depending on what is
being done. This is quite helpful when you have something like a couple
of gigabytes of data on disk to process. It can, however, be confusing
at first that the read may not have finished before the file is altered
or, in situations involving handles, before the handle is closed.
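
To make the handle case concrete, here is a minimal sketch. The names
readThenClose and demoLazyPitfall are made up for illustration, and I am
assuming a base library that provides Control.Exception. What you see
when the string is finally forced depends on the implementation: it may
be an empty or truncated string, or an error.

```haskell
import Control.Exception (SomeException, evaluate, try)
import System.IO (IOMode (ReadMode), hClose, hGetContents, openFile)

-- Hypothetical helper: hands back the lazy string after closing the handle.
readThenClose :: FilePath -> IO String
readThenClose path = do
  hdl <- openFile path ReadMode
  contents <- hGetContents hdl  -- lazy: no characters have been demanded yet
  hClose hdl                    -- closed before anything was read
  return contents

-- Forces the string and reports what happened, without assuming which
-- failure mode the implementation chooses.
demoLazyPitfall :: FilePath -> IO ()
demoLazyPitfall path = do
  s <- readThenClose path
  r <- try (evaluate (length s)) :: IO (Either SomeException Int)
  case r of
    Left err -> putStrLn ("forcing failed: " ++ show err)
    Right n  -> putStrLn ("got only " ++ show n ++ " characters")
```

Either way, the caller does not get the full contents of the file,
which is the surprise described above.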

You can write a strict IO version of readFile in Haskell as follows:

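-- The explicit import list avoids clashing with the same-named functions
-- that newer versions of System.IO export.
import System.IO (Handle, IOMode (ReadMode), hClose, hGetChar, hIsEOF, openFile)

-- Read every remaining character from the handle before returning.
hGetContents' :: Handle -> IO String
hGetContents' hdl = do e <- hIsEOF hdl
                       if e then return []
                            else do c <- hGetChar hdl
                                    cs <- hGetContents' hdl
                                    return (c:cs)

-- Strict readFile: the handle is closed only after the whole file is read.
readFile' :: FilePath -> IO String
readFile' fn = do hdl <- openFile fn ReadMode
                  xs <- hGetContents' hdl
                  hClose hdl
                  return xs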

If you use readFile', it will ensure that the entire file is read and
memory for the string is allocated before continuing. This ought to
solve your problem.
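
Another way to get the same effect, sketched here on the assumption
that your implementation provides withFile and Control.Exception.evaluate,
is to keep the lazy hGetContents but force the whole string before the
handle is closed. The names readFileStrict and rewriteInPlace are made
up for illustration; rewriteInPlace mirrors the main from your program.

```haskell
import Control.Exception (evaluate)
import System.IO (IOMode (ReadMode), hGetContents, withFile)

-- Hypothetical name: reads the whole file before withFile closes the handle.
readFileStrict :: FilePath -> IO String
readFileStrict path = withFile path ReadMode $ \hdl -> do
  contents <- hGetContents hdl
  -- Computing the length walks the entire string, so every character
  -- is read while the handle is still open.
  _ <- evaluate (length contents)
  return contents

-- Mirrors the program from the question: read, truncate, write back.
rewriteInPlace :: FilePath -> IO ()
rewriteInPlace path = do
  x <- readFileStrict path  -- fully read, and the handle is closed, by here
  writeFile path ""
  writeFile path x
```

With this version the file should end up with its original contents
whether or not anything prints x in between.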

(Aside: I think this sort of thing should get included in the
libraries, if for no other reason than that this issue comes up from
time to time, and it would be handy to not have to write extra code to
do strict IO.)

hope this helps,
 - Cale

On 28/07/05, Diego y tal <deigote at tiscali.es> wrote:
> I was developing a web site using Haskell programs as CGIs, and I found
> a strange behavior. I would like to know whether it is normal. I
> have reduced the "problem" to the following program:
> 
> fEntrada = "fich.txt"
> fSalida = "fich.txt"
> 
> creaFich :: IO()
> creaFich = writeFile fEntrada "me molo"
> 
> main :: IO ()
> main = do x <- readFile fEntrada
> --          print x -- In the second try, uncomment this line
>           writeFile fSalida ""
>           writeFile fSalida x
> 
> Running the following commands (supposing that $ is the prompt of a
> Linux shell and main> is the prompt of Hugs)
> 
> main> creaFich
> main> main
> $ cat fich.txt
> 
> will give us different results depending on whether we comment out or
> uncomment the second line of the main body, although the meaning of the
> program is the same. I understand that this is caused by laziness, which
> doesn't evaluate the expression "x <- readFile fEntrada" until it's
> necessary, but... is it normal that we have to think about this "problem"
> when programming? In my case, this behavior caused my program to fail
> (and it was really complicated to find out why), and the only solution I
> found was to print the string I was reading to a scratch file
> immediately after reading it (really, after telling Hugs to read it). I
> find this a great disadvantage compared with the imperative paradigm,
> above all because it is like having to control something (close to
> concurrency, if you ask me) that has not been asked for! I'll be
> thankful for any comments or replies you send me.
> 
> Greetings, Diego y tal <deigote at gmail.com>
>

