[Haskell-cafe] Newbie question
Cale Gibbard
cgibbard at gmail.com
Fri Aug 12 16:57:50 EDT 2005
This question comes up often, and there are lots of ways to answer it.
Not too long ago, I posted a simplistic (but not terribly practical)
solution, writing a strict version of readFile in terms of hGetChar,
to which Simon Marlow replied with an actually pratical implementation
which allocates memory enough for the file, reads the file into that
memory, and only lazily converts it into a string. That discussion
seems to have been immortalised at
http://users.aber.ac.uk/afc/stricthaskell.html which I think has
already been mentioned. (As a side note, I disagree with that author's
views on laziness at the top of the page -- it's usually lazy
evaluation that I want because it increases compositionality of the
code so much. Only very rarely do I care for things to be strict.)
One other (again, not terribly practical, but nice to know about)
thing that you can do is to force Haskell to evaluate all of the
characters in the returned list (which will automatically close the
handle for you). This isn't such a good thing to do if the file you're
reading is huge, as it will consume time and memory for the whole
file, and it's quite possible to run out of stack space, but if you
really want the file to be read and closed before continuing, it will
make sure of that. In addition to this, there may be other reasons for
wanting to control the strictness and order of evaluation, so it's
probably useful to know in any event.
The IO action to use is called
evaluate :: a -> IO a
and is located in Control.Exception. What evaluate does is to force
its parameter to be evaluated so far as the topmost node in the data
structure. We want to completely evaluate the list returned by
fmap lines . hGetContents
not just the first cell, so we'll mapM_ evaluate over the list as follows:
do hdl <- openFile "myFile" ReadMode
ls <- fmap lines (hGetContents hdl)
mapM_ evaluate ls
-- at this point, hdl is closed, and ls contains the contents
-- of the whole file so we can go on to use it...
If we want to get carried away, we can also (ab)use the strategies in
Control.Parallel.Strategies to get things to be evaluated as far as we
might like. There is a function there:
rnf :: Strategy a
(where Strategy a is a synonym for a -> ()) which when evaluated,
reduces its argument to normal form before returning (). Basically, it
will completely evaluate the data structure as far as possible. While
it works for a bunch of types in the class NFData (anything which is a
combination of lists, arrays and tuples) it would be nice to be able
to derive it for new data types like:
data MyType = ... deriving (NFData)
While as far as I know there isn't a GHC extension to derive NFData,
there is a GHC extension to derive another class which is somewhat
more general in its scope. This class is called Data, and is in
Data.Generics. With it, provided I haven't made any mistakes, we can
write our own version of rnf as follows:
rnf :: (Data a) => a -> ()
rnf x = everything seq (\y -> y `seq` ()) x
The function seq :: a -> b -> b is a primitive in Haskell for creating
strictness. The expression (seq x y), when evaluated, will cause x to
be evaluated (to weak head normal form) before returning y. The
everything function is a sort of generalisation of folding which is
available on any data type in the class Data, which is actually most
common types. You can read about it in
http://homepages.cwi.nl/~ralf/syb1/
The overall effect of evaluating (rnf x) should then be to completely
evaluate x, and return (), which is what we want.
To parallel some other things people have done, we can write
deepSeq x y = rnf x `seq` y
as well as
f $!! x = rnf x `seq` f x
the latter of which is a strict form of function application.
When we want these functions to apply to our own data types, it will
suffice to just write "deriving Data" on the end of the data
declaration.
I'm not sure how relevant or comprehensible this will seem to someone
new to Haskell, but I hope it's somewhat interesting at least, and
maybe some other people on the list will find bits of it useful too.
- Cale Gibbard
On 12/08/05, André Vargas Abs da Cruz <andrev at ele.puc-rio.br> wrote:
> Hi everyone,
>
> I think this is a totally newbie question as i am a complete novice
> to Haskell. I am trying to write down a few programs using GHC in order
> to get used with the language. I am having some problems with a piece of
> code (that is supposed to return a list of lines from a text file) which
> I transcribe below:
>
> module Test where
>
> import IO
>
> readDataFromFile filename = do
> bracket (openFile filename ReadMode) hClose
> (\h -> do contents <- hGetContents h
> return (lines contents))
>
> The question is: if i try to run this, it returns me nothing (just
> an empty list). Why does this happen ? When i remove the "return" and
> put a "print" instead, it prints everything as i expected.
>
> Thanks in advance
> André
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
More information about the Haskell-Cafe
mailing list