[Haskell-cafe] Re: Processing of large files
Scott Turner
p.turner at computer.org
Mon Nov 1 23:11:12 EST 2004
On 2004 November 01 Monday 16:48, Alexander N. Kogan wrote:
> Sorry, I don't understand. I thought the problem is in laziness -
You're correct. The problem is laziness rather than I/O.
> my list
> of tuples becomes ("qqq", 1+1+1+.....) etc and my program reads whole file
> before it starts processing. Am I right or not? If I'm right, how can I
> inform compiler that my list of tuples should be strict?
The program does not read the whole file before processing the list. You might
expect that it would, given that most Haskell I/O takes place in exactly the
sequence specified. But readFile is different: it sets things up to read the
file on demand, analogous to lazy evaluation.
The list of tuples _does_ need to be strict. Beyond that, as Ketil Malde said,
you should not use foldl -- instead, foldl' is the best version to use when
you are recalculating the result every time a new list item is processed.
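As a minimal sketch of the difference (the names sumLazy and sumStrict are made up for illustration and are not from the original program), the two folds can be compared on a plain sum. foldl' lives in Data.List. Note that GHC with optimization turned on may make the lazy version strict on its own, so the difference shows up most clearly in unoptimized builds or GHCi.

```haskell
import Data.List (foldl')

-- Lazy left fold: the accumulator can grow into a chain of unevaluated
-- (+) thunks, (((0+1)+2)+3)+..., that is only reduced at the very end.
sumLazy :: [Int] -> Int
sumLazy = foldl (+) 0

-- Strict left fold: the accumulator is forced to weak head normal form
-- at every step, so no thunk chain is built up.
sumStrict :: [Int] -> Int
sumStrict = foldl' (+) 0

main :: IO ()
main = print (sumStrict [1 .. 1000000])  -- prints 500000500000
```

Both functions return the same value; only the amount of memory used along the way differs.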
To deal with the list of tuples, you can use 'seq' to ensure that its parts
are evaluated.
For example, change
    (a,b+1):xs
to
    let b' = b+1 in b' `seq` ((a,b'):xs)
'seq' means evaluate the first operand (to weak head normal form) prior to
delivering the second operand as a result. Similarly, the recursive call
    merge xs x
needs to be evaluated explicitly, or it will itself be left as a thunk.
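Putting the pieces together, here is one way it might look. The original program is not shown in this thread, so the definition of merge below is a guess at its shape (a word-count update over a list of tuples), and countWords is a name invented for the example.

```haskell
import Data.List (foldl')

-- Hypothetical reconstruction of 'merge': increment the count for key x,
-- or append (x, 1) if x has not been seen yet.
merge :: [(String, Int)] -> String -> [(String, Int)]
merge [] x = [(x, 1)]
merge ((a, b) : xs) x
  -- Force the new count before it is stored in the tuple.
  | a == x    = let b' = b + 1 in b' `seq` ((a, b') : xs)
  -- Force the recursive call so that it does not linger as a thunk.
  | otherwise = let rest = merge xs x in rest `seq` ((a, b) : rest)

-- foldl' forces each intermediate list to weak head normal form, which
-- in turn triggers the seq calls inside merge.
countWords :: [String] -> [(String, Int)]
countWords = foldl' merge []

main :: IO ()
main = print (countWords (words "a b a c b a"))
-- prints [("a",3),("b",2),("c",1)]
```

In this sketch foldl' alone would not be enough: it only evaluates the outermost list constructor, so the counts inside the tuples would still pile up as 1+1+1+... without the seq calls in merge.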