unneeded laziness case
Serge D. Mechveliani
mechvel at botik.ru
Mon Jun 11 14:48:16 EDT 2007
People,
A couple of times I suffered of unneeded strictness in Haskell.
And now unneeded laziness bites.
Consider the following program:
-------------------------------------------------------
main =
let mK = <formed in a certain complex way>
:: Map.Map Partition Integer
row = [0, 1, 0 ... 0]
(mK', p) = decompose mK row
infoK' = sum $ map sum $ mapmap (\ n -> rem n 2) $ Map.elems mK'
in
putStr (shows p
-- seq infoK'
"\n")
-------------------------------------------------------
In ghc-6.6, it needs at least 100 Mb of heap to perform.
If I set seq infoK' between `shows' and `p', then it suffices
14 Mb.
If I set there seq mK', then again it needs 100 Mb.
7 times as large.
mK is a triangular n by n matrix, n is about 900
(so, there are about 1 million of elements in mK).
mK has small integers in it.
p is found actually by solving the linear system X x mK = row.
Now: how to program real problems in Haskell ?
Should I try to set seq everywhere and see whether it helps ?
Also note: seq mK' was not sufficient. I do not know, probably, as
mK' has structure, seq does not evaluate all its levels.
Therefore the programmer needs to invent a function that gathers much
of information of mK' and puts it into one Integer
(infoK' = sum ... mK'). Then, seq infoK' forces most of evaluation
of mK'.
Is this a generic style to program real problems ?
To write all this strange code instead of plain
let (_, p) = decompose mK row in putStr (shows p "\n")
?
The story was as follows.
My friend asked me to decompose certain large symmetric polynomial
into elementary symmetric ones. This was needed for a real mathematical
problem. I applied my DoCon program, which is written in Haskell.
1000 Mb RAM was not enough. I started to investigate, because I
suspected that this case needs several times smaller memory. I found
a fragment of the type shown above. And I never use `seq'.
I have put a bit smaller example to make it 300 Mb sufficient.
The effect was discovered by occasion. I have forgotten that `decompose'
returns a pair with a large matrix as fst and wrote
let p = decompose mK row in putStr $ show p
It has computed in a relatively small memory, and fast, but before the
p value, it printed several pages of the matrix value.
I said "Oh! we need to improve this, we only need p, the second part":
let (_, p) = decompose mK row in putStr $ show p
And this takes 7 times more of memory and 2 times more of time!
Then, I understood, what is happening and verified the idea by
introducing seq.
And now, I wonder ...
Thank you in advance for your help and comments.
Regards,
-----------------
Serge Mechveliani
mechvel at botik.ru
More information about the Glasgow-haskell-users
mailing list