unneeded laziness case

Mon Jun 11 14:48:16 EDT 2007

People,

A couple of times I suffered of unneeded strictness in Haskell.
And now unneeded laziness bites.
Consider the following program:

-------------------------------------------------------
main = 
  let mK = <formed in a certain complex way> 
                                            :: Map.Map Partition Integer 
      row      = [0, 1, 0 ... 0] 
      (mK', p) = decompose mK row
      infoK'   = sum $ map sum $ mapmap (\ n -> rem n 2) $ Map.elems mK'
  in
  putStr (shows                p 
                -- seq infoK' 
          "\n")
-------------------------------------------------------

In  ghc-6.6,  it needs at least  100 Mb  of heap to perform.
If I set   seq infoK'   between `shows' and `p', then it suffices  
                                                                14 Mb.
If I set there   seq mK',  then again it needs  100 Mb.
7 times as large.

mK  is a triangular  n by n  matrix,  n  is about  900
    (so, there are about 1 million of elements in  mK). 
mK  has small integers in it.
p   is found actually by solving the linear system  X x mK = row.  

Now: how to program real problems in Haskell ?
Should I try to set  seq  everywhere and see whether it helps ?

Also note:  seq mK'  was not sufficient. I do not know, probably, as  
mK' has structure,  seq  does not evaluate all its levels.
Therefore the programmer needs to invent a function that gathers much
of information of mK' and puts it into one Integer
(infoK' = sum ... mK'). Then,  seq infoK'  forces most of evaluation 
of  mK'. 
Is this a generic style to program real problems ? 
To write all this strange code instead of plain 

   let (_, p) = decompose mK row  in  putStr (shows p "\n")
?

The story was as follows.
My friend asked me to decompose certain large symmetric polynomial
into elementary symmetric ones. This was needed for a real mathematical
problem. I applied my DoCon program, which is written in Haskell.
1000 Mb RAM was not enough. I started to investigate, because I 
suspected that this case needs several times smaller memory. I found
a fragment of the type shown above. And I never use `seq'. 
I have put a bit smaller example to make it 300 Mb sufficient.   
The effect was discovered by occasion. I have forgotten that `decompose'
returns a pair with a large matrix as fst and wrote

   let  p = decompose mK row  in  putStr $ show p

It has computed in a relatively small memory, and fast, but before the  
p  value, it printed several pages of the matrix value.    
I said "Oh! we need to improve this, we only need p, the second part":

  let  (_, p) = decompose mK row  in  putStr $ show p

And this takes  7 times  more of memory and 2 times more of time!
Then, I understood, what is happening and verified the idea by 
introducing  seq.

And now, I wonder ...

Thank you in advance for your help and comments. 

Regards,

-----------------
Serge Mechveliani
mechvel at botik.ru