[Haskell-beginners] Stack space overflow: using strict accumulator still fails

Daniel Fischer daniel.is.fischer at googlemail.com
Thu Oct 27 13:41:58 CEST 2011


On Thursday 27 October 2011, 12:45:11, Hugo Ferreira wrote:
> Hello,
> 
> Have a stack overflow but cannot see why (read up on [1],
> may be missing something trivial). Once again using the
> http://nlpwp.org/ book code. If I call the following function,
> it blows its top:
> 
> scoreRule :: TransformationRule -> Z.Zipper (Tag, Tag) -> Int
> scoreRule r z = nCorrect - nIncorrect
>      where (nCorrect, nIncorrect) = scoreRule_ r z
> 
> scoreRule_ :: TransformationRule -> Z.Zipper (Tag, Tag) -> (Int, Int)
> scoreRule_ r = Z.foldlz' (scoreElem r) (0, 0)
>      where scoreElem r s@(nCorrect, nIncorrect) z =
>                case ruleApplication r z of
>                  Just tag -> if tag == correct then
>                                  (nCorrect + 1, nIncorrect)
>                              else
>                                  (nCorrect, nIncorrect + 1)
>                  Nothing  -> s
>                where (correct, _) = Z.cursor z
> 
> however I see that the eager version of foldlz is being used.
> I also though that maybe ruleApplication my not be executing
> immediately. But I cannot see why (added definition below for
> reference).
> 
> Can anyone point out why this is not strict?

The additions (increments) are never forced before the final subtraction, 
so from scoreRule_ you will probably get a pair of thunks
(((...(0+1)+1...)+1), ((...(0+1)+1...)+1)),
since the forcing in the fold can only use seq, to force *the outermost 
constructor* of the intermediate results, in this case, the outermost 
constructor is the pair constructor - (,) - and the components are left 
unforced.

To force the increments without (big) delay, you can
- use a custom strict pair type instead of ordinary pairs

data P = P !Int !Int  -- {-# UNPACK #-} the fields for extra goodness

so that forcing the outermost constructor automatically forces the 
components.

- make the scoreElem function strict in the components of the accumulator 
s, with ghc {-# LANGUAGE BangPatterns #-},

scoreElem r !s@(!nCorrect, !nIncorrect) z = ...

that way you will never get bigger thungks than (n+1) in the components

- force the updated count as it is constructed,

   if tag == correct
      then let newCorrect = nCorrect+1
           in newCorrect `seq` (newCorrect, nIncorrect)
      else let newIncorrect = ...


The important thing to be aware of is that seq only forces the outermost 
level of a value. If the value is a structure with more levels, it doesn't 
prevent the building of huge thunks in the inner levels at all.
You then have to take care of that yourself, by using a datatype with the 
desired strictness or, in the case of folds and similar, providing a 
comination function with the desired strictness.

HTH,
Daniel



More information about the Beginners mailing list