[Haskell-beginners] Code help requested

Daniel Fischer daniel.is.fischer at web.de
Wed Jan 13 17:11:02 EST 2010

Am Mittwoch 13 Januar 2010 20:58:17 schrieb Tim Perry:
> A side benefit of using foldl' instead of foldr is that I can now run it
> against the entire dictionary!
> I found a good explanation of why here:
> http://www.haskell.org/haskellwiki/Foldr_Foldl_Foldl%27

Yep. One thing, though: seq, and hence foldl' only evaluates to weak head 
normal form (WHNF), that is (leaving aside lambda expressions), to the 
topmost constructor.
So, e.g. list `seq` value only checks whether list is [] or (_:_), that may 
not be enough strictness in a fold. The constructors of Map are strict 
enough to avoid large thunks in general with foldl', but for example to 
compute the average of a list of numbers:

average :: [Double] -> Double
average list = sumList / countList
      (sumList,countList) = foldl' add (0,0) list
      add (s,c) x = (s+x,c+1)

isn't strict enough, sumList and countList will be large thunks because in 
each step add (s,c) x will be evaluated enough only to see that it is 
indeed a pair.

For such cases, one needs to force the evaluation further by hand.
One possibility is to write a stricter function:

add' (s,c) x
   = let s1 = s+x
         c1 = c+1
     in s1 `seq` c1 `seq` (s1,c1)

or, using BangPatterns:

add' (!s,!c) x = (s+x,c+1)

Another possibility is to use a sufficiently strict data type

data DPair = DP !Double !Double

add (DP s c) x = DP (s+x) (c+1)

(the strict fields will keep the numbers completely evaluated at each 
step), a third is to use rnf {- reduce to normal form -} from 
Control.Parallel.Strategies or another sufficiently strict strategy, 
respectively deepseq.

Which one is the best choice varies of course.

More information about the Beginners mailing list