[Haskell-cafe] Fusing foldr's

Josef Svenningsson josef.svenningsson at gmail.com
Fri Oct 26 10:34:11 EDT 2007

Sorry for reacting so late on this mail. I'm digging through some old mails...

On 10/12/07, Dan Weston <westondan at imageworks.com> wrote:
> Always check optimizations to make sure they are not pessimizations!
> Actually, traversing the list twice is very cheap compared to space
> leakage, and accumulating pairs requires tuple boxing and unboxing which
> I don't know how to get GHC not to do.
I agree hole-heartedly that replacing multiple traversals with a
single traversal should be done with care as it more often than not
results in a pessimization. Indeed you showed just that with your
examples! But I'd thought it'd be interesting to see how it can
actually be an improvement if done carefully.

import Control.Arrow
import qualified Data.Strict.Tuple as T
import Data.Strict.Tuple (Pair(..))
import Control.Parallel

avg4 = uncurry (/) . (foldl' (+) 0 &&& foldl' (\x y -> x + 1) 0)
avgS = T.uncurry (/) . foldl' (\p n -> ((+n) *!* (+1)) p) (0 :!: 0)
avgP = uncurry (/) . (foldl' (+) 0 &!& foldl' (\x y -> x + 1) 0)

(*!*) f g (a :!: b) = f a :!: g b

(&!&) f g a = fa `par` (fa,ga)
  where fa = f a
        ga = g a

avg4 is the function which was best among Dan's benchmarks. avgS uses
strict tuples. I just threw in avgP for fun, it traverses the lists in
parallel. Note: I do have a dual core machine so it makes sense to try

I fed these functions to ghc with the -O2 and -threaded flags and
timed them using the list [1..10000000]. The result (best times out of
several runs):
avg4: 284 ms
avgS: 184 ms
avgP: 248 ms

It seems doing a single traversal can be faster if your write your
function carefully. Doing the traversal in parallel was beneficial but
not as good as the single traversal.



More information about the Haskell-Cafe mailing list