[Haskell-cafe] Re: different performance of equivalent expression

Sat Jan 13 07:40:25 EST 2007

> I've run into strange effect that I can not explain. I have simple
> expression that can be written by two equivalent ways. However one way
> give much performance gain over another. Here is an example:
> 
> -- apply function many times (tail-recursive)
> many n f x = if n == 0 then x else many (n-1) f $! (f x)
> 
> -- first adder function
> adder1 = let r = many 5000000 sin 1.0 in \x -> x + r
> 
> -- second adder function
> adder2 = \x -> x + many 5000000 sin 1.0
>
> If you run program it think some seconds performing math, and them
> prints 4 results immediately. But with adder2 function, it perform
> calculation in every call, which can be seen visually.
>
> It seems that compiler is able to "cache" in some way long computation
> in first case, but not in second.

This is indeed the case and entirely reasonable. The effect is called
"memoization", a key ingredient to lazy evaluation. To simplify the
explanation, let's take the examples

    adder1 = let r = many 50000000 (1+) 0 in \x -> x + r
    adder2 = \x -> let s = many 50000000 (1+) 0 in x + s

The evaluation of (adder1 3) proceeds as follows:

    adder1 3
  = (let r = many 50000000 (1+) 0 in \x -> x + r) 3
  = let r = many 50000000 (1+) 0 in 3 + r
  = let r = 50000000 in 3 + r
  = 50000003

Now, (r) will be updated with the result (50000000) after it has been
calculated and subsequent access to (r) will retrieve the updated value
as in

    adder1 4
  = (let r = 50000000 in \x -> x + r) 4
  = let r = 50000000 in 4 + r
  = 50000004

Every incarnation of (adder1) shares the same r. For (adder2), things
are different. Here, (s) will be updated as well, but different
incarnations of (adder2) do not share the same (s) because (s) is only
in scope after supplying some argument (x). Hence, every (adder2 3) and
(adder3 4) (re)calculates its own (s).

> I always thought that let a = b in x + a is just a syntactic sugar for x
> + b. Is it wrong?

This is correct but doesn't apply to the case above. The question here
is whether

    let a = b in \x -> x + a
  and
    \x -> let a = b in x + a

are equivalent. Considering the result, they are. But considering
running time and memory footprint, they are not. The first trades memory
for speed, the second trades speed for memory. In general, the compiler
is reluctant to switch between those two versions, i.e. it does not
perform much common subexpression elimination or "let floating" (see GHC
manual). The choice must be up to the programmer.

Regards,
apfelmus