[Haskell-cafe] Re: Difference in Runtime but no explanation

Tue Dec 15 16:45:29 EST 2009

On Tuesday 15 December 2009 21:43:46, Johann Höchtl wrote:
> Please describe for me as a beginner, why there _is_ a difference:
>
> 1. does len (x:xs) l = l `seq` len xs (l+1) vs. len xs $! (l+1) expand
> into sthg. different?

Yes. How different depends on optimisation level and compiler version.

Without optimisations, seq is inlined, giving one self-contained loop, while ($!) is not 
inlined, so in every iteration of the loop there's a call to ($!) - bad for performance.
ghc-6.10.3 and ghc-6.12.1 produce nearly identical core for each of the functions.
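
For reference, here are the two variants from the question written out in full. This is only a sketch: the names lenSeq and lenBang and the base cases for the empty list are my assumptions, since only the recursive equations were quoted.

```haskell
-- Variant 1: force the current accumulator with seq, then recurse.
-- (Name and base case are assumed; only the recursive equation was quoted.)
lenSeq :: [a] -> Integer -> Integer
lenSeq []     l = l
lenSeq (_:xs) l = l `seq` lenSeq xs (l + 1)

-- Variant 2: pass the incremented accumulator through strict application.
-- Without optimisation, each iteration goes through a call to ($!).
lenBang :: [a] -> Integer -> Integer
lenBang []     l = l
lenBang (_:xs) l = lenBang xs $! (l + 1)
```

Loaded into GHCi, both lenSeq "abc" 0 and lenBang "abc" 0 evaluate to 3; the two differ only in how the accumulator gets forced.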

With optimisations (-O2), both functions get compiled into a self-contained loop. 6.12.1 
produces near-identical core for the two functions: the core for the second contains one 
case-expression that the core for the first lacks, but they should produce the same assembly.
One improvement versus the unoptimised core is that plusInteger is called instead of 
GHC.Num.+.

6.10.3 produces very different core with -O2. The core for the second variant is close to 
that which 6.12.1 produces; I don't know enough about core to see how that would influence 
performance. For the first variant, 6.10.3 produces different core, special-casing 
small and large Integers, which proves to be more efficient. Again, I'm not enough of a 
specialist to know why a) it produces such different core and b) that core is so much faster.

> 2. Do I understand right, that the first expression "should" actually
> be slower but (for what reason ever in an unoptimized case isn't?

No. In principle, with len (x:xs) l = l `seq` len xs (l+1), the evaluation of l lags one 
step behind, so you'd have the reduction
len [1,2,3] 0
~> len [2,3] (0+1)
~> len [3] (1+1)
~> len [] (2+1)
~> (2+1)
~> 3
while len (x:xs) l = let l' = l+1 in l' `seq` len xs l' gives
len [1,2,3] 0
~> len [2,3] 1
~> len [3] 2
~> len [] 3
~> 3
but that difference isn't measurable (if they produce different machine instructions at 
all, the difference is at most a handful of clock cycles).
*If* len xs $! (l+1) were expanded into the latter, both would be - for all practical 
purposes at least - equally fast.
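
To make that "if" concrete, here is a sketch of what the expansion would look like if ($!) is unfolded by hand using its definition from the Haskell 98 Report Prelude (f $! x = x `seq` f x). The names strictApply, lenBang and lenLet and the base cases are made up for illustration, and this says nothing about what any particular GHC version actually generates.

```haskell
-- The Report's definition of ($!), under a hypothetical name so that it
-- does not clash with the real Prelude operator.
strictApply :: (a -> b) -> a -> b
strictApply f x = x `seq` f x

-- len xs $! (l+1), written with the real operator.
lenBang :: [a] -> Integer -> Integer
lenBang []     l = l
lenBang (_:xs) l = lenBang xs $! (l + 1)

-- The same function with ($!) unfolded by hand: the argument is bound to
-- l', forced, and then passed on, so the recursive call receives a value
-- rather than an unevaluated (l+1).
lenLet :: [a] -> Integer -> Integer
lenLet []     l = l
lenLet (_:xs) l = let l' = l + 1 in l' `seq` lenLet xs l'
```

Both lenBang "abc" 0 and lenLet "abc" 0 give 3; the point is only that the second is what the first turns into once the call to ($!) has been inlined.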

> 3. The function is anotated with Integer. Why is suddenly Int of
> importance?

Thomas DuBuisson tried Int to investigate. It's always interesting to see what changes when 
you change types.

> (4. When optimizing is switched on, the second expession executes
> faster; as such I assume, that there is a difference between these two
> statements)

Not here. With 6.12.1 and -O or -O2, both are equally fast; with 6.10.3, the first is faster. 
I would rather expect 6.10.4 to behave more like 6.10.3. It may be, of course, that which 
code is faster depends on the hardware/OS.
Can you run

ghc-6.10.4 -O2 -fforce-recomp -ddump-simpl --make Whatever.hs > Whatever.core

so I can see what core that produces?

>
> Thank you!