Bas van Dijk v.dijk.bas at gmail.com
Sat Sep 11 08:46:48 EDT 2010

```On Sat, Sep 11, 2010 at 5:13 AM, michael rice <nowgate at yahoo.com> wrote:
>
> Which of these would be more costly for a long list?
>
> f :: [Int] -> [Int]
> f [x] = [x]
> f (x:xs) = x + (head xs) : f xs
>
> f :: [Int] -> [Int]
> f [x] = [x]
> f (x:y:xs) = x + y : f (y:xs)

Use Criterion[1] to find out:

module Main where

import Criterion.Main

f1, f2, f3, f4 :: [Int] -> [Int]

f1 [x] = [x]
f1 (x:xs) = x + head xs : f1 xs

f2 [x] = [x]
f2 (x:y:xs) = x + y : f2 (y:xs)

f3 [x] = [x]
f3 (x:xs@(y:_)) = x + y : f3 xs

f4 [x] = [x]
f4 (x:y:xs) = x + y : go y xs
where
go x []     = [x]
go x (y:xs) = x + y : go y xs

benchMark s f = bench s \$ whnf (\n -> sum \$ f [1..n]) 1000000

main = defaultMain [ benchMark "f1" f1
, benchMark "f2" f2
, benchMark "f3" f3
, benchMark "f4" f4
]

now compile and run it:

\$ ghc --make Benchmark.hs -O2 -o benchmark
...
\$ ./benchmark
warming up
estimating clock resolution...
mean is 24.29944 us (40001 iterations)
found 1405 outliers among 39999 samples (3.5%)
721 (1.8%) high mild
684 (1.7%) high severe
estimating cost of a clock call...
mean is 1.844233 us (45 iterations)
found 2 outliers among 45 samples (4.4%)
2 (4.4%) high severe

benchmarking f1
collecting 100 samples, 1 iterations each, in estimated 7.917595 s
bootstrapping with 100000 resamples
mean: 80.04258 ms, lb 79.85129 ms, ub 80.24094 ms, ci 0.950
std dev: 1.000711 ms, lb 878.8460 us, ub 1.179558 ms, ci 0.950
found 1 outliers among 100 samples (1.0%)
variance introduced by outliers: 0.990%
variance is unaffected by outliers

benchmarking f2
collecting 100 samples, 1 iterations each, in estimated 8.171391 s
bootstrapping with 100000 resamples
mean: 83.13315 ms, lb 82.93615 ms, ub 83.33348 ms, ci 0.950
std dev: 1.017999 ms, lb 904.5153 us, ub 1.174008 ms, ci 0.950
variance introduced by outliers: 0.990%
variance is unaffected by outliers

benchmarking f3
collecting 100 samples, 1 iterations each, in estimated 8.297014 s
bootstrapping with 100000 resamples
mean: 82.66586 ms, lb 82.34780 ms, ub 83.39774 ms, ci 0.950
std dev: 2.339937 ms, lb 976.2940 us, ub 4.133495 ms, ci 0.950
found 9 outliers among 100 samples (9.0%)
7 (7.0%) high mild
2 (2.0%) high severe
variance introduced by outliers: 0.998%
variance is unaffected by outliers

benchmarking f4
collecting 100 samples, 1 iterations each, in estimated 8.080888 s
bootstrapping with 100000 resamples
mean: 80.80089 ms, lb 80.61719 ms, ub 80.99542 ms, ci 0.950
std dev: 968.1706 us, lb 872.7758 us, ub 1.097217 ms, ci 0.950
variance introduced by outliers: 0.990%
variance is unaffected by outliers

So to summarize from fastest to slowest:

f1: mean: 80.04258 ms
f4: mean: 80.80089 ms
f3: mean: 82.66586 ms
f2: mean: 83.13315 ms

To find out why f1 is the fastest you can look at the core using ghc-core[2]:

\$ ghc-core -- -O2 Benchmark.hs

f1 :: [Int] -> [Int]
GblId

f1 =
\ (ds_d1h0 :: [Int]) ->
case ds_d1h0 of _ {
[] -> f11;
: x_a12V ds1_d1h1 ->
case ds1_d1h1 of _ {
[] ->
: @ Int x_a12V ([] @ Int);
: ipv_s1hv ipv1_s1hw ->
:
@ Int
(case x_a12V of _ { I# x1_a1k6 ->
case ipv_s1hv of _ { I# y_a1ka ->
I# (+# x1_a1k6 y_a1ka)
}
})
(f1_\$sf1 ipv1_s1hw ipv_s1hv)
}
}

f1_\$sf1 :: [Int] -> Int -> [Int]
GblId

f1_\$sf1 =
\ (sc_s1FL :: [Int]) (sc1_s1FM :: Int) ->
case sc_s1FL of _ {
[] ->
:
@ Int sc1_s1FM ([] @ Int);
: ipv_s1hv ipv1_s1hw ->
:
@ Int
(case sc1_s1FM of _ { I# x_a1k6 ->
case ipv_s1hv of _ { I# y_a1ka ->
I# (+# x_a1k6 y_a1ka)
}
})
(f1_\$sf1 ipv1_s1hw ipv_s1hv)
}

f4 :: [Int] -> [Int]
GblId

f4 =
\ (ds_d1gl :: [Int]) ->
case ds_d1gl of _ {
[] -> f41;
: x_a13f ds1_d1gm ->
case ds1_d1gm of _ {
[] ->
: @ Int x_a13f ([] @ Int);
: y_a13h xs_a13i ->
:
@ Int
(plusInt x_a13f y_a13h)
(f4_go y_a13h xs_a13i)
}
}

f4_go =
\ (x_a13k :: Int) (ds_d1gu :: [Int]) ->
case ds_d1gu of _ {
[] ->
: @ Int x_a13k ([] @ Int);
: y_a13m xs_a13n ->
:
@ Int
(plusInt x_a13k y_a13m)
(f4_go y_a13m xs_a13n)
}

f3 :: [Int] -> [Int]
GblId

f3 =
\ (ds_d1gD :: [Int]) ->
case ds_d1gD of _ {
[] -> f31;
: x_a13b ds1_d1gE ->
case ds1_d1gE of _ {
[] ->
: @ Int x_a13b ([] @ Int);
: y_a13e ds2_d1gF ->
:
@ Int
(plusInt x_a13b y_a13e)
(f3_\$sf3 ds2_d1gF y_a13e)
}
}

f3_\$sf3 :: [Int] -> Int -> [Int]
GblId

f3_\$sf3 =
\ (sc_s1G3 :: [Int]) (sc1_s1G4 :: Int) ->
case sc_s1G3 of _ {
[] ->
:
@ Int sc1_s1G4 ([] @ Int);
: y_a13e ds_d1gF ->
:
@ Int
(plusInt sc1_s1G4 y_a13e)
(f3_\$sf3 ds_d1gF y_a13e)
}

f2 :: [Int] -> [Int]
GblId

f2 =
\ (ds_d1gP :: [Int]) ->
case ds_d1gP of _ {
[] -> f21;
: x_a137 ds1_d1gQ ->
case ds1_d1gQ of _ {
[] ->
: @ Int x_a137 ([] @ Int);
: y_a139 xs_a13a ->
:
@ Int
(plusInt x_a137 y_a139)
(f2_\$sf2 xs_a13a y_a139)
}
}

f2_\$sf2 :: [Int] -> Int -> [Int]
GblId

f2_\$sf2 =
\ (sc_s1FU :: [Int]) (sc1_s1FV :: Int) ->
case sc_s1FU of _ {
[] ->
:
@ Int sc1_s1FV ([] @ Int);
: y_a139 xs_a13a ->
:
@ Int
(plusInt sc1_s1FV y_a139)
(f2_\$sf2 xs_a13a y_a139)
}

The reason that f1 is faster than the rest is that GHC is somehow able
to unpack the Ints and use the more efficient +# instead of the slower
plusInt.

I don't immediately see the reason for the time difference between f4,
f3 and f2. The inner loops all seem equivalent.

Regards,

Bas