[Haskell-cafe] Cost: (:) vs head
Bas van Dijk
v.dijk.bas at gmail.com
Sat Sep 11 08:46:48 EDT 2010
On Sat, Sep 11, 2010 at 5:13 AM, michael rice <nowgate at yahoo.com> wrote:
>
> Which of these would be more costly for a long list?
>
> f :: [Int] -> [Int]
> f [x] = [x]
> f (x:xs) = x + (head xs) : f xs
>
> f :: [Int] -> [Int]
> f [x] = [x]
> f (x:y:xs) = x + y : f (y:xs)
Use Criterion[1] to find out:
module Main where
import Criterion.Main
f1, f2, f3, f4 :: [Int] -> [Int]
f1 [x] = [x]
f1 (x:xs) = x + head xs : f1 xs
f2 [x] = [x]
f2 (x:y:xs) = x + y : f2 (y:xs)
f3 [x] = [x]
f3 (x:xs@(y:_)) = x + y : f3 xs
f4 [x] = [x]
f4 (x:y:xs) = x + y : go y xs
where
go x [] = [x]
go x (y:xs) = x + y : go y xs
benchMark s f = bench s $ whnf (\n -> sum $ f [1..n]) 1000000
main = defaultMain [ benchMark "f1" f1
, benchMark "f2" f2
, benchMark "f3" f3
, benchMark "f4" f4
]
now compile and run it:
$ ghc --make Benchmark.hs -O2 -o benchmark
...
$ ./benchmark
warming up
estimating clock resolution...
mean is 24.29944 us (40001 iterations)
found 1405 outliers among 39999 samples (3.5%)
721 (1.8%) high mild
684 (1.7%) high severe
estimating cost of a clock call...
mean is 1.844233 us (45 iterations)
found 2 outliers among 45 samples (4.4%)
2 (4.4%) high severe
benchmarking f1
collecting 100 samples, 1 iterations each, in estimated 7.917595 s
bootstrapping with 100000 resamples
mean: 80.04258 ms, lb 79.85129 ms, ub 80.24094 ms, ci 0.950
std dev: 1.000711 ms, lb 878.8460 us, ub 1.179558 ms, ci 0.950
found 1 outliers among 100 samples (1.0%)
variance introduced by outliers: 0.990%
variance is unaffected by outliers
benchmarking f2
collecting 100 samples, 1 iterations each, in estimated 8.171391 s
bootstrapping with 100000 resamples
mean: 83.13315 ms, lb 82.93615 ms, ub 83.33348 ms, ci 0.950
std dev: 1.017999 ms, lb 904.5153 us, ub 1.174008 ms, ci 0.950
variance introduced by outliers: 0.990%
variance is unaffected by outliers
benchmarking f3
collecting 100 samples, 1 iterations each, in estimated 8.297014 s
bootstrapping with 100000 resamples
mean: 82.66586 ms, lb 82.34780 ms, ub 83.39774 ms, ci 0.950
std dev: 2.339937 ms, lb 976.2940 us, ub 4.133495 ms, ci 0.950
found 9 outliers among 100 samples (9.0%)
7 (7.0%) high mild
2 (2.0%) high severe
variance introduced by outliers: 0.998%
variance is unaffected by outliers
benchmarking f4
collecting 100 samples, 1 iterations each, in estimated 8.080888 s
bootstrapping with 100000 resamples
mean: 80.80089 ms, lb 80.61719 ms, ub 80.99542 ms, ci 0.950
std dev: 968.1706 us, lb 872.7758 us, ub 1.097217 ms, ci 0.950
variance introduced by outliers: 0.990%
variance is unaffected by outliers
So to summarize from fastest to slowest:
f1: mean: 80.04258 ms
f4: mean: 80.80089 ms
f3: mean: 82.66586 ms
f2: mean: 83.13315 ms
To find out why f1 is the fastest you can look at the core using ghc-core[2]:
$ ghc-core -- -O2 Benchmark.hs
f1 :: [Int] -> [Int]
GblId
f1 =
\ (ds_d1h0 :: [Int]) ->
case ds_d1h0 of _ {
[] -> f11;
: x_a12V ds1_d1h1 ->
case ds1_d1h1 of _ {
[] ->
: @ Int x_a12V ([] @ Int);
: ipv_s1hv ipv1_s1hw ->
:
@ Int
(case x_a12V of _ { I# x1_a1k6 ->
case ipv_s1hv of _ { I# y_a1ka ->
I# (+# x1_a1k6 y_a1ka)
}
})
(f1_$sf1 ipv1_s1hw ipv_s1hv)
}
}
f1_$sf1 :: [Int] -> Int -> [Int]
GblId
f1_$sf1 =
\ (sc_s1FL :: [Int]) (sc1_s1FM :: Int) ->
case sc_s1FL of _ {
[] ->
:
@ Int sc1_s1FM ([] @ Int);
: ipv_s1hv ipv1_s1hw ->
:
@ Int
(case sc1_s1FM of _ { I# x_a1k6 ->
case ipv_s1hv of _ { I# y_a1ka ->
I# (+# x_a1k6 y_a1ka)
}
})
(f1_$sf1 ipv1_s1hw ipv_s1hv)
}
f4 :: [Int] -> [Int]
GblId
f4 =
\ (ds_d1gl :: [Int]) ->
case ds_d1gl of _ {
[] -> f41;
: x_a13f ds1_d1gm ->
case ds1_d1gm of _ {
[] ->
: @ Int x_a13f ([] @ Int);
: y_a13h xs_a13i ->
:
@ Int
(plusInt x_a13f y_a13h)
(f4_go y_a13h xs_a13i)
}
}
f4_go =
\ (x_a13k :: Int) (ds_d1gu :: [Int]) ->
case ds_d1gu of _ {
[] ->
: @ Int x_a13k ([] @ Int);
: y_a13m xs_a13n ->
:
@ Int
(plusInt x_a13k y_a13m)
(f4_go y_a13m xs_a13n)
}
f3 :: [Int] -> [Int]
GblId
f3 =
\ (ds_d1gD :: [Int]) ->
case ds_d1gD of _ {
[] -> f31;
: x_a13b ds1_d1gE ->
case ds1_d1gE of _ {
[] ->
: @ Int x_a13b ([] @ Int);
: y_a13e ds2_d1gF ->
:
@ Int
(plusInt x_a13b y_a13e)
(f3_$sf3 ds2_d1gF y_a13e)
}
}
f3_$sf3 :: [Int] -> Int -> [Int]
GblId
f3_$sf3 =
\ (sc_s1G3 :: [Int]) (sc1_s1G4 :: Int) ->
case sc_s1G3 of _ {
[] ->
:
@ Int sc1_s1G4 ([] @ Int);
: y_a13e ds_d1gF ->
:
@ Int
(plusInt sc1_s1G4 y_a13e)
(f3_$sf3 ds_d1gF y_a13e)
}
f2 :: [Int] -> [Int]
GblId
f2 =
\ (ds_d1gP :: [Int]) ->
case ds_d1gP of _ {
[] -> f21;
: x_a137 ds1_d1gQ ->
case ds1_d1gQ of _ {
[] ->
: @ Int x_a137 ([] @ Int);
: y_a139 xs_a13a ->
:
@ Int
(plusInt x_a137 y_a139)
(f2_$sf2 xs_a13a y_a139)
}
}
f2_$sf2 :: [Int] -> Int -> [Int]
GblId
f2_$sf2 =
\ (sc_s1FU :: [Int]) (sc1_s1FV :: Int) ->
case sc_s1FU of _ {
[] ->
:
@ Int sc1_s1FV ([] @ Int);
: y_a139 xs_a13a ->
:
@ Int
(plusInt sc1_s1FV y_a139)
(f2_$sf2 xs_a13a y_a139)
}
The reason that f1 is faster than the rest is that GHC is somehow able
to unpack the Ints and use the more efficient +# instead of the slower
plusInt.
I don't immediately see the reason for the time difference between f4,
f3 and f2. The inner loops all seem equivalent.
Regards,
Bas
[1] http://hackage.haskell.org/package/criterion
[2] http://hackage.haskell.org/package/ghc-core
More information about the Haskell-Cafe
mailing list