[Haskell-cafe] Another optimization question

Sat May 17 18:00:48 EDT 2008

Am Samstag, 17. Mai 2008 22:48 schrieb anton muhin:
>
> Why not -O3?

As far as I know - and Brandon S.Allbery said so, too - -O3 isn't better than 
-O2.
>
> > Using a real list of primes,
>
> What's the size of the real list?

arbitrary, however, since it's [Int], it will actually be at most 105097565 
primes long, but since only numbers up to 5000 are checked, only 670 primes 
will ever be considered - When I check numbers 1 to 50000 (5133 primes, so 
5134 primes evaluated), with -O0 / -O2, it's
Jeroen 1 : 14.5 s / 2.4 s
Jeroen 2 : 52.5 s / 49.7 s
(== x) . head . dropWhile (< x) : 8.2 s /4.1 s
go primes : 5.5 s / 2.5 s

So Jeroen 1 is the best to be optimised :)

> >> but another
> >> implementation of isPrime:
> >>
> >> isPrime x = (== x) $ head $ dropWhile (< x) primes
> >
> > With -O2, this is about 20% slower than the Jeroen's first version,
> > without optimisations 50% faster.
> > Strange.
>
> Well, head has its overhead as well.  Cf. two variants:
>
> firstNotLess :: Int -> [Int] -> Int
> firstNotLess s (x:xs) = if x < s then firstNotLess s xs else x
>
> dropLess :: Int -> [Int] -> [Int]
> dropLess s l@(x:xs) = if x < s then dropLess s xs else l
>
> isPrime :: Int -> Bool
> isPrime x = x == (firstNotLess x primes)
>
> isPrime' :: Int -> Bool
> isPrime' x = x == (head $ dropLess x primes)
>
> On my box firstNotLess gives numbers pretty close (if not better) than

for primes up to 50000, that's
6.8 s / 2.1 s with -O0 / -O2 respectively on mine

> Jeroen's first variant, while head $ dropLess notably worse.

5.8 s / 2.4 s here.
>
> > isPrime :: Int -> Bool
> > isPrime x = go primes
> >      where
> >        go (p:ps) = case compare x p of
> >                        LT -> False
> >                        EQ -> True
> >                        GT -> go ps
> >
> > does best (on my box), with and without optimisations (very very slightly
> > with -O2) for a list of real primes, but not for [1 .. ].
>
> And what happens for [1..]?

With -O2, Jeroen 1 was best (1.62), nex firstNotLess (1.63), head . dropLess 
(1.74), then in close succesion Jeroen 2 (1.92), go primes (1.96) and head . 
dropWhile (1.99),
with -O0, it's
head . dropWhile (1.7 s, YES, that actually is faster with -O0 than with 
-O2!), head . dropLess (2.0), Jeroen 2 and firstNotLess (2.1 s), go primes 
(2.3 s), Jeroen 1 (3.2 s).

Weirder and weirder.

>
> > However, more than can be squished out of fiddling with these versions
> > could be gained from a better algorithm.
>
> Definitely.
>
> yours,
> anton.