[Haskell-cafe] Another optimization question

Sun May 18 12:03:13 EDT 2008

Am Sonntag, 18. Mai 2008 14:50 schrieb anton muhin:
> On Sun, May 18, 2008 at 2:00 AM, Daniel Fischer
>
> <daniel.is.fischer at web.de> wrote:
> > Am Samstag, 17. Mai 2008 22:48 schrieb anton muhin:
> >> Why not -O3?
> >
> > As far as I know - and Brandon S.Allbery said so, too - -O3 isn't better
> > than -O2.
>
> I didn't know that, sorry.

No need to apologise. It was a perfectly reasonable question. Fortunately, 
Brandon just a few minutes before confirmed that my recollection was probably 
correct, otherwise I would've added "must check that, though".

>
> >> > Using a real list of primes,
> >>
> >> What's the size of the real list?
> >
> > arbitrary, however, since it's [Int], it will actually be at most
> > 105097565 primes long, but since only numbers up to 5000 are checked,
> > only 670 primes will ever be considered - When I check numbers 1 to 50000
> > (5133 primes, so 5134 primes evaluated),
>
> So it's just a relatively sparsed sorted list of 5134 numbers and the
> greatest of them still fits into Int, correct?

Technically, I think, it's a partially evaluated list with a thunk at the end 
that says how to get more when needed. But at the end, it's basically a list 
of 5134 Ints.

>
> > with -O0 / -O2, it's
> > Jeroen 1 : 14.5 s / 2.4 s
> > Jeroen 2 : 52.5 s / 49.7 s
>
> probably Jeroen's hypothesis about temporary list built might explain
> that slowdown, what do you think?

Probably, but I'm a bit surprised how much that building of lists costs.

>
> > (== x) . head . dropWhile (< x) : 8.2 s /4.1 s
> > go primes : 5.5 s / 2.5 s
> >
> > So Jeroen 1 is the best to be optimised :)
> >
> :)
> :
> >> >> but another
> >> >> implementation of isPrime:
> >> >>
> >> >> isPrime x = (== x) $ head $ dropWhile (< x) primes
> >> >
> >> > With -O2, this is about 20% slower than the Jeroen's first version,
> >> > without optimisations 50% faster.
> >> > Strange.
> >>
> >> Well, head has its overhead as well.  Cf. two variants:
> >>
> >> firstNotLess :: Int -> [Int] -> Int
> >> firstNotLess s (x:xs) = if x < s then firstNotLess s xs else x
> >>
> >> dropLess :: Int -> [Int] -> [Int]
> >> dropLess s l@(x:xs) = if x < s then dropLess s xs else l
> >>
> >> isPrime :: Int -> Bool
> >> isPrime x = x == (firstNotLess x primes)
> >>
> >> isPrime' :: Int -> Bool
> >> isPrime' x = x == (head $ dropLess x primes)
> >>
> >> On my box firstNotLess gives numbers pretty close (if not better) than
> >
> > for primes up to 50000, that's
> > 6.8 s / 2.1 s with -O0 / -O2 respectively on mine
>
> So it seems to be reproducible.

Yes, though with -O0 there's a big difference.
>
> >> Jeroen's first variant, while head $ dropLess notably worse.
> >
> > 5.8 s / 2.4 s here.

So that doesn't perform notably worse than Jeroen's first variant on my box

> >
> >> > isPrime :: Int -> Bool
> >> > isPrime x = go primes
> >> >      where
> >> >        go (p:ps) = case compare x p of
> >> >                        LT -> False
> >> >                        EQ -> True
> >> >                        GT -> go ps
> >> >
> >> > does best (on my box), with and without optimisations (very very
> >> > slightly with -O2) for a list of real primes, but not for [1 .. ].
> >>
> >> And what happens for [1..]?
> >
> > With -O2, Jeroen 1 was best (1.62), nex firstNotLess (1.63), head .
> > dropLesst (1.74), then in close succesion Jeroen 2 (1.92), go primes
> > (1.96) and head . dropWhile (1.99),
>
> go primes ran 1.96?  Indeed weird, does anybody know if it's due to
> pattern matching?

I also tried a version with
go (p:ps)
	| p < x     = go ps
	| otherwise = p == x
that did worse (not much).

>
> > with -O0, it's
> > head . dropWhile (1.7 s, YES, that actually is faster with -O0 than with
> > -O2!), head . dropLess (2.0), Jeroen 2 and firstNotLess (2.1 s), go
> > primes (2.3 s), Jeroen 1 (3.2 s).
> >
> > Weirder and weirder.
>
> agree
>
> yours,
> anton

Cheers,
Daniel