[Haskell-cafe] Re: Another optimization question

anton muhin antonmuhin at gmail.com
Sun May 18 08:54:12 EDT 2008


Any chances you're using Integer instead of Int?  On my box switch to
Integer is quite expensive (as one could expect).

yours,
anton.

On Sun, May 18, 2008 at 9:45 AM, Jeroen <yrn001 at gmail.com> wrote:
> Daniel Fischer <daniel.is.fischer <at> web.de> writes:
>>
>> Am Samstag, 17. Mai 2008 22:48 schrieb anton muhin:
>> >
>> > Why not -O3?
>>
>> As far as I know - and Brandon S.Allbery said so, too - -O3 isn't better than
>> -O2.
>> >
>> > > Using a real list of primes,
>> >
>> > What's the size of the real list?
>>
>> arbitrary, however, since it's [Int], it will actually be at most 105097565
>> primes long, but since only numbers up to 5000 are checked, only 670 primes
>> will ever be considered - When I check numbers 1 to 50000 (5133 primes, so
>> 5134 primes evaluated), with -O0 / -O2, it's
>> Jeroen 1 : 14.5 s / 2.4 s
>> Jeroen 2 : 52.5 s / 49.7 s
>> (== x) . head . dropWhile (< x) : 8.2 s /4.1 s
>> go primes : 5.5 s / 2.5 s
>>
>> So Jeroen 1 is the best to be optimised :)
>>
>> > >> but another
>> > >> implementation of isPrime:
>> > >>
>> > >> isPrime x = (== x) $ head $ dropWhile (< x) primes
>> > >
>> > > With -O2, this is about 20% slower than the Jeroen's first version,
>> > > without optimisations 50% faster.
>> > > Strange.
>> >
>> > Well, head has its overhead as well.  Cf. two variants:
>> >
>> > firstNotLess :: Int -> [Int] -> Int
>> > firstNotLess s (x:xs) = if x < s then firstNotLess s xs else x
>> >
>> > dropLess :: Int -> [Int] -> [Int]
>> > dropLess s l@(x:xs) = if x < s then dropLess s xs else l
>> >
>> > isPrime :: Int -> Bool
>> > isPrime x = x == (firstNotLess x primes)
>> >
>> > isPrime' :: Int -> Bool
>> > isPrime' x = x == (head $ dropLess x primes)
>> >
>> > On my box firstNotLess gives numbers pretty close (if not better) than
>>
>> for primes up to 50000, that's
>> 6.8 s / 2.1 s with -O0 / -O2 respectively on mine
>>
>> > Jeroen's first variant, while head $ dropLess notably worse.
>>
>> 5.8 s / 2.4 s here.
>> >
>> > > isPrime :: Int -> Bool
>> > > isPrime x = go primes
>> > >      where
>> > >        go (p:ps) = case compare x p of
>> > >                        LT -> False
>> > >                        EQ -> True
>> > >                        GT -> go ps
>> > >
>> > > does best (on my box), with and without optimisations (very very slightly
>> > > with -O2) for a list of real primes, but not for [1 .. ].
>> >
>> > And what happens for [1..]?
>>
>> With -O2, Jeroen 1 was best (1.62), nex firstNotLess (1.63), head . dropLess
>> (1.74), then in close succesion Jeroen 2 (1.92), go primes (1.96) and head .
>> dropWhile (1.99),
>> with -O0, it's
>> head . dropWhile (1.7 s, YES, that actually is faster with -O0 than with
>> -O2!), head . dropLess (2.0), Jeroen 2 and firstNotLess (2.1 s), go primes
>> (2.3 s), Jeroen 1 (3.2 s).
>>
>> Weirder and weirder.
>>
>> >
>> > > However, more than can be squished out of fiddling with these versions
>> > > could be gained from a better algorithm.
>> >
>> > Definitely.
>> >
>> > yours,
>> > anton.
>>
>
> Thanks for the responses so far!
>
> I only tested with -O2 and my primes implementation
> is the Sieve of Eratosthenes and has signature
>
> primes :: Integral a => [a]
>
> What's also quite strange is that experiment2 is about
> 10 times time slower than experiment1 when
> using primes (with the Eratosthenes formula) instead of [1..].
>
> I redid the experiments with -prof and the output was quite revealing:
>
> experiment1 (fastest):
> total time  =        2.64 secs   (132 ticks @ 20 ms)
> total alloc =     323,356 bytes  (excludes profiling overheads)
>
>                        individual    inherited
> COST CENTRE   entries  %time %alloc   %time %alloc
>
> MAIN                 0   0.0    0.0   100.0  100.0
>  main                1   0.0    0.5     0.0    0.5
>  CAF                 4   0.0    0.0   100.0   99.0
>  primes             1   9.8   61.9     9.8   61.9
>  main               0   0.0   37.1    90.2   37.1
>   isPrime        5000  90.2    0.0    90.2    0.0
>  CAF                 4   0.0    0.4     0.0    0.4
>
>
> experiment2 (slowest):
> total time  =        6.12 secs   (306 ticks @ 20 ms)
> total alloc = 350,473,356 bytes  (excludes profiling overheads)
>
>                      individual    inherited
> COST CENTRE entries  %time %alloc   %time %alloc
>
> MAIN              0   0.0    0.0   100.0  100.0
>  main             1   0.0    0.0     0.0    0.0
>  CAF              4   0.0    0.0   100.0  100.0
>  primes          1   0.0    0.1     0.0    0.1
>  main            0   0.0    0.0   100.0   99.9
>   isPrime     5000 100.0   99.9   100.0   99.9
>  CAF              4   0.0    0.0     0.0    0.0
>
>
> Would this be only because isPrime of experiment 2 builds
> this temporary list (takeWhile) all the time?
>
> Jeroen Baekelandt
>
>
>
>
>
>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>


More information about the Haskell-Cafe mailing list