Compiler optimizations questions for ghc 6.10...
Krasimir Angelov
kr.angelov at gmail.com
Sat Feb 21 04:23:39 EST 2009
How mod is affected by the change in quot? Currently mod is defined as:
a `mod` b
| b == 0 = divZeroError
| a == minBound && b == (-1) = overflowError
| otherwise = a `modInt` b
and modInt is defined via remInt# which is primitive. Did you change
the definition of mod as well?
I agree that this change in the definition of quot is just a
workaround. The best solution would be to do something clever in the
compiler itself. For now if someone wants fast division then he/she
should use quotInt.
On Sat, Feb 21, 2009 at 9:52 AM, Bertram Felgenhauer
<bertram.felgenhauer at googlemail.com> wrote:
> Krasimir Angelov wrote:
>> Well I actually did, almost. I added this function:
>>
>> quotX :: Int -> Int -> Int
>> a `quotX` b
>> | b == 0 = error "divZeroError"
>> | b == (-1) && a == minBound = error "overflowError"
>> | otherwise = a `quotInt` b
>>
>> It does the right thing. However to be sure that this doesn't
>> interfere with some other GHC magic the real quot function have to be
>> changed and tested. I haven't build GHC from source for 2-3 years now
>> and I don't have the time to do it just to test whether this works.
>
> It works, and it has the desired effect. It's not always a win though;
> the nofib prime sieve benchmark suffers.
>
> For a patch, see
> http://int-e.home.tlink.de/haskell/ghc-libraries-base-tune-division.patch
>
> Nofib results extract:
>
> ------------------------------------------------------------------------
> Program Size Allocs Runtime Elapsed
> ------------------------------------------------------------------------
> fish -0.7% -5.3% 0.05 0.04
> primes -0.0% +28.5% +25.6% +25.5%
> wheel-sieve2 -0.0% -0.3% -17.9% -18.6%
> ------------------------------------------------------------------------
> Min -0.9% -5.3% -17.9% -18.6%
> Max +0.1% +28.5% +25.6% +25.5%
> Geometric Mean -0.2% +0.2% -0.0% +0.2%
>
> 'primes' is an outlier - the other slowdowns are below 3%
>
> What happens in 'primes' is that 'mod' no longer gets inlined;
> apparently it now looks bigger to the compiler than before.
>
> Using -funfolding-use-threshold=10 brings the benchmark back to its
> original speed, despite the extra comparison before doing the
> division.
>
> In 'wheel-sieve2', the first different optimization choice I see is
> again a 'mod' that is no longer inlined; this leads to a change in other
> inlining choices that result in a speedup for reasons that I have not
> investigated.
>
> Bertram
> _______________________________________________
> Glasgow-haskell-users mailing list
> Glasgow-haskell-users at haskell.org
> http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
>
More information about the Glasgow-haskell-users
mailing list