[Haskell-cafe] GHC optimisations

Rodrigo Queiro overdrigzed at gmail.com
Tue Aug 21 08:14:20 EDT 2007


On my system, the C version runs about 9x faster than the haskell
version (with -O3 and -O2 -fvia-c -optc-O3 respectively). However, GCC
seems to produce about 70 lines of assembly for the main loop,
compared to about 10 from GHC. I suspect the speed difference is the
result of some heavy optimisation by GCC, which would need to be
hand-tuned for GHC. (I would be interested to see what this would be.
Unfortunately I don't know x86 assembly well enough to understand the
GCC output.)

On 21/08/07, Donald Bruce Stewart <dons at cse.unsw.edu.au> wrote:
> phil:
> > On Mon, Aug 20, 2007 at 09:57:38PM +0100, Simon Peyton-Jones wrote:
> > >GHC does some constant folding, but little by way of strength
> > >reduction, or using shifts instead of multiplication.  It's pretty
> > >easy to add more: it's all done in a single module.  Look at
> > >primOpRules in the module PrelRules.
> > >
> > >Patches welcome!  But please also supply test-suite tests that check
> > >the correctness of the rules.
> >
> > Sucking another example out of comp.lang.functional:
> >
> > This:
> >
> >   import System
> >
> >   f :: Int -> Int -> Int
> >   f s n = if n > 0 then f (s+n) (n-1) else s
> >
> >   main = do
> >         [n] <- getArgs
> >         putStrLn $ show $ f 0 (read n)
> >
> > is 3-4x slower than this:
> >
> >   #include <stdio.h>
> >   #include <stdlib.h>
> >   #include <assert.h>
> >
> >   int f(int s, int n) {
> >     return n > 0 ? f(s+n, n-1) : s;
> >   }
> >
> >   int main(int argc, char *argv[]) {
> >     assert(argc == 2);
> >     printf("%d\n", f(0, strtol(argv[1],0,0)));
> >   }
> >
> > The generated assembler suggests (if I've read it correctly) that gcc
> > is spotting that it can replace the tail call with a jump in the C
> > version, but for some reason it can't spot it for the Haskell version
> > when compiling with -fvia-C (and neither does ghc itself using
> > -fasm). So the haskell version ends up pushing and popping values on
> > and off the stack for every call to f, which is a bit sad.
> >
>
> That doesn't sound quite right. The C version should get a tail call ,
> with gcc -O2, the Haskell version should be a tail call anyway.
>
> Let's see:
>
> C
>     $ gcc -O t.c -o t
>     $ time ./t 1000000000
>     zsh: segmentation fault (core dumped)  ./t 1000000000
>     ./t 1000000000  0.02s user 0.22s system 5% cpu 4.640 total
>
> Turning on -O2
>
>     $ time ./t 1000000000
>     -243309312
>     ./t 1000000000  1.89s user 0.00s system 97% cpu 1.940 total
>
>
> And GHC:
>
>     $ ghc -O2 A.hs -o A
>     $ time ./A 1000000000
>     -243309312
>     ./A 1000000000  3.21s user 0.01s system 97% cpu 3.289 total
>
> So, what, 1.6x slower than gcc -O2
> Seems ok without any tuning.
>
> -- Don
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>


More information about the Haskell-Cafe mailing list