inside the GHC code generator

Sun Mar 12 06:40:33 EST 2006

On 2/24/06, Simon Peyton-Jones <simonpj at microsoft.com> wrote:
> | last days i studied GHC code generator, reading -ddumps of all sorts,
> | GHC sources, various papers and John's appeals :)
> |
> | what i carried from this investigation:
> |
> | GHC has great high-level optimization. moreover, GHC now allows to
> | program STG almost directly. when i look at STG code i don't see
> | anything that can be done better - at least for these simple loops i
> | tried to compile. i see just unboxed variables, primitive operations
> | and simple loops represented as tail recursion. fine.
> |
> | then STG compiled to the simplified C (called abstract C earlier and
> | quasi C-- now), what is next can be translated:
> |
> | * to real C and then compiled by gcc
> | * to assembler by build-in simple C-- compiler
> | * to assembler by some external C-- compiler (at least it is
> | theoretically possible)
>
> Good stuff Bulat.  There's plenty of interesting stuff to be done here.
>
> However, let me strongly urge you *not* to focus attention primarily on
> the gcc route.  Compiling via C has received a lot of attention over the
> years, and there are many papers describing cool hacks for doing so.
> GHC does not do as well as it could.
>
> But there are serious obstacles.  That's not gcc's fault -- it wasn't
> designed for this.  Accurate GC is one of them, tail calls is another,
> and there are plenty more smaller things that bite you only after you've
> invested a lot of time.  This way lies madness.
>
> C-- was *designed* for this purpose.  GHC uses C-- as its intermediate
> language (just before emitting C).  So a good route is this:
>
> * Write C-- to C-- optimisations
>
> * Then, if you like, translate that code to C.  Already you will be
> doing better than GHC does today, because the C-- to C-- optimiser will
> let you generate better C
>
> * But you can also emit C--, or native code; both of these paths will
> directly benefit from your C-- optimisations.
>
> The point is that none of this relies on Quick C--; you can always use
> an alternative back end.
>
>
>
> You can almost do this today. GHC uses C-- as an intermediate language.
> But alas, GHC's code generator does not take advantage of C--'s native
> calls or parameter passing.  Instead, it pushes things on an auxiliary
> stack etc.  (Those Sp memory operations you see.)  This isn't necessary.
> We are planning to split GHC's code generator into two parts
>         A) Generate C-- with native calls, with an implicit C-- stack
>         B) Perform CPS conversion, to eliminate all calls in favour of
> jumps
>                 using an explicit stack
> The current code gen is A followed by B.  But A is a much more suitable
> optimisation platform, and gives more flexibility.
>
> Chris Thompson, and undergrad at Cambridge, is doing (B) as his
> undergrad project, although it remains to be seen whether he'll have
> enough complete to be usable.
>
> Another shortcoming is that the native code generator in GHC isn't
> capable of dealing with backward jumps to labels (because GHC hasn't
> needed that so far).  But if you did C-- optimisation, you'd probably
> generate such jumps.  It'd be great to beef up the native code gen to
> handle that.
>
>
> Many of the optimisations you describe (perhaps all) are readily
> expressible in the C-- intermediate language, and by working at that
> level you will be independent of with the back end is gcc, a native code
> generator, or Quick C--, or some other C-- compiler.  Much better.

What optimizations are we talking about here? The loop optimizations
that Bulat implicitly proposed would only affect recursion over
unboxed arguments, and, since that's fairly rare, wouldn't give Joe
Hacker any noticeable speed up.
Are we at the end of what we can get without whole-program
optimizations or are there other optimizations that apply to C--
representing a lazy PL?

--
Friendly,
  Lemmih