Introducing GHC whole program compiler (GHC-WPC)

Csaba Hruska csaba.hruska at
Sun Jun 14 12:45:51 UTC 2020


I have considered the name GHC-LTO before, but it would not be an accurate
description. In its current state, GHC-WPC is about exporting STG plus
linker info for later processing, which can be fed either back to the GHC
backend or to a third-party pipeline, depending on what the user or
researcher wants. The point is that GHC-WPC solves the IR export part of
the problem. The (simple) whole program dead function elimination pass
lives in the external STG compiler; I implemented it as a proof of concept
to show the new possibilities GHC-WPC opens up. I plan to do much more
optimization with sophisticated dataflow analyses. E.g. I have a fast,
working implementation of control flow analysis in Souffle/Datalog that I
plan to use for more accurate dead code elimination and partial program
defunctionalization on the whole program STG IR. In theory I could
implement all GRIN optimizations on STG. That would mean a significant
conceptual shift in the GHC compiler pipeline, because heavy optimizations
would be introduced at the low-level IRs in addition to GHC Core.

I'd like to go even further with experimentation. I can imagine a
dependently typed Cmm with a type system similar to the one ATS has. I
would definitely like to experiment in the future with an Idris 2 EDSL for
GHC RTS heap operations, where the type system would ensure the
correctness of pointer arithmetic and heap object manipulation. The purpose
of GHC-WPC in this story is to deliver the IR for all of this.
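
To make the whole program dead function elimination idea mentioned above
concrete, here is a minimal sketch. It is not the external STG compiler's
actual code: it assumes the exported whole-program IR has already been
reduced to a map from each top-level binding to the bindings it references,
and the module and function names are made up for illustration. The pass is
then plain reachability from the program entry point.

```python
from collections import deque


def live_functions(call_graph, roots):
    """Return every binding reachable from the given roots."""
    live = set()
    work = deque(roots)
    while work:
        name = work.popleft()
        if name in live:
            continue
        live.add(name)
        # Bindings missing from the graph are treated as having no references.
        work.extend(call_graph.get(name, ()))
    return live


def eliminate_dead(call_graph, roots):
    """Keep only the bindings that are reachable from the roots."""
    live = live_functions(call_graph, roots)
    return {name: refs for name, refs in call_graph.items() if name in live}


# Hypothetical whole-program reference graph (all names are illustrative).
program = {
    "Main.main": ["Lib.render", "Base.putStrLn"],
    "Lib.render": ["Lib.escape"],
    "Lib.escape": [],
    "Lib.unusedHelper": ["Lib.escape"],
    "Base.putStrLn": [],
    "Base.unusedPrim": [],
}

pruned = eliminate_dead(program, ["Main.main"])
```

This only works with the whole program in hand: a per-module compiler
cannot know that an exported binding such as the hypothetical
Lib.unusedHelper is never referenced by any other module.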

Besides exporting STG IR, the external STG compiler can compile STG via
GHC's standard code generator. This makes the GHC codegen and RTS available
as a backend for programming language developers. E.g. Idris, Agda, and
PureScript could use GHC/STG/RTS as a backend with all of its cool features.

So these are the key parts of my vision for the purpose and development of
GHC-WPC. It is meant to be more than a link-time optimizer.


On Sat, Jun 13, 2020 at 10:26 PM Alexis King <lexi.lambda at> wrote:

> Hi Csaba,
> I originally posted this comment on /r/haskell before
> I saw you also sent this to ghc-devs. I’ve decided to reproduce my comment
> here as well, since this list probably has a more relevant audience:
> I want to start by saying that I think this sounds totally awesome, and I
> think it’s a fantastic idea. I’m really interested in seeing how this
> progresses!
> I do wonder if people might find the name a little misleading. “Whole
> program compilation” usually implies “whole program optimization,” but most
> of GHC’s key optimizations happen at the Core level, before STG is even
> generated. (Of course, I’m sure you’re well aware of that, I’m just stating
> it for the sake of others who might be reading who aren’t aware.)
> This seems much closer in spirit to “link-time optimization” (LTO) as
> performed by Clang and GCC than whole program compilation. For example,
> Clang’s LTO works by “linking” LLVM bitcode files instead of fully-compiled
> native objects. STG is not quite analogous to LLVM IR—GHC’s analog would be
> Cmm, not STG—but I think that difference is not that significant here: the
> STG-to-Cmm pass is quite mechanical, and STG is mostly just easier to
> manipulate.
> tl;dr: Have you considered naming this project GHC-LTO instead of GHC-WPC?
> Alexis
> On Jun 12, 2020, at 16:16, Csaba Hruska <csaba.hruska at> wrote:
> Hello,
> I've created a whole program compilation pipeline for GHC via STG.
> Please read my blog post for the details:
> Introducing GHC whole program compiler (GHC-WPC)
> Regards,
> Csaba Hruska
