[Haskell-cafe] questions about core

Mon Feb 12 02:06:39 EST 2007

On Mon, Feb 12, 2007 at 04:45:47PM +1100, Matt Roberts wrote:
> Hi all,
> 
> I am trying to get a deeper understanding of core's role in GHC and  
> the compilation of functional languages in general.  So far I have  
> been through
>  - The hackathon videos,
>  - "A transformation-based optimiser for Haskell",
>  - "An External Representation for the GHC Core Language (DRAFT for  
> GHC5.02)", and
>  - "Secrets of the Glasgow Haskell Compiler inliner".
> 
> and I am still a bit hazy on a few points:
>  - What role do *the semantics* of core play (i.e. how and where are  
> they taken advantage of)?

They aren't, at least not directly by the compiler - the semantics are
what give people named Simon the courage to implement counterintuitive
optimizations without losing sleep (since they know that, formally,
nothing can go wrong).

>  - Exactly what are the operational and denotational semantics of core?
>  - The headline reasons (and any other arguments that emerge) for  
> having core *and* stg as separate definitions.
> 
> I have an intuition on a few of these points, but I would love  
> something concrete to latch on to.

(disclaimer: I am not a GHC hacker, nor have I ever gotten around to actually
writing an optimizer.)

It's a simple question of expediency and tradeoffs.

Core has lots of nice mathematical properties.  For instance, Core's
call-by-name properties allow the compiler to simply prune unreferenced
expressions, without worrying about changing termination behavior.  Core
expressions can be rearranged by nice pattern matches, and as a strongly
typed calculus Core is relatively immune to misoptimization.  On the other
hand, Core is rather far from the machine, and much is still implicit -
invisible and unoptimizable.  If GHC only used Core, you would get all the
nice large-scale optimizations (fusion comes immediately to mind), but you
would pay full price for every forced closure, etc. - wasted effort could not
be optimized away.
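
To make the pruning point concrete, here is a minimal sketch of dead-binding
elimination over a toy lazy expression language.  The Expr type and the names
freeVars and pruneDead are made up for illustration; this is not GHC's actual
Core data type or its simplifier.  The point is only that, with lazy lets, the
pass needs no side condition about whether the right-hand side terminates.

```haskell
import qualified Data.Set as Set

-- A toy expression language (hypothetical; NOT GHC's real Core type).
data Expr
  = Var String
  | Lit Int
  | Add Expr Expr
  | Let String Expr Expr   -- non-recursive let, bound lazily
  deriving (Eq, Show)

-- Free variables of an expression.
freeVars :: Expr -> Set.Set String
freeVars (Var x)       = Set.singleton x
freeVars (Lit _)       = Set.empty
freeVars (Add a b)     = freeVars a `Set.union` freeVars b
freeVars (Let x rhs b) = freeVars rhs `Set.union` Set.delete x (freeVars b)

-- Drop let-bindings whose variable is never referenced in the body.
-- Under lazy semantics an unreferenced right-hand side would never have
-- been evaluated, so dropping it cannot change termination behavior.
pruneDead :: Expr -> Expr
pruneDead (Let x rhs body)
  | x `Set.member` freeVars body' = Let x (pruneDead rhs) body'
  | otherwise                     = body'
  where
    body' = pruneDead body   -- prune the body first, so nested dead lets cascade
pruneDead (Add a b) = Add (pruneDead a) (pruneDead b)
pruneDead e         = e
```

For instance, pruneDead (Let "x" (Var "loop") (Lit 42)) gives Lit 42, and the
pass never has to ask whether "loop" would have diverged.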

STG is much closer to the machine.  If GHC's desugarer produced STG and Core
were removed completely, GHC would still be able to produce nearly perfect code.
Every optimization that can be performed on Core can, in principle, also be
performed on STG.  In practice things are a bit different.  STG, as an impure,
strict, untyped language, is missing the nice properties of Core, and many
optimizations that are a no-brainer to write at the Core level are riddled
with side conditions at the lower level.
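
As a rough illustration of what "riddled with side conditions" means, here is
a sketch of removing unused let-bindings in a toy strict, effectful language.
The SExpr type and the names freeVarsS, safeToDrop and pruneDeadS are
hypothetical and only loosely inspired by the description above; this is not
GHC's real STG.  Because a strict let evaluates its right-hand side before
the body, an unused binding may only be dropped if that right-hand side
certainly terminates and has no effects, and the check has to be conservative.

```haskell
import qualified Data.Set as Set

-- A toy strict, effectful language (hypothetical; NOT GHC's real STG).
data SExpr
  = SVar String
  | SLit Int
  | SAdd SExpr SExpr
  | SPrint SExpr              -- side effect: prints its argument, then returns it
  | SCall String [SExpr]      -- unknown function: may loop or perform effects
  | SLet String SExpr SExpr   -- strict let: rhs is evaluated before the body
  deriving (Eq, Show)

-- Free variables of an expression.
freeVarsS :: SExpr -> Set.Set String
freeVarsS (SVar x)       = Set.singleton x
freeVarsS (SLit _)       = Set.empty
freeVarsS (SAdd a b)     = freeVarsS a `Set.union` freeVarsS b
freeVarsS (SPrint e)     = freeVarsS e
freeVarsS (SCall _ args) = Set.unions (map freeVarsS args)
freeVarsS (SLet x r b)   = freeVarsS r `Set.union` Set.delete x (freeVarsS b)

-- The side condition: True only if evaluating the expression certainly
-- terminates and performs no effects.  Anything uncertain counts as unsafe.
safeToDrop :: SExpr -> Bool
safeToDrop (SVar _)     = True
safeToDrop (SLit _)     = True
safeToDrop (SAdd a b)   = safeToDrop a && safeToDrop b
safeToDrop (SPrint _)   = False
safeToDrop (SCall _ _)  = False
safeToDrop (SLet _ r b) = safeToDrop r && safeToDrop b

-- Drop an unused binding only when the side condition also holds.
pruneDeadS :: SExpr -> SExpr
pruneDeadS (SLet x rhs body)
  | x `Set.notMember` freeVarsS body' && safeToDrop rhs' = body'
  | otherwise                                           = SLet x rhs' body'
  where
    rhs'  = pruneDeadS rhs
    body' = pruneDeadS body
pruneDeadS (SAdd a b)     = SAdd (pruneDeadS a) (pruneDeadS b)
pruneDeadS (SPrint e)     = SPrint (pruneDeadS e)
pruneDeadS (SCall f args) = SCall f (map pruneDeadS args)
pruneDeadS e              = e
```

Here an unused binding of SPrint (SLit 1) has to stay, since removing it would
remove the print, while an unused binding of SAdd (SLit 1) (SLit 2) can go.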

So, to summarize:

We have Core because Simon lacks the patience to solve the halting problem and
properly perform effects analysis on STG.

We have STG because Simon lacks the patience to wait for the 6.6 Simplifier to
finish naively graph-reducing every time.

(now let's hope that MY intuitions are on the right track!)

Stefan