Understanding core2core optimisation pipeline

Simon Peyton Jones simonpj at microsoft.com
Thu Jul 24 23:26:41 UTC 2014


| My plan is to gather up
| the answers on a wiki page.

Excellent -- please do that!  My replies are below

Simon

| This mail basically asks just one question: what is the order of
| optimizations pefromed on Core?

It's entirely defined by 
  SimplCore.getCoreToDo :: DynFlags -> [CoreToDo]
The code there should be reasonably self-explanatory.  The type signature is very descriptive.


| 1. What is the difference between a "simplifier iteration" and
| "simplifier phase"?

Roughtly, a complete run of the simplifier means "run the simplifier repeatedly until nothing further happens".  The iterations are the successive iterations of this loop. Currently there's a (rather arbitrary) limit of four such iterations before we give up and declare victory.

| 1a. What is the default maximum iterations count? User documentation does
| not specify that.

Four

| 2. How can I observe the effects of `-ddump-simpl-phases`. I tried
| compiling several different
| programs and this flag seems to have no effect (ie. nothing gets
| printed).

I had to read the source code.  I think you say
	"-ddump-simpl-phases=A,B,C"
to dump the output of phases called A,B,C. But no, it seems that it only affects output of simplifier statistics (see simplifyPgmIO).  I have never used this flag.  Maybe it can go.  Looks strange to me.

| 3. Cardinality anlaysis and inlining: cardinality analysis can determine
| that a let binding is
| used exactly once. Can the inliner re-use this information from the
| cardinality analysis or does
| it recompute it per [3], section 3.1?

Cardinality analysis determines that something is *demanded* once.  The occurrence analyser determines when it *occurs* once.  For example
	if .. then x else x+1
x occurs twice, but is demanded once.  So they are different.  The inliner uses occurrence information.

| 4a. The first phase is "Desugar (after optimization)". What optimizations
| are performed during
| desugaring?

Just a few basic ones; see CoreSubst.simpleOptPgm.  It implements the "Very Simple Optimiser" which is only a page or two of code.  Reading it and writing a Note that enumerates what optimisations it does would be a Good Thing.

| 4b. I'm not sure whether I'm looking at a single iteration of core2core
| transformation or at
| multiple ones. Some passes are performed several times (Float out, Float
| inwards), which suggests
| that there might be many iterations here. On the other hand simplifier
| phases are decreasing
| towards 0, which looks as if it was one core2core iteration. My
| assumption here is that every
| time a new core2core iteration starts the simplifier phases are counted
| anew from 2 towards 0. Is
| that correct?

No. getCoreToDo produces a list of CoreToDos.  Each specifies a stage in the pipeline. One such stage is a run of the simplifier.  Such a run has a "phase" number, which is set in getCoreToDo.  This phase number is used (only) to control INLINE pragmas and RULES (see extensive documentation in the user manual e.g http://www.haskell.org/ghc/docs/latest/html/users_guide/pragmas.html#phase-control)

| 4c. Why are there several 0 phases of the Simplifier? I find it
| confusing.

We need to run the simplifier several times to propagate the effects of (say) strictness analysis or let-floating.  But by that stage the need for staging RULES and INLINE pragmas is over.

| 4d. I understand that some passes can be enabled or disabled using
| command-line options. Can the
| decission to run some passes be made dynamically by the compiler (eg. to
| run extra simplifier
| passes)?

Yes. Plug-ins do precisely that, by manipulating the [CoreToDo] pipeline list.

| 4e. Are there more phases that could appear here, ie. they were ommited
| with -O?

Could appear where?

| 4f. "Levels added" pass before the "Float out" pass: my guess is that
| this is preparation for the
| full laziness transform. So, is full laziness performed as part of "Float
| out" pass?

The full laziness pass is simply a combination of 
	set-levels (an analysis)
followed by
	float out (which transforms the program)

| A general note is that I am confused by many Simplifier passes being
| interleaved with other
| passes. I expected that simplifier phases will grouped into a single
| pass, as speculated in
| question 1.

Many passes produce output that can readily be simplified.  Rather than require them to perform those simplifications, we delegate it to the simplifier.

| 5. What optimizations *exactly* are performed by the Simplifier? I assume
| that most of what's
| described in chapter 3 of [2]: beta reduction, let elimination, case
| elimination, case floating,
| constant folding and eta expansion. I'm not sure about floating let
| outwards and inwards - [1],
| pg. 7, says these are in a pass separate from the simplifier. `-dverbose-
| core2core` seems to
| confirm that since it reveals separate "Float out" and "Float inwards"
| passes.

I don't have an exhaustive list, I'm afraid.  The general rule is: if it can be done with local knowledge, the simplifier should do it.  Making a full list would be another good exercise.  If you start I could add more.  Examples:
- constant foldings
- applying RULES
- inlining (a really important one)
- case of case
- case of known constructor
- eta expansion and eta reduction
- combining adjacent casts
- pushing a cast out of the way of an application
    e.g.  (f |> k) a  ==>  f (a |> k1) |> k2
          for suitable k1, k2
and there are probably a lot more.  Look for SimplifierTicks; every time the simplifier does something significant it bumps a tick count.  (or should do so.)

| 6. [4], pg. 31, mentions the Deforestation optimisation. Is everything
| described in
| that "Deforestation" section subsumed by cardinality analysis ([5], end
| of section 2.1 and
| section 7.1)? If not then when is deforestation performed?

Deforestation is simply the application of rewrite RULES; that's done by the simplifier.

| 7. [5], section 6.1 says: "We run the analysis twice, once in the middle
| of the optimisation
| pipeline and once near the end". When exactly in the middle of the
| pipeline? Between which
| passes? This does not show up with `-dverbose-core2core` (or at least it
| is not explicitly
| named).

I'm not actually certain that cardinality analysis *is* run twice in HEAD.  Maybe it was only in the version for the paper.  Ilya can tell us

| 8. How does the rules rewriting fit into the picture? 

It's done by the simplifier




More information about the ghc-devs mailing list