Understanding core2core optimisation pipeline
Simon Peyton Jones
simonpj at microsoft.com
Thu Jul 24 23:26:41 UTC 2014
| My plan is to gather up
| the answers on a wiki page.
Excellent -- please do that! My replies are below
| This mail basically asks just one question: what is the order of
| optimizations pefromed on Core?
It's entirely defined by
SimplCore.getCoreToDo :: DynFlags -> [CoreToDo]
The code there should be reasonably self-explanatory. The type signature is very descriptive.
| 1. What is the difference between a "simplifier iteration" and
| "simplifier phase"?
Roughtly, a complete run of the simplifier means "run the simplifier repeatedly until nothing further happens". The iterations are the successive iterations of this loop. Currently there's a (rather arbitrary) limit of four such iterations before we give up and declare victory.
| 1a. What is the default maximum iterations count? User documentation does
| not specify that.
| 2. How can I observe the effects of `-ddump-simpl-phases`. I tried
| compiling several different
| programs and this flag seems to have no effect (ie. nothing gets
I had to read the source code. I think you say
to dump the output of phases called A,B,C. But no, it seems that it only affects output of simplifier statistics (see simplifyPgmIO). I have never used this flag. Maybe it can go. Looks strange to me.
| 3. Cardinality anlaysis and inlining: cardinality analysis can determine
| that a let binding is
| used exactly once. Can the inliner re-use this information from the
| cardinality analysis or does
| it recompute it per , section 3.1?
Cardinality analysis determines that something is *demanded* once. The occurrence analyser determines when it *occurs* once. For example
if .. then x else x+1
x occurs twice, but is demanded once. So they are different. The inliner uses occurrence information.
| 4a. The first phase is "Desugar (after optimization)". What optimizations
| are performed during
Just a few basic ones; see CoreSubst.simpleOptPgm. It implements the "Very Simple Optimiser" which is only a page or two of code. Reading it and writing a Note that enumerates what optimisations it does would be a Good Thing.
| 4b. I'm not sure whether I'm looking at a single iteration of core2core
| transformation or at
| multiple ones. Some passes are performed several times (Float out, Float
| inwards), which suggests
| that there might be many iterations here. On the other hand simplifier
| phases are decreasing
| towards 0, which looks as if it was one core2core iteration. My
| assumption here is that every
| time a new core2core iteration starts the simplifier phases are counted
| anew from 2 towards 0. Is
| that correct?
No. getCoreToDo produces a list of CoreToDos. Each specifies a stage in the pipeline. One such stage is a run of the simplifier. Such a run has a "phase" number, which is set in getCoreToDo. This phase number is used (only) to control INLINE pragmas and RULES (see extensive documentation in the user manual e.g http://www.haskell.org/ghc/docs/latest/html/users_guide/pragmas.html#phase-control)
| 4c. Why are there several 0 phases of the Simplifier? I find it
We need to run the simplifier several times to propagate the effects of (say) strictness analysis or let-floating. But by that stage the need for staging RULES and INLINE pragmas is over.
| 4d. I understand that some passes can be enabled or disabled using
| command-line options. Can the
| decission to run some passes be made dynamically by the compiler (eg. to
| run extra simplifier
Yes. Plug-ins do precisely that, by manipulating the [CoreToDo] pipeline list.
| 4e. Are there more phases that could appear here, ie. they were ommited
| with -O?
Could appear where?
| 4f. "Levels added" pass before the "Float out" pass: my guess is that
| this is preparation for the
| full laziness transform. So, is full laziness performed as part of "Float
| out" pass?
The full laziness pass is simply a combination of
set-levels (an analysis)
float out (which transforms the program)
| A general note is that I am confused by many Simplifier passes being
| interleaved with other
| passes. I expected that simplifier phases will grouped into a single
| pass, as speculated in
| question 1.
Many passes produce output that can readily be simplified. Rather than require them to perform those simplifications, we delegate it to the simplifier.
| 5. What optimizations *exactly* are performed by the Simplifier? I assume
| that most of what's
| described in chapter 3 of : beta reduction, let elimination, case
| elimination, case floating,
| constant folding and eta expansion. I'm not sure about floating let
| outwards and inwards - ,
| pg. 7, says these are in a pass separate from the simplifier. `-dverbose-
| core2core` seems to
| confirm that since it reveals separate "Float out" and "Float inwards"
I don't have an exhaustive list, I'm afraid. The general rule is: if it can be done with local knowledge, the simplifier should do it. Making a full list would be another good exercise. If you start I could add more. Examples:
- constant foldings
- applying RULES
- inlining (a really important one)
- case of case
- case of known constructor
- eta expansion and eta reduction
- combining adjacent casts
- pushing a cast out of the way of an application
e.g. (f |> k) a ==> f (a |> k1) |> k2
for suitable k1, k2
and there are probably a lot more. Look for SimplifierTicks; every time the simplifier does something significant it bumps a tick count. (or should do so.)
| 6. , pg. 31, mentions the Deforestation optimisation. Is everything
| described in
| that "Deforestation" section subsumed by cardinality analysis (, end
| of section 2.1 and
| section 7.1)? If not then when is deforestation performed?
Deforestation is simply the application of rewrite RULES; that's done by the simplifier.
| 7. , section 6.1 says: "We run the analysis twice, once in the middle
| of the optimisation
| pipeline and once near the end". When exactly in the middle of the
| pipeline? Between which
| passes? This does not show up with `-dverbose-core2core` (or at least it
| is not explicitly
I'm not actually certain that cardinality analysis *is* run twice in HEAD. Maybe it was only in the version for the paper. Ilya can tell us
| 8. How does the rules rewriting fit into the picture?
It's done by the simplifier
More information about the ghc-devs