# Understanding core2core optimisation pipeline

Jan Stolarek jan.stolarek at p.lodz.pl
Thu Jul 24 12:54:56 UTC 2014

```Devs,

I'm trying to understand how the core2core pipeline works. Sadly, we don't have a wiki page about
this so the "only" source of information are the papers and the source code. Papers give pretty
much detail about each transformation in separate but none of the papers gives a comprehensive
and up-to-date overview of how the whole pipeline is structured. My questions are based on
reading the user documentation and the following papers:

[1] - "The Glasgow Haskell Compiler" from The Architecture of Open Source Application, vol. 2
[2] - "Compilation by Transformation in Non-Strict Functional Languages", PhD by Santos
[3] - "Secrets of the Glasgow Haskell Compiler inliner"
[4] - "A transformation-based optimiser for Haskell"
[5] - "Modular, Higher-Order Cardinality Analysis in Theory and Practice"
[6] - "Let-floatig: moving bindings to give faster programs"
[7] - "Playing by the Rules: Rewriting as a practical optimisation technique in GHC"

I know there are several papers missing from this list, eg. "Constructed Product Result Analysis
for Haskell" or "Call-pattern specialisation for Haskell programs". The reason is that these
optimisations are beyond the scope of what I'm doing at the moment (or so I believe).

This mail basically asks just one question: what is the order of optimizations pefromed on Core?
Since this question is pretty big and general I've separated it into smaller questions that arose
from reading the above papers, documentation, and experimenting with GHC.

Now the detailed questions:

1. What is the difference between a "simplifier iteration" and "simplifier phase"? Section
7.20.6.5 of the user guide mentions phases but I believe that iterations are not explained
anywhere. My best guess, expressed in pseudo-code, is this (sorry about the imperative style):

foreach (i in iterations) {
// some optimisations here?
foreach (p in phases) {
//...optimisations here
}
// some optimisations here?
}

1a. What is the default maximum iterations count? User documentation does not specify that.

2. How can I observe the effects of `-ddump-simpl-phases`. I tried compiling several different
programs and this flag seems to have no effect (ie. nothing gets printed).

3. Cardinality anlaysis and inlining: cardinality analysis can determine that a let binding is
used exactly once. Can the inliner re-use this information from the cardinality analysis or does
it recompute it per [3], section 3.1?

4. I've compiled a sample program using `-dverbose-core2core` and got the following phases:
- Desugar (after optimization)
- Simplifier (Phase = InitialPhase [Gentle])
- Specialise
- Float out
- Float inwards
- Simplifier (Phase = 2 [main])
- Simplifier (Phase = 1 [main])
- Simplifier (Phase = 0 [main])
- Demand analysis
- Worker Wrapper binds
- Simplifier (Phase = 0 [post-worker-wrapper])
- Float out
- Common sub-expression
- Float inwards
- Simplifier (Phase = 0 [final])
- Tidy Core
- CorePrep

This raises lots of questions:

4a. The first phase is "Desugar (after optimization)". What optimizations are performed during
desugaring?

4b. I'm not sure whether I'm looking at a single iteration of core2core transformation or at
multiple ones. Some passes are performed several times (Float out, Float inwards), which suggests
that there might be many iterations here. On the other hand simplifier phases are decreasing
towards 0, which looks as if it was one core2core iteration. My assumption here is that every
time a new core2core iteration starts the simplifier phases are counted anew from 2 towards 0. Is
that correct?

4c. Why are there several 0 phases of the Simplifier? I find it confusing.

4d. I understand that some passes can be enabled or disabled using command-line options. Can the
decission to run some passes be made dynamically by the compiler (eg. to run extra simplifier
passes)?

4e. Are there more phases that could appear here, ie. they were ommited with -O?

4f. "Levels added" pass before the "Float out" pass: my guess is that this is preparation for the
full laziness transform. So, is full laziness performed as part of "Float out" pass?

A general note is that I am confused by many Simplifier passes being interleaved with other
passes. I expected that simplifier phases will grouped into a single pass, as speculated in
question 1.

5. What optimizations *exactly* are performed by the Simplifier? I assume that most of what's
described in chapter 3 of [2]: beta reduction, let elimination, case elimination, case floating,
constant folding and eta expansion. I'm not sure about floating let outwards and inwards - [1],
pg. 7, says these are in a pass separate from the simplifier. `-dverbose-core2core` seems to
confirm that since it reveals separate "Float out" and "Float inwards" passes.

6. [4], pg. 31, mentions the Deforestation optimisation. Is everything described in
that "Deforestation" section subsumed by cardinality analysis ([5], end of section 2.1 and
section 7.1)? If not then when is deforestation performed?

7. [5], section 6.1 says: "We run the analysis twice, once in the middle of the optimisation
pipeline and once near the end". When exactly in the middle of the pipeline? Between which
passes? This does not show up with `-dverbose-core2core` (or at least it is not explicitly
named).

8. How does the rules rewriting fit into the picture? Section 7.20.6.5 of the User Guide and
section 4.1 of [7] explain the interaction between rules and inlining and my guess is that both
are performed by the Simplifier. Again the "simplifier phases/iterations" distinction puzzles me
as to what exactly is happening when. Within a single phase is the inlining happening before
rewriting or vice versa?

I know that all of the above questions can be answered by looking at the source code for
sufficiently long. This is actually what I'm planning to do next but if anyone could help me by
answering some of these questions this would certainly save me some time. My plan is to gather up
the answers on a wiki page.

Janek
```