[GHC] #12618: Add saturated constructor applications to Core

Mon Oct 24 14:56:05 UTC 2016

#12618: Add saturated constructor applications to Core
-------------------------------------+-------------------------------------
        Reporter:  simonpj           |                Owner:  nomeata
            Type:  feature request   |               Status:  new
        Priority:  normal            |            Milestone:
       Component:  Compiler          |              Version:  8.0.1
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  None/Unknown      |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by nomeata):

 Allright, it is done, and I can report back.

 Introducing `ConApp` (without any compression) was quite tedious, and for
 me, rather two than one week. By now, it validates (almost; the GHCi
 debugger shows some difference in behaviour that I did not investigate)
 and shows the same runtime performance.

 The most tricky points are related to rules, which really do not like it
 if eta-expansion changes the Core too much. It took me a while to get
 equivalent program output after this refactoring (and I took a shortcut
 for now, duplicating some list fusion rules that match `build (:)` to have
 a second variant matching `build (\x y -> x : y)`). With a bit more
 careful work, this could be fixed, should we want this code to be merged.

 This change on its own affected some compiler perf benchmarks in the test
 suite: T9961 improves by 13%, T9233 by 2.5%., T9020 by 2.5%. T4801
 regresses by 3.87% (bytes allocated)

 I then, in a separate patch, implemented omitting redundant type arguments
 from constructors such as `Just`, `(:)` and tuples, which was the main
 motivation here. At every use of `ConApp`, I tried to understand the code
 as to whether it actually cares about the type arguments (which means that
 the they need to be recovered) or not (in which case the compressed
 argument list can be traversed, which is of course more efficient).

 In these places I had to uncompress the type arguments:

 * `freeVars` (which also calculates the types)
 * The linter
 * `cpeRhsE`
 * `exprIsConApp_maybe`
 * `exprType`
 * `collectStaticPtrSatArgs`, `sptModuleInitCode` (all about static pointer
 tables)
 * `decomposeRuleLhs` in the desugarer
 * `toIfaceExpr` when serializing tuples (low-hanging fruit here)
 * `match_magicDict`
 * `occAnal`
 * `simplExprF1`
 * `isValue` is `specConstr`

 I found that it is crucial to analyze the type of the data constructor
 only once, and store the “compression scheme” (i.e. which type arguments
 to recover form which term arguments) once in the data constructor, as
 this analysis is not completely cheap.

 But even with this optimization in place, the effect of this is –
 neglectible.

 My gut feeling:
 * `ConApp` is not a good idea in this form. Constructor applications are
 still just applications, and treating them that separately is not going to
 be healthy. It might be a better idea to make ''all'' applications
 saturated (as in strict core, or less invasive, “spotty types”).
 * The compression scheme is nice in principle, but there are still too
 many code paths that want the types. Some might be taken off the list
 after careful analysis of the code and mild refactoring. In others, making
 sure that the type in a `Type` data constructor is used as lazily as
 possible might help avoiding actually running `exprType` (this is a blind
 guess).
 * Furthermore, the large types that occur with nested tuples are already
 in the type checker! So avoiding them in Core is only half the story.
 * If compression would make a difference, then I think we want it at all
 applications (or at least applications headed by an `Id`, where we could
 store the compression scheme). Another point in favor of making all
 applications saturated.

 As for the problem of nested tuples: Maybe it would have been better to
 first carefully analyze the compiler (using `-v`/profiling/ticky-ticky) to
 be sure where we pay the unwanted cost (type checking, Core, interfaces,
 somewhere else) to know what we have to fix, before having a shot at one
 assumed cause.

 The code is at Phab:D2564.

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/12618#comment:36>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler