[GHC] #13493: Recompilation avoidance and Backpack

GHC ghc-devs at haskell.org
Tue Mar 28 16:49:05 UTC 2017


#13493: Recompilation avoidance and Backpack
-------------------------------------+-------------------------------------
           Reporter:  ezyang         |             Owner:  (none)
               Type:  bug            |            Status:  new
           Priority:  normal         |         Milestone:
          Component:  Compiler       |           Version:  8.1
           Keywords:  recomp         |  Operating System:  Unknown/Multiple
  backpack                           |
       Architecture:                 |   Type of failure:  None/Unknown
  Unknown/Multiple                   |
          Test Case:                 |        Blocked By:
           Blocking:                 |   Related Tickets:
Differential Rev(s):                 |         Wiki Page:
-------------------------------------+-------------------------------------
 Today, recompilation avoidance is centered around two major mechanisms:

 1. First, we keep track of entities we *use* (`tcg_dus`), which is done by
 reading off all external names from the renamed source code of a Haskell
 source file.

 2. Second, we keep track of what we *import* (`tcg_imports`), which is
 tracked when we rename imports.

 These two pieces of information get assembled into a module-indexed series
 of usages in `mk_mod_usage_info`. The general idea is that when an entity
 is used, we must record the hash of the entity; when a module is imported,
 we must record its export hash.
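
 As a rough illustration of this scheme, here is a toy model. It is not
 GHC's actual code: `Iface`, `Usage`, `mkUsages` and `upToDate` are
 hypothetical stand-ins, and hashes are plain `Int`s. It records an export
 hash per direct import and an entity hash per used name, then checks those
 against the current interfaces.

```haskell
import           Control.Applicative ((<|>))
import           Data.Map.Strict     (Map)
import qualified Data.Map.Strict     as Map

-- Hypothetical simplified types; GHC's real ones are much richer.
type ModName = String
type OccName = String
type Hash    = Int

-- What a compiled interface exposes for recompilation checking.
data Iface = Iface
  { ifExportHash   :: Hash
  , ifEntityHashes :: Map OccName Hash
  }

-- One usage record per module we depend on.
data Usage = Usage
  { usgExportHash :: Maybe Hash        -- present only for direct imports
  , usgEntities   :: Map OccName Hash  -- hash of each entity we used
  } deriving (Eq, Show)

-- Assemble module-indexed usages from the direct imports and used names.
mkUsages :: Map ModName Iface -> [ModName] -> [(ModName, OccName)]
         -> Map ModName Usage
mkUsages ifaces imports uses =
  Map.unionWith merge importUsages entityUsages
  where
    merge a b = Usage (usgExportHash a <|> usgExportHash b)
                      (Map.union (usgEntities a) (usgEntities b))
    importUsages = Map.fromList
      [ (m, Usage (Just (ifExportHash i)) Map.empty)
      | m <- imports
      , Just i <- [Map.lookup m ifaces] ]
    entityUsages = Map.fromListWith merge
      [ (m, Usage Nothing (Map.singleton n h))
      | (m, n) <- uses
      , Just i <- [Map.lookup m ifaces]
      , Just h <- [Map.lookup n (ifEntityHashes i)] ]

-- True if no recorded hash differs from the current interfaces.
upToDate :: Map ModName Iface -> Map ModName Usage -> Bool
upToDate ifaces = all ok . Map.toList
  where
    ok (m, u) = case Map.lookup m ifaces of
      Nothing -> False
      Just i  -> exportOk && entitiesOk
        where
          exportOk   = maybe True (== ifExportHash i) (usgExportHash u)
          entitiesOk = and [ Map.lookup n (ifEntityHashes i) == Just h
                           | (n, h) <- Map.toList (usgEntities u) ]
```

 In this model, a change to an entity we never used leaves the usages up to
 date as long as the export hash is unchanged, which is the point of
 tracking usages per entity rather than per module.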

 There is an implicit assumption here, which is that a (direct) import is
 the only way we depend on the exports of a module, and an occurrence of a
 name in the renamed syntax is the only way we depend on an actual entity.

 Backpack breaks these assumptions:

 * When we perform signature merging, we depend on the exports and entities
 of each of the signatures we merge in.  Furthermore, it is important to
 distinguish each of these by identity module (not semantic module, which
 collapses the distinction.)

 * When we instantiate a module, we depend on the exports and entities of
 the implementing module.

 When I initially implemented Backpack, I slowly added extra information to
 fix recompilation problems as I noticed them. I thus accreted the
 following recompilation avoidance mechanisms:

 * When signature merging occurs, we record the module hash for each used
 merge requirement in a special new `UsageMergedRequirement` entry, and
 recompile if the module hash changed at all. We also add each merged
 signature to ImportAvails (but not as an "import") to ensure we pick up
 family instances.

 * When we instantiate a module, we treat it as if we had a direct import
 of it (not yet merged, in https://phabricator.haskell.org/D3381). Since
 instantiations are always referencing non-local modules, we'll always
 record a module hash in such cases.

 This is quite a hodgepodge, and I have no confidence that it is correct.
 For example, if an implementing module reexports an entity from another
 module, and that original entity changes, I doubt we recompile at this
 point. We "accidentally" handle the case when it's not a reexport because
 we record the module hash for the entire instantiating module.

 It seems that it would be better if we could recast this in terms of
 imports and usages.  Here is a try at the design:

 * In both instantiation and merging, we must record the export hash of the
 modules we instantiated/merged in. It is a little troublesome to think of
 these as imports, however, because they're not (and if you try to
 implement this, you find yourself making a fake ImportedModVal for an
 import that doesn't exist); I think the correct thing here is to introduce
 a new notion of dependency for things that don't correspond to source
 level imports (another possibility is to add another constructor to
 ImportedModVal but the effect of this on existing code would have to be
 determined.)

 * The usages when we instantiate a signature are the (instantiated) usages
 of the original signature (in particular, this picks up the usages from
 instance lookup), plus a usage for each entity that we match against
 (because we must rematch if the type changes.)

 * Usages for signature merging are a little trickier. We want a usage for
 every entity that we end up merging in (so, we must record usages post
 thinning), BUT we must make sure the usage points at the identity module
 of the signature that originally provided it, not the semantic module
 (which will invariably point to the current module under compilation.)
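
 The merging part of this design could be sketched as follows. This is only
 a toy model under stated assumptions: `IdentityModule`, `DepReason`,
 `Dependency`, `MergedSig`, `mergeDeps` and `mergeUsages` are all
 hypothetical names, thinning is modelled as a set of kept names, and
 nothing here mirrors GHC's real data structures.

```haskell
import           Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import           Data.Set        (Set)
import qualified Data.Set        as Set

type UnitId  = String
type OccName = String

-- Identity module: the module whose interface file we actually read.
-- (Hypothetical simplification of the identity/semantic distinction.)
data IdentityModule = IdentityModule
  { idUnit :: UnitId
  , idName :: String
  } deriving (Eq, Ord, Show)

-- Why we depend on a module's exports. Only SourceImport corresponds to an
-- import written in the source file.
data DepReason = SourceImport | MergedSignature | Instantiation
  deriving (Eq, Show)

data Dependency = Dependency
  { depModule :: IdentityModule
  , depReason :: DepReason
  } deriving (Eq, Show)

-- A signature being merged in, with the entities it declares.
data MergedSig = MergedSig
  { sigIdentity :: IdentityModule
  , sigEntities :: Set OccName
  }

-- Every merged signature is a dependency, with no fake import needed.
mergeDeps :: [MergedSig] -> [Dependency]
mergeDeps sigs = [ Dependency (sigIdentity s) MergedSignature | s <- sigs ]

-- Usages are computed post-thinning and keyed by the identity module of
-- the signature that provided each entity.
mergeUsages :: Set OccName -> [MergedSig] -> Map IdentityModule (Set OccName)
mergeUsages kept sigs = Map.fromListWith Set.union
  [ (sigIdentity s, used)
  | s <- sigs
  , let used = Set.intersection kept (sigEntities s)
  , not (Set.null used) ]
```

 Keying by identity module means that two signatures with the same module
 name from different units keep separate usage entries, even though their
 semantic module would be the same.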

 One more thing: when we instantiate a module on-the-fly, we need to
 account for how we instantiated it. To put it differently, the
 recompilation information we compute when we instantiate on-the-fly should
 be (morally) the same as what we would get if we actually compiled the
 modules in question. This is a bit troublesome since we don't have
 detailed information relating how a signature was instantiated and what we
 used (the on-the-fly instantiation process shortcuts this). The simplest
 thing is probably to just record the module hashes of each module that was
 used to instantiate an imported module (recursively); we might even be able
 to do this by just twiddling `mi_mod_hash` when we instantiate. The
 alternative is to switch to recording InstalledModule/InstalledUnitId only
 in hashes, and augmenting usage information to also carry along
 instantiations.
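
 A minimal sketch of the "simplest thing", assuming a toy representation:
 `InstModule`, `instHashes` and `twiddledHash` are hypothetical names,
 hashes are plain `Int`s, and the combining function is an arbitrary
 placeholder rather than GHC's fingerprinting. It collects the module hash
 of every module used in an instantiation, recursively, and folds them into
 the module's own hash.

```haskell
import           Data.List       (foldl')
import           Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map

type ModName = String
type Hash    = Int

-- A module together with the modules filling its holes (hypothetical model).
data InstModule = InstModule
  { instName  :: ModName
  , instHoles :: [(ModName, InstModule)]  -- (hole name, implementing module)
  }

-- Module hashes of every module used in the instantiation, recursively.
-- Returns Nothing if any hash is unknown.
instHashes :: Map ModName Hash -> InstModule -> Maybe [Hash]
instHashes hashes m = concat <$> traverse go (instHoles m)
  where
    go (_, impl) = do
      h    <- Map.lookup (instName impl) hashes
      rest <- instHashes hashes impl
      pure (h : rest)

-- "Twiddle" the module's own hash with the hashes of its instantiation.
-- The combining step is a placeholder, not a real fingerprint function.
twiddledHash :: Map ModName Hash -> InstModule -> Maybe Hash
twiddledHash hashes m = do
  base <- Map.lookup (instName m) hashes
  hs   <- instHashes hashes m
  pure (foldl' (\acc h -> acc * 31 + h) base hs)
```

 With this, a change to a module nested anywhere inside the instantiation
 changes the twiddled hash of the instantiated module, so anything that
 recorded that hash would be recompiled.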

 Another problem is that we record usages for Module (instantiated things),
 but hashes are actually on an InstalledModule basis.

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/13493>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

