[GHC] #10680: Make Backpack order-independent (again)

GHC ghc-devs at haskell.org
Fri Jul 24 03:52:47 UTC 2015


#10680: Make Backpack order-independent (again)
-------------------------------------+-------------------------------------
              Reporter:  ezyang      |             Owner:  ezyang
                  Type:  task        |            Status:  new
              Priority:  normal      |         Milestone:
             Component:  Package     |           Version:  7.11
  system                             |
              Keywords:              |  Operating System:  Unknown/Multiple
          Architecture:              |   Type of failure:  None/Unknown
  Unknown/Multiple                   |
             Test Case:              |        Blocked By:
              Blocking:              |   Related Tickets:
Differential Revisions:              |
-------------------------------------+-------------------------------------
 When we moved to the new `bkp` file format, we also went back to the a
 format which is order-dependent: that is to say, the order in which you
 put the declarations matters.  So if you write:

 {{{
 unit p where
   module A where
     import B
   module B where
     ...
 }}}

 this fails to type-check, GHC complaining that `B` is not in scope.  I did
 this, in part because it's what the Backpack paper described, and because
 it was "simpler" to implement.

 I think we should move back to an order-independent scheme, for the
 following reasons:

 1. Haskell users are used to not needing pay particularly close attention
 to the ordering of their modules, and forcing people to linearize their
 module descriptions would be spectacularly disruptive with large amounts
 of modules.  So un-ordered modules are "more natural.

 2. Order-independence imposes some constraints on how expressive programs
 are (with order-dependent Backpack, you can do some pretty tricky things
 by ordering things certain ways); this could simplify some aspects of
 compiler implementation and make Backpack easier to explain.

 3. A particular case of (2): it seems a lot simpler UX-wise to let a user
 assume that if you import a module `M` in a unit, it doesn't matter where
 you import it: you always get the same set of identifiers brought into
 scope. Thus, the incremental results of signatures should not be visible,
 c.f. #10679

 The main idea is that only the surface-syntax is un-ordered: the internal
 representation of units is a DAG which we work out in an elaboration
 phase, not altogether unsimilar from what `GhcMake` computes.

 Here are the details:

 **The intermediate representation.** We translate into an intermediate
 representation which consists of a directed graph between modules,
 signatures and includes.  Edges in the graph indicate a "depends on"
 relation:

 1. `include p` depends on `include q` if, for some module name `H`, `p`
 requires `H` and `q` provides `H`.
 2. A module/signature `M` depends on `include p` if `M` imports a module
 provided or required by `p`.
 3. A module/signature `M` depends on a module/signature `S` if `M` imports
 `S`.
 4. An `include p` depends on a module `M` if `p` requires a module named
 `M`. (This rule is included for completeness; we are going to disallow it
 shortly.)

 We impose one restriction: a signature cannot depend on a home module.
 See below for how to eliminate this restriction.

 Rule (2) is worth remarking upon: if a module imports a signature, it
 depends-on every `include` which requires that signature, as well as the
 relevant home signature.  This could easily result in a cycle; see (2) for
 how to break these cycles.  The consequence of this, however, is that we
 can factor the graph to introduce the node for the "merge of signatures",
 which depends on each signature and include which requires it; we can use
 this opportunity to build and write out the merged interface file for the
 unit.

 **Elaboration.** Take a Backpack file, construct this graph, and topsort
 it into a DAG of SCCs. SCCs with a single node are compileable as before.
 SCCs with multiple nodes will have to be managed with some mutual
 recursion mechanism; see refinements for more thoughts on this.

 **Refinements:**

 1. **Can a signature depend on a (home) module?** Imports of this kind
 require a retypecheck loop.  Consider this situation:
 {{{
 unit p where
   signature H where
     data T
   module M where
     import H
     data S = S T
 unit q where
   include p
   module Q where
     import M
   signature H where
     import Q
     data T = T S
 }}}
    Here, signature H in q depends on Q.  When we typecheck `Q`, we bring
 `M.S` into the type environment with a `TyThing` that describes the
 constructor as accepting an abstract type `T`.  However, when we
 subsequently typecheck the local signature `H`, we must refine all
 `TyThing`s of `T` with the true description (e.g. constructor
 information).  So you'll need to retypecheck `Q` (and `M`) in order to
 make sure the `TyThing` is correct.

 2. **Can an include depend on a (home) module?** If the module has no
 (transitive) dependency on signatures, this is fine. However, it's easy to
 have a circular dependency.  Consider:
 {{{
 unit p where
   signature A
   signature B
   module M
 unit q where
   include p
   module B where
     import A
     ...
 }}}
    `B` depends on `p` for `p/A.hsig`; however, `p` depends on `B` because
 this module is filling a requirement. Fortunately, we can untangle this
 knot without any mutual recursion nonsense (and the attendant efficiency
 loss): `A` is just an export list, we can compute it from the abstractly
 type-checked version of `p` without instantiating `B`.

 3. **Can we deal with include-include dependency cycles?** Yes! Just use
 the Backpack paper's strategy for creating a recursive unit key and
 compile the two packages `hs-boot` style. But I'm not planning on
 implementing this yet.

 4. **Can we deal with signature-signature dependency cycles?** Ordered
 Backpack would have supported this:
 {{{
 unit a-sig where
   signature A where
     data T
 unit ab-sig where
   include a-sig
   signature B where
     import A
     data S = S T
   signature A where
     import B
     data T = T S
 }}}
    In our model, `ab-sig` has a cycle.  However, I believe any such cycle
 can be broken by creating sufficiently many units:
 {{{
 unit a-sig where
   signature B where
     data T
   signature A where
     data S = S T
 unit b-sig where
   signature A where
     data S
   signature B where
     data T = T S
 unit ab-sig where
   include a-sig
   include b-sig
 }}}
    In principle, GHC could automatically break import cycles by replacing
 an import with an import of a reduced signature that simply has abstract
 type definitions. (I'm not sure this is possible for all language
 features.) This technique would also work for normal modules, assuming
 that every function is explicitly annotated with a type.

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/10680>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler


More information about the ghc-tickets mailing list