Mutually-recursive/cyclic module imports
Isaac Dupree
isaacdupree at charter.net
Sun Aug 17 19:23:41 EDT 2008
Isaac Dupree wrote:
> Duncan Coutts wrote:
>> [...]
>>
>> I'm not saying it's a problem with your proposal, I'd just like it to be
>> taken into account. For example do dependency chasers need to grok just
>> import lines and {-# SOURCE -#} pragmas or do they need to calculate
>> fixpoints.
Actually, good point, Duncan, that got me thinking about
what we need in order to obviously not to lose much/any of
the .hs-boot efficiency. (warning: another long post ahead,
although the latter half of it is just an example from GHC's
source) [and I re-read my post and wasn't sure about a few
things, but maybe better to get feedback first -- please
tell me if I'm being too verbose somewhere, too]
Let's look at the total imports of a .hs and its .hs-boot,
as they currently are for GHC. Either can be non-SOURCE
imports (let's call them NOSOURCE), SOURCE imports, or not
importing that.
.hs:NOSOURCE, .hs-boot:NOSOURCE : okay
.hs:NOSOURCE, .hs-boot:SOURCE : okay
.hs:NOSOURCE, .hs-boot:not-imported : okay
.hs:SOURCE, .hs-boot:NOSOURCE : bad, if the .hs needs
SOURCE, then probably so does the .hs-boot
.hs:SOURCE, .hs-boot:SOURCE : okay
.hs:SOURCE, .hs-boot:not-imported : okay
- the .hs-boot importing a module that the .hs doesn't is
invalid, or at least useless [actually, see later example --
there may be reasons for this, but in that case, it doesn't
hurt to also import the module in the .hs (assuming there's
no syntactic/maintenance burden), and it provides better
automatic error-checking to do so]
Given the limited amount of information a .hs-boot file (or
SOURCE-imported file, in my scheme) needs for being a
boot-file, there is no advantage to import the modules it
depends on as NOSOURCE. The compiler just has to be clever
enough to ignore imports of functions that it can't find out
the type of. Also, currently using SOURCE requires the
imported module to have a .hs-boot. But it should work fine
to look for a .hi and use that in the absence of .hi-boot,
because it has strictly a superset of the information (so
that my statement that "SOURCE is superior to NOSOURCE when
it works" can be truer, for the sake of demonstration).
[oops! I was wrong, it may need to NOSOURCE-import on
occasion to find out a function's type - more on that in a
later post?]
Now, since the .hs-boot SOURCE vs NOSOURCE has been
collapsed, I think we can move mostly-all .hs-boot info into
the .hs file. If the .hs-boot file had imported something,
the corresponding import in the .hs is imported with
{-#SOURCE_FOLLOW#-} (in addition to {-#SOURCE#-} or
{-#NOSOURCE#-}); otherwise it's imported with
{-#SOURCE_NOFOLLOW#-} (ditto). For demonstration, I'll
assume that all imports are annotated this way, with two
bits of information. Presumably all imports that aren't
part of an import loop are NOSOURCE (which includes all
cross-package imports).
Now let's look at the dependency chaser.
NOSOURCE imports must not form a loop. They form dependency
chains as normal.
SOURCE imports depend on either a .hi or a .hi-boot for the
imported module.
When a X.hi-boot is demanded:
only SOURCE_FOLLOW imports are dependency-chased from X.hs,
through any .hs modules that don't already have a .hi or
.hi-boot.
In the case where .hs-boots worked, this *can* avoid cycles.
If this SOURCE_FOLLOW dependency DAG doesn't have any
cycles, then it should be as simple as calling (the
fictional) `ghc -source X.hs` to produce X.hi. If there are
cycles, and it is sometimes necessary*, GHC needs to be
slightly smarter and be able to produce all the .hi-boot
files at once from any graph SCCs (loops) that prevent it
from being a DAG (e.g., `ghc -source X.hs Y.hs` to produce
X.hi-boot and Y.hi-boot). Note that it doesn't need to be
particularly smart here -- e.g., no type inference is done.
*necessary loops:
example 1, the data/declarations literally loop:
module X1 where
{ import Y1(Y); data X a = End a | Both Y Y; }
module Y1 where
{ import X1(X); data Y = Only (X (Maybe Y)); }
(or kind annotations could be required for these loops in
general, e.g. data X (a :: *) = ...)
[hmm, in this case actually all we need is the data
left-hand-side, so we could do this in two stages. But that
wouldn't work out so well if their RHSs contained
{-#UNPACK#-}!SomeNewtypeForInt where SomeNewtypeForInt was
from the other module. But that's an optimization that it
might be okay not to do, as long as it was consistently not
done both for .hi-boot and .hi/.o; and it could perhaps be
doable]
example 2, there are just too many back-and-forths:
module X2 where
{ import Y2(Yb); data Xa = Xa; data Xc = Xc Yb; }
module Y2 where
{ import X2(Xa,Xc); data Yb = Yb Xa; data Yd = Yd Xc; }
This second one "could" also be accomplished if multiple
different .hs-boots were allowed per .hs,
although it doesn't seem worth the annotation!! such as
using SOURCE_FOLLOW[0] or [1], [2]...
I'm not even going to try to write that! [oh wait,
SOURCE[0->1] = SOURCE, SOURCE[1->1] = SOURCE_FOLLOW,
SOURCE[1->null] = SOURCE_NOFOLLOW, maybe something can be
done like that, more complicated in one way but perhaps a
bit sounder in another]
Now, SOURCE_NOFOLLOW is a bit of a hack, for a couple reasons:
- instances (especially orphans, and especially overlapping
instances) may not always be imported when they should be.
- There may be some information that could be
SOURCE-imported from the module if all its imports were
SOURCE_FOLLOW, but not enough information was imported that
way due solely to SOURCE_NOFOLLOW. That's probably okay
though; after all, the presence/absence of explicit type
signatures should have the same effect. Any information
that the shallow -hi-boot-making search can't figure out,
just doesn't go into the .hi-boot (possibly leading to
erroring later... perhaps the .hi-boot could store info
saying which information existed but it couldn't figure out,
to enhance error messages if that info is ever demanded.)
Obviously, Template Haskell can only be run if
NOSOURCE-imported from another module.
The stupid dependency chaser (the most complicated thing it
does besides parsing import statements is computing graph
SCCs) will, of course, still find a few more changed
dependencies than really need to be recompiled; as always,
this is where GHC's fancier recompilation checking will come
into effect.
Some compilers might benefit from (require?) explicit import
or export lists in some places... also I wonder if perhaps
items in export lists should be markable as whether they're
exported to SOURCE-importers (although it doesn't seem
necessary)
Obviously, annotating every import with both [NO]SOURCE and
SOURCE_[NO]FOLLOW is unreasonable! So let's look at
inferring them.
Any import that's not explicitly annotated with [NO]SOURCE
can default to NOSOURCE if it's not part of an import-loop,
or SOURCE if it is. NOSOURCE is allowed as a pragma here as
well as SOURCE. That is, the dependency chaser assumes
NOSOURCE, and if it finds a loop of imports that aren't
explicitly SOURCE imports, it converts all that aren't
*explicitly* NOSOURCE into SOURCE imports (if they're all
explicitly NOSOURCE, it's an error).
Since SOURCE_NOFOLLOW can easily break things, it really
shouldn't be the default. (And there's never any need of it
for imports of modules that aren't part of the current
module cycle). However, we really don't want to have to
specify it on all imports within the loop -- .hs-boots
manage to only specify for modules that *are* needed. I
suggest that SOURCE_[NO]FOLLOW be allowed as a top-level
pragma that says all (following?) imports are annotated with
that it they're not explicitly annotated SOURCE_[NO]FOLLOW.
For example, let's take some random file from GHC's source
that has a [l]hs-boot file: compiler/deSugar/DsExpr
current lhs-boot:
\begin{code}
module DsExpr where
import HsSyn ( HsExpr, LHsExpr, HsLocalBinds )
import Var ( Id )
import DsMonad ( DsM )
import CoreSyn ( CoreExpr )
dsExpr :: HsExpr Id -> DsM CoreExpr
dsLExpr :: LHsExpr Id -> DsM CoreExpr
dsLocalBinds :: HsLocalBinds Id -> CoreExpr -> DsM CoreExpr
\end{code}
current lhs: lots of imports, it will become obvious
proposed new lhs, just like old lhs but with a few pragmas
inserted:
\begin{code}
-- ...
module DsExpr ( dsExpr, dsLExpr, dsLocalBinds, dsValBinds,
dsLit ) where
{-# SOURCE_NOFOLLOW #-}
#include "HsVersions.h"
import Match
import MatchLit
import DsBinds
import DsGRHSs
import DsListComp
import DsUtils
import DsArrows
import {-# SOURCE_FOLLOW #-} DsMonad
import Name
#ifdef GHCI
import PrelNames
-- Template Haskell stuff iff bootstrapped
import DsMeta
#endif
import {-# SOURCE_FOLLOW #-} HsSyn
import TcHsSyn
-- NB: The desugarer, which straddles the source and Core
worlds, sometimes
-- needs to see source types
import TcType
import Type
import {-# SOURCE_FOLLOW #-} CoreSyn
import CoreUtils
import DynFlags
import CostCentre
-- hmm, actually Var was not imported by the lhs,
-- only Id (which imports Var) ! It looks okay to
-- just annotate the Id import here:
import {-# SOURCE_FOLLOW #-} Id
-- Are there times where this would ever
-- be a terrible problem? Well, we could have
-- added a line
--import {-# SOURCE_FOLLOW #-} Var ( Id )
-- instead, which would not hurt much.
-- (if Var.Id were a different type than Id.Id,
-- compiling this DsExpr module would give a
-- simple ambiguity error, no risk of
-- hs vs. hs-boot inconsistency)
import PrelInfo
import DataCon
import TysWiredIn
import BasicTypes
import PrelNames
import SrcLoc
import Util
import Bag
import Outputable
import FastString
\end{code}
...
-Isaac
More information about the Haskell-prime
mailing list