How to distinguish local ids from imported ones in Stg? Confused about isLocalId/isLocalVar
Ömer Sinan Ağacan
omeragacan at gmail.com
Tue Nov 20 08:47:57 UTC 2018
Hi,
I just found out that the semantics of isLocalId/isLocalVar change during
compilation. I learned this the hard way (after some debugging), but later
noticed that it is documented in the last line of this note:
    Note [GlobalId/LocalId]
    ~~~~~~~~~~~~~~~~~~~~~~~
    A GlobalId is
      * always a constant (top-level)
      * imported, or data constructor, or primop, or record selector
      * has a Unique that is globally unique across the whole
        GHC invocation (a single invocation may compile multiple modules)
      * never treated as a candidate by the free-variable finder;
        it's a constant!

    A LocalId is
      * bound within an expression (lambda, case, local let(rec))
      * or defined at top level in the module being compiled
      * always treated as a candidate by the free-variable finder

    After CoreTidy, top-level LocalIds are turned into GlobalIds
So after simplification (once CoreTidy has run) isLocalId can no longer
distinguish an id defined at top level in the current module from an imported
one.
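To make the effect concrete, here is a toy model of the situation. It does not use GHC's real Var/Id types: ToyId, ToyScope, isLocalToy and tidyTopLevel are all made-up names that only mimic what the note describes.

```haskell
-- Toy model only: none of these are GHC's real definitions.
data ToyScope = GlobalScope | LocalScope
  deriving (Eq, Show)

data ToyId = ToyId
  { toyName  :: String
  , toyScope :: ToyScope
  } deriving (Eq, Show)

-- Mimics the idea behind isLocalId: only the scope flag is consulted.
isLocalToy :: ToyId -> Bool
isLocalToy i = toyScope i == LocalScope

-- Stand-in for what the note says happens to top-level binders after CoreTidy.
tidyTopLevel :: ToyId -> ToyId
tidyTopLevel i = i { toyScope = GlobalScope }

importedMap, topLevelFoo :: ToyId
importedMap = ToyId "map" GlobalScope  -- imported from another module
topLevelFoo = ToyId "foo" LocalScope   -- defined at top level in this module
```

In this model isLocalToy topLevelFoo is True before tidying, but tidyTopLevel topLevelFoo carries the same scope flag as importedMap, so the flag alone can no longer tell the two apart.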
Apparently I'm not the only one who was confused by this. In StgLint we check
in-scope variables with this:
    checkInScope :: Id -> LintM ()
    checkInScope id = LintM $ \_lf loc scope errs
     -> if isLocalId id && not (id `elemVarSet` scope) then
            ((), addErr errs (hsep [ppr id, dcolon, ppr (idType id),
                                    text "is out of scope"]) loc)
        else
            ((), errs)
Note that isLocalId here returns False for bindings that are local to the
module but bound at top level. Because of this, if I drop some top-level
bindings from the module I don't get a lint error, even though some ids become
unbound.
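A small sketch of why the check misses this case, again with toy types rather than GHC's (ToyId, ToyScope and reportsOutOfScope are hypothetical names, and the scope is modelled as a set of plain strings):

```haskell
import qualified Data.Set as Set

-- Toy model only: not GHC's Id, VarSet or LintM.
data ToyScope = GlobalScope | LocalScope
  deriving (Eq, Show)

data ToyId = ToyId
  { toyName  :: String
  , toyScope :: ToyScope
  } deriving (Eq, Show)

-- Same shape as the guard in checkInScope: an id is only reported when it is
-- flagged local AND missing from the set of in-scope names.
reportsOutOfScope :: Set.Set String -> ToyId -> Bool
reportsOutOfScope scope i =
  toyScope i == LocalScope && not (toyName i `Set.member` scope)

lambdaBoundX, tidiedHelper :: ToyId
lambdaBoundX = ToyId "x" LocalScope        -- bound inside an expression
tidiedHelper = ToyId "helper" GlobalScope  -- top-level in this module, after tidying
```

With an empty scope, reportsOutOfScope Set.empty lambdaBoundX is True, but reportsOutOfScope Set.empty tidiedHelper is False: the first conjunct already fails, so the scope set is never consulted for top-level ids.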
I need to distinguish a top-level bound id from an imported id for two things:
- I want to make sure, in StgLint, that bindings in the Stg program are in
dependency order (uses come after definitions). For this I need to treat
imported ids as already bound, but for top-level bound ids I need to check if
I already saw the definition.
- I want to run an analysis to map top-level bindings to whether they can
contain CAF refs. For this I should look up a top-level bound id in the
environment, but for an imported id I should check its idInfo.
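The first check could look roughly like the sketch below if ids could be told apart by their defining module. This is an assumption made for illustration: ToyId, TopBind and checkDepOrder are made-up names, each binding is treated as a single (possibly self-recursive) definition, and mutually recursive groups are ignored.

```haskell
import qualified Data.Set as Set

-- Toy model only: an id is identified by defining module and name.
data ToyId = ToyId
  { toyModule :: String
  , toyOcc    :: String
  } deriving (Eq, Ord, Show)

-- A top-level binding together with the top-level or imported ids it mentions.
data TopBind = TopBind
  { binder :: ToyId
  , uses   :: [ToyId]
  }

-- Returns the ids of the current module that are used before (or without)
-- a definition. Imported ids are treated as already bound.
checkDepOrder :: String -> [TopBind] -> [ToyId]
checkDepOrder thisMod = go Set.empty
  where
    go _ [] = []
    go seen (TopBind b us : rest) =
      let seen' = Set.insert b seen  -- the binder may refer to itself
          bad   = [ u | u <- us
                      , toyModule u == thisMod
                      , not (u `Set.member` seen') ]
      in bad ++ go seen' rest
```

For a module "M" with bindings f and g where g uses f, the list is empty when f comes first and contains f when g comes first or when the binding for f has been dropped.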
The second analysis depends on the first property for efficiency. I could assume
that the first property holds and then always check the environment first in the
analysis, and treat ids that are not in the environment as imported, but that
seems fragile.
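A sketch of the two lookup strategies, to show where the fragility comes from. Everything here is hypothetical: ToyId, CafEnv, lookupFragile and lookupStrict are illustrative names, and recordedCafRefs stands in for whatever CAF information happens to be stored on the id.

```haskell
import qualified Data.Map.Strict as Map

-- Toy model only: not GHC's Id or IdInfo.
data ToyId = ToyId
  { toyModule       :: String
  , toyOcc          :: String
  , recordedCafRefs :: Bool  -- stand-in for the CAF info kept on the id
  } deriving (Eq, Show)

-- Analysis results for the top-level bindings processed so far.
type CafEnv = Map.Map String Bool

-- "Environment first, otherwise assume imported": a top-level id that has not
-- been analysed yet silently falls through to the info recorded on the id.
lookupFragile :: CafEnv -> ToyId -> Bool
lookupFragile env i = Map.findWithDefault (recordedCafRefs i) (toyOcc i) env

-- With a reliable local/imported distinction, a missing entry for an id of
-- the current module shows up as Nothing instead of being papered over.
lookupStrict :: String -> CafEnv -> ToyId -> Maybe Bool
lookupStrict thisMod env i
  | toyModule i == thisMod = Map.lookup (toyOcc i) env
  | otherwise              = Just (recordedCafRefs i)
```

If the bindings are not in dependency order, lookupFragile returns an answer anyway, while lookupStrict returns Nothing for the unanalysed id and so exposes the broken assumption.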
I'm wondering why we have to turn LocalIds into GlobalIds after simplification.
The note doesn't explain the reasoning. Would it be possible to preserve idScope
of ids during the whole compilation?
Thanks,
Ömer