How to distinguish local ids from imported ones in Stg? Confused about isLocalId/isLocalVar

Ömer Sinan Ağacan omeragacan at gmail.com
Tue Nov 20 08:47:57 UTC 2018


Hi,

I just found out that the semantics of isLocalId/isLocalVar change during
compilation. I learned this the hard way (after some debugging), but then
found that it is documented in the last line of this note:

    Note [GlobalId/LocalId]
    ~~~~~~~~~~~~~~~~~~~~~~~
    A GlobalId is
      * always a constant (top-level)
      * imported, or data constructor, or primop, or record selector
      * has a Unique that is globally unique across the whole
        GHC invocation (a single invocation may compile multiple modules)
      * never treated as a candidate by the free-variable finder;
            it's a constant!

    A LocalId is
      * bound within an expression (lambda, case, local let(rec))
      * or defined at top level in the module being compiled
      * always treated as a candidate by the free-variable finder

    After CoreTidy, top-level LocalIds are turned into GlobalIds

So after simplification (once CoreTidy has run), isLocalId can no longer
distinguish a top-level id defined in the current module from an imported one.

Apparently I'm not the only one who was confused by this. In StgLint we check
in-scope variables with this:

    checkInScope :: Id -> LintM ()
    checkInScope id = LintM $ \_lf loc scope errs
     -> if isLocalId id && not (id `elemVarSet` scope) then
            ((), addErr errs (hsep [ppr id, dcolon, ppr (idType id),
                                    text "is out of scope"]) loc)
        else
            ((), errs)

Note that isLocalId here returns False for local but top-level bindings. Because
of this, if I drop some top-level bindings in the module I don't get a lint
error even though some ids become unbound.

I need to distinguish a top-level bound id from an imported id for two things:

- I want to make sure, in StgLint, that bindings in the Stg program are in
  dependency order (uses come after definitions). For this I need to treat
  imported ids as already bound, but for top-level bound ids I need to check if
  I already saw the definition.

- I want to run an analysis to map top-level bindings to whether they can
  contain CAF refs. For this, I should look up a top-level bound id in the
  environment, but for an imported id I should check idInfo.
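
To make the first check concrete, here is a minimal sketch on a toy model. It
is not GHC's StgLint code: the types `Name` and `Bind` and the function
`checkDepOrder` are made up for illustration, and each binding is assumed to
carry only the top-level names its right-hand side mentions. Since isLocalId
cannot be used, the sketch tells module-bound names from imported ones by first
collecting the set of all top-level binders.

```haskell
import qualified Data.Set as Set

-- Toy stand-ins for Stg ids and top-level bindings (hypothetical, not GHC types).
type Name = String

data Bind
  = NonRec Name [Name]     -- binder, and the top-level names its RHS mentions
  | Rec [(Name, [Name])]   -- mutually recursive group
  deriving Show

bindersOf :: Bind -> [Name]
bindersOf (NonRec b _) = [b]
bindersOf (Rec prs)    = map fst prs

freeVarsOf :: Bind -> [Name]
freeVarsOf (NonRec _ fvs) = fvs
freeVarsOf (Rec prs)      = concatMap snd prs

-- Returns every use of a module-bound name that occurs before its definition.
-- Names that are not bound anywhere in the module are treated as imported,
-- i.e. as already in scope.
checkDepOrder :: [Bind] -> [Name]
checkDepOrder binds = go Set.empty binds
  where
    -- All names bound at top level in this module, collected up front.
    moduleBinders = Set.fromList (concatMap bindersOf binds)

    go _ [] = []
    go seen (b : bs) =
      let seen' = seen `Set.union` Set.fromList (bindersOf b)
          -- Binders of a Rec group are in scope in their own RHSs.
          scope = case b of
                    NonRec{} -> seen
                    Rec{}    -> seen'
          bad = [ v | v <- freeVarsOf b
                    , v `Set.member` moduleBinders
                    , not (v `Set.member` scope) ]
      in bad ++ go seen' bs
```

For example, `checkDepOrder [NonRec "g" ["f"], NonRec "f" []]` reports `"f"`,
while a use of a name that is bound nowhere in the module is accepted as an
import.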

The second analysis depends on the first property for efficiency. I could assume
that the first property holds and then always check the environment first in the
analysis, and treat ids that are not in the environment as imported, but that
seems fragile.
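
A rough sketch of the second analysis, again on a toy model rather than real
GHC types. Everything here is an assumption made for illustration: `Id`,
`TopBind`, `cafRefs`, and the `importedCafs` flag (which stands in for whatever
idInfo records about an imported id). The bindings are assumed to be
non-recursive and already in dependency order, which is exactly the first
property. Instead of treating "not in the environment" as "imported", the
sketch decides by membership in the set of module binders, and falls back to a
conservative answer if a module-bound id has not been seen yet.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

type Name = String

-- Hypothetical stand-in for an Stg id.
data Id = Id
  { idName       :: Name
  , importedCafs :: Bool   -- models the CAF flag in idInfo; only used for imports
  }

-- Hypothetical stand-in for a top-level Stg binding.
data TopBind = TopBind
  { tbBinder :: Name
  , tbIsCaf  :: Bool       -- the binding itself is a CAF
  , tbRefs   :: [Id]       -- top-level ids mentioned in the RHS
  }

-- Maps each top-level binder to whether it may contain CAF refs.
-- Assumes non-recursive bindings in dependency order.
cafRefs :: [TopBind] -> Map.Map Name Bool
cafRefs binds = foldl step Map.empty binds
  where
    moduleBinders = Set.fromList (map tbBinder binds)

    step env (TopBind b isCaf refs) =
      Map.insert b (isCaf || any (mayHaveCafs env) refs) env

    mayHaveCafs env i
      -- Bound in this module: consult the environment built so far.
      -- If it is missing (order violated), answer True to stay conservative.
      | idName i `Set.member` moduleBinders =
          Map.findWithDefault True (idName i) env
      -- Otherwise it is imported: trust the flag carried by the id.
      | otherwise = importedCafs i
```

With the binder set computed up front, a missing environment entry signals an
ordering violation instead of being silently read as "imported".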

I'm wondering why we have to turn LocalIds into GlobalIds after simplification.
The note doesn't explain the reasoning. Would it be possible to preserve idScope
of ids during the whole compilation?

Thanks,

Ömer

