Patch/feature proposal: "Source plugins"

Edsko de Vries edskodevries at gmail.com
Wed Jun 5 13:51:55 CEST 2013


Sorry for the earlier mishap, here's the full email.

Hi all,

The plugin mechanism gives access to the program in Core; this suffices for
many but not quite all purposes. Tools that need access to the original AST
can call typecheckModule directly, but of course this requires using the
GHC API directly. Moreover, even when using the GHC API directly anyway (as
in my case), it means that tools cannot take advantage of ghc's
infrastructure for dependency tracking, recompiling only changed modules,
etc.

Hence it would be useful to have "source plugins", which can be used both
externally and when using the GHC API (in the latter case I guess "hooks" would
be the more appropriate terminology). Currently "core plugins" are recorded
as part of DynFlags as

    pluginModNames        :: [ModuleName],
    pluginModNameOpts     :: [(ModuleName,String)],

This makes sense when thinking of plugins only as an external mechanism,
but is less convenient when using them as internal hooks, too. In my draft
patch I introduced a new type "HscPlugin" (described shortly) and added

    sourcePlugins         :: [HscPlugin],

to DynFlags. HscPlugin is a record of a pair of functions; having the
actual record here rather than a module name means that these functions
can have a non-empty closure, which is obviously convenient when using this
as a hook rather than an external plugin.

In my current version HscPlugin looks like

    data HscPlugin = HscPlugin {
        runHscPlugin :: forall m. MonadIO m
                     => DynFlags
                     -> TcGblEnv
                     -> m TcGblEnv

      , runHscQQ     :: forall m. MonadIO m
                     => Env TcGblEnv TcLclEnv
                     -> HsQuasiQuote Name
                     -> m (HsQuasiQuote Name)
      }

runHscPlugin is the main function; it gets passed the TcGblEnv (which
contains the type checked AST as its tcg_binds field) and gets a chance to
return it modified (we don't currently take advantage of that; I did that
only to be in line with "core plugins").
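
To illustrate why a record of functions is convenient as a hook, here is a
minimal sketch using stand-in types rather than the real GHC ones. DynFlags
and TcGblEnv below are toy placeholders, envModule and recordModules are
hypothetical names, and the record is simplified to the main hook only. The
hook closes over an IORef, which a plugin identified by a module name could
not do.

```haskell
{-# LANGUAGE RankNTypes #-}
module Main where

import Control.Monad.IO.Class (MonadIO, liftIO)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Stand-ins for the real GHC types (assumptions, not the GHC API).
data DynFlags = DynFlags
newtype TcGblEnv = TcGblEnv { envModule :: String }

-- Simplified version of the proposed record: only the main hook.
newtype HscPlugin = HscPlugin
  { runHscPlugin :: forall m. MonadIO m => DynFlags -> TcGblEnv -> m TcGblEnv }

-- Hypothetical hook: it closes over 'seen' and records every module it is
-- run on, returning the environment unchanged.
recordModules :: IORef [String] -> HscPlugin
recordModules seen = HscPlugin
  { runHscPlugin = \_dflags env -> do
      liftIO (modifyIORef' seen (envModule env :))
      return env
  }

main :: IO ()
main = do
  seen <- newIORef []
  let plugin = recordModules seen
  _ <- runHscPlugin plugin DynFlags (TcGblEnv "A")
  _ <- runHscPlugin plugin DynFlags (TcGblEnv "B")
  readIORef seen >>= print  -- most recent module first
```

In a real client the IORef would be replaced by whatever state the tool
already maintains.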

Unfortunately, the typechecked AST is only a subset of the renamed AST (see
http://www.haskell.org/pipermail/ghc-devs/2013-February/000540.html). The
TcGblEnv contains a tcg_rn_decls field, which is a reference to the full
renamed (as opposed to typechecked) AST, but by default this is not
initialized: the typechecker only optionally retains the renamed AST, and
this is hardcoded to be False. In my current patch I have changed this so
that it's hardcoded to be True; ideally this should become an option in
DynFlags (more ideal still would be if the type checked AST would not lose
any information).

Unfortunately, even the renamer loses information: quasi-quotes get
expanded during renaming, and all evidence that there was ever a
quasi-quote has disappeared by the time the renamer returns. For this
reason, the HscPlugin type that I'm using at the moment also has a hook for
quasi-quotes.

So what I have currently done is:

   1. Introduced the above HscPlugin type and added a corresponding field
   to DynFlags
   2. Call runHscQQ in the renamer whenever a quasi-quote gets expanded.
   3. Make sure that the typechecker passes the result of the renamer
   through.
   4. Call runHscPlugin on the result of the typechecker.

In my client code I then combine the information obtained from these three
sources (2, 3, 4).
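
The four steps above can be sketched as a toy pipeline. Everything here is an
assumption made for illustration: the types are stand-ins for the real GHC
ones (declarations are plain strings, and the DynFlags and Env arguments are
dropped), and compileToy, collector and Collected are hypothetical names. The
sketch only shows where the two hooks are called and how a client might
combine the three sources of information.

```haskell
{-# LANGUAGE RankNTypes #-}
module Main where

import Control.Monad.IO.Class (MonadIO, liftIO)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)
import Data.List (isPrefixOf)

-- Stand-ins (assumptions): the real GHC types are far richer.
newtype QuasiQuote = QuasiQuote String deriving (Eq, Show)

data TcGblEnv = TcGblEnv
  { renamedDecls     :: Maybe [String]  -- plays the role of tcg_rn_decls
  , typecheckedBinds :: [String]        -- plays the role of the typechecked AST
  } deriving (Eq, Show)

data HscPlugin = HscPlugin
  { runHscPlugin :: forall m. MonadIO m => TcGblEnv -> m TcGblEnv
  , runHscQQ     :: forall m. MonadIO m => QuasiQuote -> m QuasiQuote
  }

-- What the hypothetical client gathers from sources 2, 3 and 4.
data Collected = Collected
  { quasiQuotes :: [QuasiQuote]
  , renamed     :: Maybe [String]
  , typechecked :: [String]
  } deriving (Eq, Show)

emptyCollected :: Collected
emptyCollected = Collected [] Nothing []

-- A hook that stores everything it sees and changes nothing.
collector :: IORef Collected -> HscPlugin
collector ref = HscPlugin
  { runHscPlugin = \env -> do
      liftIO (modifyIORef' ref (\c -> c { renamed     = renamedDecls env
                                        , typechecked = typecheckedBinds env }))
      return env
  , runHscQQ = \qq -> do
      liftIO (modifyIORef' ref (\c -> c { quasiQuotes = quasiQuotes c ++ [qq] }))
      return qq
  }

-- Toy pipeline mirroring steps 2 to 4.
compileToy :: HscPlugin -> [QuasiQuote] -> [String] -> IO TcGblEnv
compileToy plugin qqs decls = do
  -- Step 2: the "renamer" reports each quasi-quote before expanding it.
  mapM_ (runHscQQ plugin) qqs
  -- Step 3: the "typechecker" passes the renamed declarations through,
  -- while its own output drops data type declarations.
  let env = TcGblEnv (Just decls) (filter (not . isPrefixOf "data ") decls)
  -- Step 4: the main hook runs on the typechecker's result.
  runHscPlugin plugin env

main :: IO ()
main = do
  ref <- newIORef emptyCollected
  _ <- compileToy (collector ref) [QuasiQuote "[expr|1 + 2|]"]
                  ["data T = MkT", "f = MkT"]
  readIORef ref >>= print
```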

The path of least resistance for me currently to submit this as a patch for
ghc would therefore be to submit a patch that does precisely what I described
above, mutatis mutandis based on your feedback, except that I would add an
option to DynFlags that tells the type checker whether or not to pass the
result of the renamer through, rather than hardcoding it.
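
A minimal sketch of what such an option could look like, again with stand-in
types. The field name keepRenamedSource and the function typecheckToy are
hypothetical, chosen only for this example; they are not existing GHC names.

```haskell
module Main where

import Data.List (isPrefixOf)

-- Stand-in for DynFlags; 'keepRenamedSource' is a hypothetical flag name.
newtype DynFlags = DynFlags { keepRenamedSource :: Bool }

-- Stand-in for TcGblEnv, with declarations modelled as plain strings.
data TcGblEnv = TcGblEnv
  { renamedDecls     :: Maybe [String]  -- plays the role of tcg_rn_decls
  , typecheckedBinds :: [String]
  } deriving (Eq, Show)

-- Toy typechecker: it retains the renamed declarations only when the flag
-- asks for it, instead of hardcoding the choice.
typecheckToy :: DynFlags -> [String] -> TcGblEnv
typecheckToy dflags decls = TcGblEnv
  { renamedDecls     = if keepRenamedSource dflags then Just decls else Nothing
  , typecheckedBinds = filter (not . isPrefixOf "data ") decls
  }

main :: IO ()
main = do
  let decls = ["data T = MkT", "f = MkT"]
  print (typecheckToy (DynFlags False) decls)
  print (typecheckToy (DynFlags True) decls)
```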

It is a little bit messy mostly because parts of the AST get lost along the
way: quasi-quotes in the renamer, data type declarations and other things
during type checking. A more ideal way, but also more time consuming, would
be to change this so that the renamer leaves evidence of the quasi-quotes
in the tree, and the type checker returns the entire tree type checked,
rather than just a subset. I think that ultimately this is the better
approach, at least for our purposes (I'm not sure about other tools), but
since this would be a larger change that affects larger parts of the ghc
pipeline I'm not sure that I'll be able to do it.

Any and all feedback on the above would be appreciated!

Edsko