[GHC] #9805: Use TrieMaps to speed up type class instance lookup

GHC ghc-devs at haskell.org
Mon Nov 17 02:24:00 UTC 2014


#9805: Use TrieMaps to speed up type class instance lookup
-------------------------------------+-------------------------------------
       Reporter:  ezyang             |                   Owner:  ezyang
           Type:  task               |                  Status:  new
       Priority:  normal             |               Milestone:
      Component:  Compiler (Type     |                 Version:  7.9
  checker)                           |        Operating System:
       Keywords:                     |  Unknown/Multiple
   Architecture:  Unknown/Multiple   |         Type of failure:
     Difficulty:  Unknown            |  None/Unknown
     Blocked By:                     |               Test Case:
Related Tickets:                     |                Blocking:
                                     |  Differential Revisions:
-------------------------------------+-------------------------------------
 Currently, type class instance resolution is performed by doing a map
 lookup by type class, and then linearly matching against every instance.
 This is not great, and for a while, simonpj has been keen on using the
 TrieMap data structure in GHC, which has been previously used to implement
 a map from CoreExprs to various things (e.g. for CSE). What makes this a
 little tricky is that instance lookup isn't intended to be an exact match:
 we should unify so-called template variables and provide a substitution;
 furthermore, there may be multiple matches.

 After some prototyping, I think we should be able to make this work. The
 primary refactoring we have to do is *maintain the key associated with an
 entry in a TrieMap*. Unlike the current uses of TrieMaps, where it's
 acceptable to lose the underlying key associated with an entry in the
 TrieMap, we need to know the key when doing unification, because it may be
 reported in the substitution. There are a few knock-on effects of this:

   * We should add a method `foldWithKeyTM :: (Key m -> a -> b -> b) -> m a
 -> b -> b` to the `TrieMap` type class.
   * We need a variant of `UniqFM` which tracks the original key that was
 used. I propose we name it `UniqKM` (unique key map). A number of
 implementations of `TrieMap` need to be adjusted to use this instead of
 `UniqFM`.
   * Why can't we just represent keyed TrieMaps as `TypeMap (Type, a)`? It
 might be possible. An insurmountable difficulty would be if we need to
 know the partial key PRIOR to having finished traversing the TrieMap;
 however, for the parts of the unification algorithm I've implemented, this
 does not seem to be the case. The primary actual difficulty is that `Type`
 uses a named representation, whereas `TypeMap` keys are on-the-fly
 deBruijn numbered. There would at least be some annoying fiddliness.
   * Reversing the deBruijn numbered key into a `Type` is a bit goofy.
 Essentially, you need the reverse of the current `CmEnv`: a mapping from
 de Bruijn levels to the `Var` you've decided to allocate. (I've called
 this `CfEnv` in my prototype)
   * When we represent a TrieMap binder, we have to put the binder map on
 the OUTSIDE, as opposed to the inside as it is currently. Otherwise, we
 can't update the `CfEnv` with the new mapping before making the recursive
 call to fold-with-key.

 I'll attach the standalone Haskell file I used to prototype this, wherein
 I copy-pasted a lot of Haskell from GHC's source and implemented fuzzy map
 on a simplified version of `Type`.

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/9805>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler


More information about the ghc-tickets mailing list