[GHC] #9805: Use TrieMaps to speed up type class instance lookup
GHC
ghc-devs at haskell.org
Mon Nov 17 02:24:00 UTC 2014
#9805: Use TrieMaps to speed up type class instance lookup
-------------------------------------+-------------------------------------
Reporter: ezyang | Owner: ezyang
Type: task | Status: new
Priority: normal | Milestone:
Component: Compiler (Type | Version: 7.9
checker) | Operating System:
Keywords: | Unknown/Multiple
Architecture: Unknown/Multiple | Type of failure:
Difficulty: Unknown | None/Unknown
Blocked By: | Test Case:
Related Tickets: | Blocking:
| Differential Revisions:
-------------------------------------+-------------------------------------
Currently, type class instance resolution is performed by doing a map
lookup by type class, and then linearly matching against every instance.
This is not great, and for a while, simonpj has been keen on using the
TrieMap data structure in GHC, which has been previously used to implement
a map from CoreExprs to various things (e.g. for CSE). What makes this a
little tricky is that instance lookup isn't intended to be an exact match:
we should unify so-called template variables and provide a substitution;
furthermore, there may be multiple matches.
After some prototyping, I think we should be able to make this work. The
primary refactoring we have to do is *maintain the key associated with an
entry in a TrieMap*. Unlike the current uses of TrieMaps, where it's
acceptable to lose the underlying key associated with an entry in the
TrieMap, we need to know the key when doing unification, because it may be
reported in the substitution. There are a few knock-on effects of this:
* We should add a method `foldWithKeyTM :: (Key m -> a -> b -> b) -> m a
-> b -> b` to the `TrieMap` type class.
* We need a variant of `UniqFM` which tracks the original key that was
used. I propose we name it `UniqKM` (unique key map). A number of
implementations of `TrieMap` need to be adjusted to use this instead of
`UniqFM`.
* Why can't we just represent keyed TrieMaps as `TypeMap (Type, a)`? It
might be possible. An insurmountable difficulty would be if we need to
know the partial key PRIOR to having finished traversing the TrieMap;
however, for the parts of the unification algorithm I've implemented, this
does not seem to be the case. The primary actual difficulty is that `Type`
uses a named representation, whereas `TypeMap` keys are on-the-fly
deBruijn numbered. There would at least be some annoying fiddliness.
* Reversing the deBruijn numbered key into a `Type` is a bit goofy.
Essentially, you need the reverse of the current `CmEnv`: a mapping from
de Bruijn levels to the `Var` you've decided to allocate. (I've called
this `CfEnv` in my prototype)
* When we represent a TrieMap binder, we have to put the binder map on
the OUTSIDE, as opposed to the inside as it is currently. Otherwise, we
can't update the `CfEnv` with the new mapping before making the recursive
call to fold-with-key.
I'll attach the standalone Haskell file I used to prototype this, wherein
I copy-pasted a lot of Haskell from GHC's source and implemented fuzzy map
on a simplified version of `Type`.
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/9805>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
More information about the ghc-tickets
mailing list