[GHC] #9960: Performance problem with TrieMap

GHC ghc-devs at haskell.org
Mon Jan 5 22:54:09 UTC 2015


#9960: Performance problem with TrieMap
-------------------------------------+-------------------------------------
              Reporter:  simonpj     |             Owner:
                  Type:  bug         |            Status:  new
              Priority:  normal      |         Milestone:
             Component:  Compiler    |           Version:  7.8.4
              Keywords:              |  Operating System:  Unknown/Multiple
          Architecture:              |   Type of failure:  None/Unknown
  Unknown/Multiple                   |        Blocked By:
             Test Case:              |   Related Tickets:
              Blocking:              |
Differential Revisions:              |
-------------------------------------+-------------------------------------
 Looking at #9805 reminded me.  When investigating the performance of #9872
 I found a major performance hole in `TrieMap`.

 Suppose we have a handful H of entries in a `TrieMap`, each with a very
 large key, size K.  If you fold over such a `TrieMap` you'd expect time
 O(H). That would certainly be true of an association list!  But with
 `TrieMap` we actually have to navigate down a long singleton structure to
 get to the elements, so it takes time O(K*H).

 This is pretty bad.  In test `T9872` I found that 90% of compile time was
 spent in `TrieMap` as we repeatedly built, folded, and then rebuilt
 `TrieMap`s.  I fixed that by keeping smaller `TrieMap`s, but the
 underlying problem remains.

 The point of a `TrieMap` is that you need to navigate to the point where
 only one key remains, and then things should be fast.  So I think that
 `TypeMap`, say, should look like
 {{{
 data TypeMap a
   = EmptyTM
   | SingletonTM Type a
   | TM { tm_var   :: VarMap a
        , tm_app    :: TypeMap (TypeMap a)
        , tm_fun    :: TypeMap (TypeMap a)
        , tm_tc_app :: NameEnv (ListMap TypeMap a)
        , tm_forall :: TypeMap (BndrMap a)
        , tm_tylit  :: TyLitMap a
        }
 }}}
 The `SingletonTM` means that, once you are down to a single (key,value)
 pair, you stop and
 just use `SingletonTM`.

 Tiresomely, because of the deBruijn thing, it'd need to be
 {{{
 ...
   | SingletonTM (CMEnv, Type) a
 ...
 }}}
 and we'd need an auxiliary function for equality:
 {{{
 eqTypeModuloDeBruijn :: (CMEnv, Type) -> (CEnv, Type) -> Bool
 }}}
 But that's not hard. Very similar to the way `RnEnv2` works.

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/9960>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler


More information about the ghc-tickets mailing list