[GHC] #13059: High memory usage during compilation

GHC ghc-devs at haskell.org
Sun Feb 12 02:16:31 UTC 2017


#13059: High memory usage during compilation
-------------------------------------+-------------------------------------
        Reporter:  domenkozar        |                Owner:
            Type:  bug               |               Status:  new
        Priority:  highest           |            Milestone:  8.2.1
       Component:  Compiler          |              Version:  8.0.2-rc2
      Resolution:                    |             Keywords:  Generics
Operating System:  Unknown/Multiple  |         Architecture:  x86_64
 Type of failure:  Compile-time      |  (amd64)
  performance bug                    |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:  #5642, #11068     |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------
Changes (by RyanGlScott):

 * keywords:   => Generics
 * related:   => #5642, #11068


Comment:

 After staring at this some more, I think I have a more solid grasp of what
 is going on here.

 I stared at the code pre-54b887b5abf6ee723c6ac6aaa2d2f4c14cf74060 (which I
 will henceforth refer to as "8.0.1" for brevity), and noticed something
 interesting about the code which generates default method implementations
 for things that use `DefaultSignatures` (`GenericDM`):

 {{{#!hs
 tc_default sel_id (Just (dm_name, GenericDM {}))
   = do { meth_bind <- mkGenericDefMethBind clas inst_tys sel_id dm_name
        ; tcMethodBody clas tyvars dfun_ev_vars inst_tys
                               dfun_ev_binds is_derived hs_sig_fn prags
                               sel_id meth_bind inst_loc }
 }}}

 Compare this to the treatment for non-`DefaultSignatures`-using default
 method implementations (`VanillaDM`):

 {{{#!hs
 tc_default sel_id (Just (dm_name, VanillaDM)) -- A polymorphic default method
   = do {

          ...

        ; let dm_inline_prag = idInlinePragma dm_id
               rhs = HsWrap (mkWpEvVarApps [self_dict] <.> mkWpTyApps inst_tys) $
                    HsVar (noLoc dm_id)

              meth_bind = mkVarBind local_meth_id (L inst_loc rhs)
              meth_id1 = meth_id `setInlinePragma` dm_inline_prag
                     -- Copy the inline pragma (if any) from the default
                     -- method to this version. Note [INLINE and default methods]

              export = ABE { abe_wrap = idHsWrapper
                           , abe_poly = meth_id1
                           , abe_mono = local_meth_id
                            , abe_prags = mk_meth_spec_prags meth_id1 spec_inst_prags [] }
               bind = AbsBinds { abs_tvs = tyvars, abs_ev_vars = dfun_ev_vars
                               , abs_exports = [export]
                               , abs_ev_binds = [EvBinds (unitBag self_ev_bind)]
                               , abs_binds    = unitBag meth_bind }

        ; return (meth_id1, L inst_loc bind, Nothing) }
 }}}

 One crucial difference between the two is that in the latter `VanillaDM`
 case, it explicitly copies any user-supplied `INLINE` pragmas, whereas in
 the former `GenericDM` case, it does not! (I've perused the definition of
 `tcMethodBody` to make sure of this.)

 Now after 54b887b5abf6ee723c6ac6aaa2d2f4c14cf74060 (which I will refer to
 as "8.0.2" from here on out), the treatment of generic and non-generic
 default method implementations was unified and split out into the
 `mkDefMethBind` function (source
 [http://git.haskell.org/ghc.git/blob/d3ea38ef0299e9330a105fa59dda38f9ec0712c4:/compiler/typecheck/TcInstDcls.hs#l1538
 here]). It is quite similar to the treatment of `VanillaDM` before, and in
 particular, it also copies any user-supplied `INLINE` pragmas.

 What does this have to do with `store`? The code in `store` uses a design
 pattern that tends to result in huge amounts of memory consumption: it
 uses lots of `GHC.Generics` instances and marks pretty much everything
 with `INLINE`. Overly aggressive use of `GHC.Generics` has been known to
 lead to memory blowup before (see #5642 and #11068). The common patterns
 between those tickets are:

 1. Defining "generic classes" with more than one method
 2. Putting `INLINE` pragmas on the methods of all instances for `GHC.Generics` datatypes

 `store` does not exhibit the first issue, so I can state with reasonable
 certainty that this problem falls into the second camp.
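
 For concreteness, here is a minimal sketch of what the second pattern looks
 like in source code. All of the names (`Size`, `GSize`, `genericSize`,
 `Foo`) are hypothetical stand-ins that I made up for illustration; this is
 not the actual code from `store`, just a single-method generic class whose
 instances are all marked `INLINE`:

 {{{#!hs
 {-# LANGUAGE DefaultSignatures #-}
 {-# LANGUAGE DeriveGeneric     #-}
 {-# LANGUAGE FlexibleContexts  #-}
 {-# LANGUAGE TypeOperators     #-}

 import GHC.Generics

 -- Hypothetical user-facing class with a DefaultSignatures-based default.
 class Size a where
   size :: a -> Int
   default size :: (Generic a, GSize (Rep a)) => a -> Int
   size = genericSize
   {-# INLINE size #-}

 genericSize :: (Generic a, GSize (Rep a)) => a -> Int
 genericSize = gsize . from
 {-# INLINE genericSize #-}

 -- The "generic class": one method only, so pattern 1 does not apply.
 class GSize f where
   gsize :: f p -> Int

 -- Pattern 2: every instance for a GHC.Generics representation type
 -- carries an INLINE pragma.
 instance GSize U1 where
   gsize _ = 0
   {-# INLINE gsize #-}

 instance (GSize f, GSize g) => GSize (f :*: g) where
   gsize (x :*: y) = gsize x + gsize y
   {-# INLINE gsize #-}

 instance (GSize f, GSize g) => GSize (f :+: g) where
   gsize (L1 x) = gsize x
   gsize (R1 y) = gsize y
   {-# INLINE gsize #-}

 instance GSize f => GSize (M1 i c f) where
   gsize (M1 x) = gsize x
   {-# INLINE gsize #-}

 instance Size c => GSize (K1 i c) where
   gsize (K1 x) = size x
   {-# INLINE gsize #-}

 -- Base instances with made-up sizes.
 instance Size Int where
   size _ = 8
   {-# INLINE size #-}

 instance Size Bool where
   size _ = 1
   {-# INLINE size #-}

 data Foo = Foo Int Bool Int
   deriving Generic

 -- Empty instance body: picks up the generic default method.
 instance Size Foo

 main :: IO ()
 main = print (size (Foo 1 True 2))
 }}}

 A single instance like this is harmless; the trouble only shows up when a
 library defines a very large number of such instances.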

 So why did this problem only arise in 8.0.2? In 8.0.1 and earlier, GHC
 would desugar `DefaultSignatures` method implementations like:

 {{{#!hs
 instance Store Foo where
   size = $gdm_size Foo $dStoreFoo
 }}}

 This //lacks// an `INLINE` pragma. But in 8.0.2, it's desugared as:

 {{{#!hs
 instance Store Foo where
   size = $gdm_size @Foo
   {-# INLINE size #-}
 }}}

 But `$gdm_size = genericSize`, and `genericSize` is the composition of
 tons of `INLINE`d `GHC.Generics` functionality. We know that
 `GHC.Generics` has a propensity for high memory usage, and adding this
 extra `INLINE` pragma is enough to make the difference between 1.6 GB and
 5.17 GB of memory when compiling the insanely high number of
 `GHC.Generics`-based instances in `store`. This claim is backed up by
 comment:17, where explicitly implementing the instances and adding
 `INLINE` pragmas resulted in the same high memory usage in both 8.0.1 and
 8.0.2.
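
 As a rough source-level analogue of the two desugarings above, the
 following sketch writes the instances out by hand. Again, `Size`, `GSize`,
 `genericSize`, `Foo` and `Bar` are hypothetical names of my own, not taken
 from `store`. The `Foo` instance mirrors the 8.0.1-style binding (no
 `INLINE`), and the `Bar` instance mirrors the 8.0.2-style binding (with
 `INLINE`). Both compute the same thing; the only difference is how much
 the simplifier is asked to inline:

 {{{#!hs
 {-# LANGUAGE DeriveGeneric    #-}
 {-# LANGUAGE FlexibleContexts #-}
 {-# LANGUAGE TypeOperators    #-}

 import GHC.Generics

 class Size a where
   size :: a -> Int

 instance Size Int where
   size _ = 8

 -- Minimal generic machinery, enough for product types only.
 class GSize f where
   gsize :: f p -> Int

 instance (GSize f, GSize g) => GSize (f :*: g) where
   gsize (x :*: y) = gsize x + gsize y
   {-# INLINE gsize #-}

 instance GSize f => GSize (M1 i c f) where
   gsize (M1 x) = gsize x
   {-# INLINE gsize #-}

 instance Size c => GSize (K1 i c) where
   gsize (K1 x) = size x
   {-# INLINE gsize #-}

 genericSize :: (Generic a, GSize (Rep a)) => a -> Int
 genericSize = gsize . from
 {-# INLINE genericSize #-}

 data Foo = Foo Int Int     deriving Generic
 data Bar = Bar Int Int Int deriving Generic

 -- Analogue of the 8.0.1-style binding: no INLINE on the method.
 instance Size Foo where
   size = genericSize

 -- Analogue of the 8.0.2-style binding: the method is marked INLINE,
 -- as it is when the pragma is copied from the default method.
 instance Size Bar where
   size = genericSize
   {-# INLINE size #-}

 main :: IO ()
 main = do
   print (size (Foo 1 2))
   print (size (Bar 1 2 3))
 }}}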

 So in the end, this is really just another case of `GHC.Generics` being
 (ab)used in a way that it doesn't handle well (at least, not at the
 moment). I didn't spend much time digging into the space profiles for
 these programs in 8.0.1 and 8.0.2, as both just stated that
 `SimplTopBinds` took the bulk of the time without going into more detail,
 which wasn't terribly helpful.

 But all is not lost for `store`, as this analysis reveals a workaround. If
 you remove just the right `INLINE` pragmas from `store`:

 {{{#!diff
 diff --git a/src/Data/Store/Impl.hs b/src/Data/Store/Impl.hs
 index 26ea82e..bbf62a5 100644
 --- a/src/Data/Store/Impl.hs
 +++ b/src/Data/Store/Impl.hs
 @@ -64,15 +64,15 @@ class Store a where

      default size :: (Generic a, GStoreSize (Rep a)) => Size a
      size = genericSize
 -    {-# INLINE size #-}
 +    -- {-# INLINE size #-}

      default poke :: (Generic a, GStorePoke (Rep a)) => a -> Poke ()
      poke = genericPoke
 -    {-# INLINE poke #-}
 +    -- {-# INLINE poke #-}

      default peek :: (Generic a , GStorePeek (Rep a)) => Peek a
      peek = genericPeek
 -    {-# INLINE peek #-}
 +    -- {-# INLINE peek #-}

  ------------------------------------------------------------------------
  -- Utilities for encoding / decoding strict ByteStrings
 }}}

 Then I can confirm that compiling `store` with either GHC 8.0.1 or 8.0.2
 results in about 1.7 GB of maximum memory residency. And given the
 [https://github.com/bos/aeson/pull/335#issuecomment-172730336 benchmark
 results for a similar patch] that I made for `aeson`, I don't think this
 would result in much of a difference in performance, if any.

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/13059#comment:20>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

