[GHC] #13059: High memory usage during compilation
GHC
ghc-devs at haskell.org
Sun Feb 12 02:16:31 UTC 2017
#13059: High memory usage during compilation
-------------------------------------+-------------------------------------
Reporter: domenkozar | Owner:
Type: bug | Status: new
Priority: highest | Milestone: 8.2.1
Component: Compiler | Version: 8.0.2-rc2
Resolution: | Keywords: Generics
Operating System: Unknown/Multiple | Architecture: x86_64 (amd64)
Type of failure: Compile-time performance bug | Test Case:
Blocked By: | Blocking:
Related Tickets: #5642, #11068 | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Changes (by RyanGlScott):
* keywords: => Generics
* related: => #5642, #11068
Comment:
After staring at this some more, I think I have a more solid grasp of what
is going on here.
I stared at the code pre-54b887b5abf6ee723c6ac6aaa2d2f4c14cf74060 (which I
will henceforth refer to as "8.0.1" for brevity), and noticed something
interesting about the code which generates default method implementations
for things that use `DefaultSignatures` (`GenericDM`):
{{{#!hs
tc_default sel_id (Just (dm_name, GenericDM {}))
  = do { meth_bind <- mkGenericDefMethBind clas inst_tys sel_id dm_name
       ; tcMethodBody clas tyvars dfun_ev_vars inst_tys
                      dfun_ev_binds is_derived hs_sig_fn prags
                      sel_id meth_bind inst_loc }
}}}
Compare this to the treatment for non-`DefaultSignatures`-using default
method implementations (`VanillaDM`):
{{{#!hs
tc_default sel_id (Just (dm_name, VanillaDM)) -- A polymorphic default method
  = do {
       ...
       ; let dm_inline_prag = idInlinePragma dm_id
             rhs = HsWrap (mkWpEvVarApps [self_dict] <.> mkWpTyApps inst_tys) $
                   HsVar (noLoc dm_id)
             meth_bind = mkVarBind local_meth_id (L inst_loc rhs)
             meth_id1 = meth_id `setInlinePragma` dm_inline_prag
                    -- Copy the inline pragma (if any) from the default
                    -- method to this version. Note [INLINE and default methods]
             export = ABE { abe_wrap = idHsWrapper
                          , abe_poly = meth_id1
                          , abe_mono = local_meth_id
                          , abe_prags = mk_meth_spec_prags meth_id1
                                                           spec_inst_prags [] }
             bind = AbsBinds { abs_tvs = tyvars, abs_ev_vars = dfun_ev_vars
                             , abs_exports = [export]
                             , abs_ev_binds = [EvBinds (unitBag self_ev_bind)]
                             , abs_binds = unitBag meth_bind }
       ; return (meth_id1, L inst_loc bind, Nothing) }
}}}
One crucial difference between the two is that in the latter `VanillaDM`
case, it explicitly copies any user-supplied `INLINE` pragmas, whereas in
the former `GenericDM` case, it does not! (I've perused the definition of
`tcMethodBody` to make sure of this.)
Now after 54b887b5abf6ee723c6ac6aaa2d2f4c14cf74060 (which I will refer to
as "8.0.2" from here on out), the treatment of generic and non-generic
default method implementations was unified and split out into the
`mkDefMethBind` function (source
[http://git.haskell.org/ghc.git/blob/d3ea38ef0299e9330a105fa59dda38f9ec0712c4:/compiler/typecheck/TcInstDcls.hs#l1538 here]).
It is quite similar to the earlier treatment of `VanillaDM`, and in
particular, it also copies any user-supplied `INLINE` pragmas.
What does this have to do with `store`? The code in `store` uses a design
pattern that tends to result in huge amounts of memory consumption: it
uses lots of `GHC.Generics` instances and marks pretty much everything
with `INLINE`. Overly aggressive use of `GHC.Generics` has been known to
lead to memory blowup before (see #5642 and #11068). The common patterns
between those tickets are:
1. Defining "generic classes" with more than one method
2. Putting `INLINE` pragmas on the methods of all instances for `GHC.Generics` datatypes
`store` does not exhibit the first issue, so I can state with reasonable
certainty that this problem falls into the second camp.
So why did this problem only arise in 8.0.2? In 8.0.1 and earlier, GHC
would desugar `DefaultSignatures` method implementations like:
{{{#!hs
instance Store Foo where
  size = $gdm_size Foo $dStoreFoo
}}}
This //lacks// an `INLINE` pragma. But in 8.0.2, it's desugared as:
{{{#!hs
instance Store Foo where
  size = $gdm_size @Foo
  {-# INLINE size #-}
}}}
But `$gdm_size = genericSize`, and `genericSize` is the composition of
tons of `INLINE`d `GHC.Generics` functionality. We know that
`GHC.Generics` has a propensity for high memory usage, and adding this
extra `INLINE` pragma is enough to make the difference between 1.6 GB and
5.17 GB of memory when compiling the insanely high number of
`GHC.Generics`-based instances in `store`. This claim is backed up by
comment:17, where explicitly implementing the instances and adding
`INLINE` pragmas resulted in the same high memory usage in both 8.0.1 and
8.0.2.
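To make the pattern concrete, here is a minimal, self-contained sketch of a class that follows the same shape as `Store`. It is not code from `store`: the names `Sized`, `GSized`, `gsize` and `Foo` are hypothetical stand-ins, and the byte sizes are made up. It only illustrates where the `INLINE` pragma on the default method sits, which is the pragma that 8.0.2 copies onto the method of every instance that relies on the default.

{{{#!hs
{-# LANGUAGE DefaultSignatures #-}
{-# LANGUAGE DeriveGeneric     #-}
{-# LANGUAGE FlexibleContexts  #-}
{-# LANGUAGE TypeOperators     #-}
module Main (main) where

import GHC.Generics

-- Hypothetical single-method class in the style of Store.
class Sized a where
  size :: a -> Int
  default size :: (Generic a, GSized (Rep a)) => a -> Int
  size = genericSize
  {-# INLINE size #-}  -- copied to each defaulted instance method in 8.0.2

-- Hypothetical generic worker class in the style of GStoreSize.
class GSized f where
  gsize :: f p -> Int

instance GSized U1 where
  gsize _ = 0
  {-# INLINE gsize #-}

instance (GSized f, GSized g) => GSized (f :*: g) where
  gsize (x :*: y) = gsize x + gsize y
  {-# INLINE gsize #-}

instance (GSized f, GSized g) => GSized (f :+: g) where
  gsize (L1 x) = gsize x
  gsize (R1 y) = gsize y
  {-# INLINE gsize #-}

instance GSized f => GSized (M1 i c f) where
  gsize (M1 x) = gsize x
  {-# INLINE gsize #-}

instance Sized c => GSized (K1 i c) where
  gsize (K1 x) = size x
  {-# INLINE gsize #-}

-- The generic implementation is a composition of INLINEd pieces.
genericSize :: (Generic a, GSized (Rep a)) => a -> Int
genericSize = gsize . from
{-# INLINE genericSize #-}

-- Made-up base sizes, for illustration only.
instance Sized Int where
  size _ = 8

instance Sized Bool where
  size _ = 1

data Foo = Foo Int Bool Int
  deriving Generic

-- Empty instance body: size comes from the generic default above.
instance Sized Foo

main :: IO ()
main = print (size (Foo 1 True 2))
}}}

A single instance like this is harmless. The memory cost discussed here only shows up when a package defines a very large number of such instances, as `store` does.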
So in the end, this is really just another case of `GHC.Generics` being
(ab)used in a way that it doesn't handle well (at least, not at the
moment). I didn't spend much time digging into the space profiles for
these programs in 8.0.1 and 8.0.2, as both just stated that
`SimplTopBinds` took the bulk of the time without going into more detail,
which wasn't terribly helpful.
But all is not lost for `store`, as this analysis reveals a workaround. If
you remove just the right `INLINE` pragmas from `store`:
{{{#!diff
diff --git a/src/Data/Store/Impl.hs b/src/Data/Store/Impl.hs
index 26ea82e..bbf62a5 100644
--- a/src/Data/Store/Impl.hs
+++ b/src/Data/Store/Impl.hs
@@ -64,15 +64,15 @@ class Store a where
default size :: (Generic a, GStoreSize (Rep a)) => Size a
size = genericSize
- {-# INLINE size #-}
+ -- {-# INLINE size #-}
default poke :: (Generic a, GStorePoke (Rep a)) => a -> Poke ()
poke = genericPoke
- {-# INLINE poke #-}
+ -- {-# INLINE poke #-}
default peek :: (Generic a , GStorePeek (Rep a)) => Peek a
peek = genericPeek
- {-# INLINE peek #-}
+ -- {-# INLINE peek #-}
------------------------------------------------------------------------
-- Utilities for encoding / decoding strict ByteStrings
}}}
Then I can confirm that compiling `store` with either GHC 8.0.1 or 8.0.2
results in about 1.7 GB of maximum memory residency. And given the
[https://github.com/bos/aeson/pull/335#issuecomment-172730336 benchmark
results for a similar patch] that I made for `aeson`, I don't think this
would result in much of a difference in performance, if any.
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/13059#comment:20>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler