[GHC] #10370: Compile time regression in OpenGLRaw

GHC ghc-devs at haskell.org
Fri May 22 14:14:32 UTC 2015


#10370: Compile time regression in OpenGLRaw
-------------------------------------+-------------------------------------
        Reporter:  michalt           |                   Owner:
            Type:  bug               |                  Status:  new
        Priority:  normal            |               Milestone:
       Component:  Compiler          |                 Version:  7.10.1
      Resolution:                    |                Keywords:
Operating System:  Unknown/Multiple  |            Architecture:
 Type of failure:  Compile-time      |  Unknown/Multiple
  performance bug                    |               Test Case:
      Blocked By:                    |                Blocking:
 Related Tickets:                    |  Differential Revisions:
-------------------------------------+-------------------------------------

Comment (by Simon Peyton Jones <simonpj@…>):

 In [changeset:"45d9a15c4b85a2ed89579106bdafd84accf2cb39/ghc"]:
 {{{
 #!CommitTicketReference repository="ghc"
 revision="45d9a15c4b85a2ed89579106bdafd84accf2cb39"
 Fix a huge space leak in the mighty Simplifier

 This long-standing, terrible, adn somewhat subtle bug was exposed
 by Trac #10370, thanks to Reid Barton's brilliant test case (comment:3).

 The effect is large on the Trac #10370 test.
 Here is what the profile report says:

 Before:
  total time  =       24.35 secs   (24353 ticks @ 1000 us, 1 processor)
  total alloc = 11,864,360,816 bytes  (excludes profiling overheads)

 After:
  total time  =       21.16 secs   (21160 ticks @ 1000 us, 1 processor)
  total alloc = 7,947,141,136 bytes  (excludes profiling overheads)

 The /combined/ effect of the tidyOccName fix, plus this one, is dramtic
 for Trac #10370.  Here is what +RTS -s says:

 Before:
   15,490,210,952 bytes allocated in the heap
    1,783,919,456 bytes maximum residency (20 sample(s))

   MUT     time   30.117s  ( 31.383s elapsed)
   GC      time   90.103s  ( 90.107s elapsed)
   Total   time  120.843s  (122.065s elapsed)

 After:
    7,928,671,936 bytes allocated in the heap
       52,914,832 bytes maximum residency (25 sample(s))

   MUT     time   13.912s  ( 15.110s elapsed)
   GC      time    6.809s  (  6.808s elapsed)
   Total   time   20.789s  ( 21.954s elapsed)

 - Heap allocation halved
 - Residency cut by a factor of more than 30.
 - ELapsed time cut by a factor of 6

 Not bad!

 The details
 ~~~~~~~~~~~
 The culprit was SimplEnv.mkCoreSubst, which used mapVarEnv to do some
 impedence-matching from the substitituion used by the simplifier to
 the one used by CoreSubst.  But the impedence-mactching was recursive!

   mk_subst tv_env cv_env id_env
     = CoreSubst.mkSubst in_scope tv_env cv_env (mapVarEnv fiddle id_env)

   fiddle (DoneEx e)          = e
   fiddle (DoneId v)          = Var v
   fiddle (ContEx tv cv id e) = CoreSubst.substExpr (mk_subst tv cv id) e

 Inside fiddle, in the ContEx case, we may do another whole level of
 fiddle.  And so on.  Moreover, UniqFM (which is built on Data.IntMap) is
 strict, so the fiddling is done eagerly.  I didn't wok through all the
 details but the result is a gargatuan blow-up of entirely unnecessary
 work.

 Laziness would make this go away, I think, but I don't want to mess
 with IntMap.  And in any case, the impedence matching is a royal pain.

 In the end I simply ceased trying to use CoreSubst.substExpr in the
 simplifier, and instead just use simplExpr.  That does mean bit of
 duplication; e.g.  new code for simplRules.  But it's not a big deal
 and it's far more direct and easy to reason about.

 A bit of knock-on refactoring:

  * Data type ArgSummary moves to CoreUnfold.

  * interestingArg moves from CoreUnfold to SimplUtils, and gets a
    SimplEnv argument which can be used when we encounter a variable.

  * simplLamBndrs, addBndrRules move from SimplEnv to Simplify
    (because they now calls simplUnfolding, simplRules resp)

  * SimplUtils.substExpr, substUnfolding, mkCoreSubst die completely

  * In Simplify some several functions that were previously pure
    substitution-based functions are now monadic:
      - addBndrRules, simplRule
      - addCoerce, add_coerce in simplCast

  * In case 2c of Simplify.rebuildCase, there was a pretty disgusting
    expression-substitution taking place for 'rhs'; and we really don't
    want to make that monadic becuase 'rhs' can be big.
    Solution: reduce the arity of the rules for seq.
    See Note [User-defined RULES for seq] in MkId.
 }}}

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/10370#comment:12>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler


More information about the ghc-tickets mailing list