[commit: ghc] master: Don't eta-expand PAPs (fixes Trac #9020) (79e46ae)

git at git.haskell.org git at git.haskell.org
Thu Apr 24 07:44:00 UTC 2014


Repository : ssh://git@git.haskell.org/ghc

On branch  : master
Link       : http://ghc.haskell.org/trac/ghc/changeset/79e46aea1643b4dfdc7c846bbefe06b83b535efd/ghc

>---------------------------------------------------------------

commit 79e46aea1643b4dfdc7c846bbefe06b83b535efd
Author: Simon Peyton Jones <simonpj at microsoft.com>
Date:   Thu Apr 24 08:43:07 2014 +0100

    Don't eta-expand PAPs (fixes Trac #9020)
    
    See Note [Do not eta-expand PAPs] in SimplUtils.  This has a tremendously
    good effect on compile times for some simple benchmarks.
    
    The test is now where it belongs, in perf/compiler/T9020 (instead of simpl015).
    
    I did a nofib run and got essentially zero change except for cacheprof which
    gets 4% more allocation.  I investigated.  Turns out that we have
    
        instance PP Reg where
           pp ppm ST_0 = "%st"
           pp ppm ST_1 = "%st(1)"
           pp ppm ST_2 = "%st(2)"
           pp ppm ST_3 = "%st(3)"
           pp ppm ST_4 = "%st(4)"
           pp ppm ST_5 = "%st(5)"
           pp ppm ST_6 = "%st(6)"
           pp ppm ST_7 = "%st(7)"
           pp ppm r    = "%" ++ map toLower (show r)
    
    That (map toLower (show r) does a lot of map/toLowers.  But if we inline show
    we get something like
    
           pp ppm ST_0 = "%st"
           pp ppm ST_1 = "%st(1)"
           pp ppm ST_2 = "%st(2)"
           pp ppm ST_3 = "%st(3)"
           pp ppm ST_4 = "%st(4)"
           pp ppm ST_5 = "%st(5)"
           pp ppm ST_6 = "%st(6)"
           pp ppm ST_7 = "%st(7)"
           pp ppm EAX  = map toLower (show EAX)
           pp ppm EBX  = map toLower (show EBX)
           ...etc...
    
    and all those map/toLower calls can now be floated to top level.
    This gives a 4% decrease in allocation.  But it depends on inlining
    a pretty big 'show' function.
    
    With this new patch we get slightly better eta-expansion, which makes
    a function look slightly bigger, which just stops it being inlined.
    The previous behaviour was luck, so I'm not going to worry about
    losing it.
    
    I've added some notes to nofib/Simon-nofib-notes


>---------------------------------------------------------------

79e46aea1643b4dfdc7c846bbefe06b83b535efd
 compiler/coreSyn/CoreArity.lhs                     |    3 +-
 compiler/simplCore/SimplUtils.lhs                  |   44 +++++++++++++++-----
 .../simpl015.hs => perf/compiler/T9020.hs}         |    0
 testsuite/tests/perf/compiler/all.T                |    8 ++++
 .../should_compile => perf/compiler}/simpl015.hs   |    0
 testsuite/tests/simplCore/should_compile/all.T     |    1 -
 6 files changed, 43 insertions(+), 13 deletions(-)

diff --git a/compiler/coreSyn/CoreArity.lhs b/compiler/coreSyn/CoreArity.lhs
index 12d4274..ca7216f 100644
--- a/compiler/coreSyn/CoreArity.lhs
+++ b/compiler/coreSyn/CoreArity.lhs
@@ -73,7 +73,8 @@ should have arity 3, regardless of f's arity.
 
 \begin{code}
 manifestArity :: CoreExpr -> Arity
--- ^ manifestArity sees how many leading value lambdas there are
+-- ^ manifestArity sees how many leading value lambdas there are,
+--   after looking through casts
 manifestArity (Lam v e) | isId v    	= 1 + manifestArity e
 			| otherwise 	= manifestArity e
 manifestArity (Tick t e) | not (tickishIsCode t) =  manifestArity e
diff --git a/compiler/simplCore/SimplUtils.lhs b/compiler/simplCore/SimplUtils.lhs
index bde7b6b..a3042a7 100644
--- a/compiler/simplCore/SimplUtils.lhs
+++ b/compiler/simplCore/SimplUtils.lhs
@@ -1190,15 +1190,14 @@ because the latter is not well-kinded.
 \begin{code}
 tryEtaExpandRhs :: SimplEnv -> OutId -> OutExpr -> SimplM (Arity, OutExpr)
 -- See Note [Eta-expanding at let bindings]
--- and Note [Eta expansion to manifest arity]
 tryEtaExpandRhs env bndr rhs
   = do { dflags <- getDynFlags
        ; (new_arity, new_rhs) <- try_expand dflags
 
-       ; WARN( new_arity < old_arity,
-               (ptext (sLit "Arity decrease:") <+> (ppr bndr <+> ppr old_arity
-                <+> ppr new_arity) $$ ppr new_rhs) )
-                        -- Note [Arity decrease]
+       ; WARN( new_arity < old_id_arity,
+               (ptext (sLit "Arity decrease:") <+> (ppr bndr <+> ppr old_id_arity
+                <+> ppr old_arity <+> ppr new_arity) $$ ppr new_rhs) )
+                        -- Note [Arity decrease] in Simplify
          return (new_arity, new_rhs) }
   where
     try_expand dflags
@@ -1209,14 +1208,14 @@ tryEtaExpandRhs env bndr rhs
       , let new_arity1 = findRhsArity dflags bndr rhs old_arity
             new_arity2 = idCallArity bndr
             new_arity  = max new_arity1 new_arity2
-      , new_arity > manifest_arity      -- And the curent manifest arity isn't enough
+      , new_arity > old_arity      -- And the curent manifest arity isn't enough
       = do { tick (EtaExpansion bndr)
            ; return (new_arity, etaExpand new_arity rhs) }
       | otherwise
-      = return (manifest_arity, rhs)
+      = return (old_arity, rhs)
 
-    manifest_arity = manifestArity rhs
-    old_arity  = idArity bndr
+    old_arity    = exprArity rhs -- See Note [Do not expand eta-expand PAPs]
+    old_id_arity = idArity bndr
 \end{code}
 
 Note [Eta-expanding at let bindings]
@@ -1225,7 +1224,7 @@ We now eta expand at let-bindings, which is where the payoff comes.
 The most significant thing is that we can do a simple arity analysis
 (in CoreArity.findRhsArity), which we can't do for free-floating lambdas
 
-One useful consequence is this example:
+One useful consequence of not eta-expanding lambdas is this example:
    genMap :: C a => ...
    {-# INLINE genMap #-}
    genMap f xs = ...
@@ -1235,7 +1234,7 @@ One useful consequence is this example:
    myMap = genMap
 
 Notice that 'genMap' should only inline if applied to two arguments.
-In the InlineRule for myMap we'll have the unfolding
+In the stable unfolding for myMap we'll have the unfolding
     (\d -> genMap Int (..d..))
 We do not want to eta-expand to
     (\d f xs -> genMap Int (..d..) f xs)
@@ -1243,6 +1242,29 @@ because then 'genMap' will inline, and it really shouldn't: at least
 as far as the programmer is concerned, it's not applied to two
 arguments!
 
+Note [Do not eta-expand PAPs]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+We used to have old_arity = manifestArity rhs, which meant that we
+would eta-expand even PAPs.  But this gives no particular advantage,
+and can lead to a massive blow-up in code size, exhibited by Trac #9020.  
+Suppose we have a PAP
+    foo :: IO ()
+    foo = returnIO ()
+Then we can eta-expand do
+    foo = (\eta. (returnIO () |> sym g) eta) |> g
+where
+    g :: IO () ~ State# RealWorld -> (# State# RealWorld, () #)
+
+But there is really no point in doing this, and it generates masses of
+coercions and whatnot that eventually disappear again. For T9020, GHC
+allocated 6.6G beore, and 0.8G afterwards; and residency dropped from
+1.8G to 45M.
+
+But note that this won't eta-expand, say
+  f = \g -> map g
+Does it matter not eta-expanding such functions?  I'm not sure.  Perhaps
+strictness analysis will have less to bite on?
+
 
 %************************************************************************
 %*                                                                      *
diff --git a/testsuite/tests/simplCore/should_compile/simpl015.hs b/testsuite/tests/perf/compiler/T9020.hs
similarity index 100%
copy from testsuite/tests/simplCore/should_compile/simpl015.hs
copy to testsuite/tests/perf/compiler/T9020.hs
diff --git a/testsuite/tests/perf/compiler/all.T b/testsuite/tests/perf/compiler/all.T
index 2f4151f..2bff1c7 100644
--- a/testsuite/tests/perf/compiler/all.T
+++ b/testsuite/tests/perf/compiler/all.T
@@ -407,3 +407,11 @@ test('T6048',
              # 05/03/2014 110646312 amd64/Linux Call Arity became more elaborate
       ],
       compile,[''])
+
+test('T9020',
+     [ only_ways(['optasm']),
+      compiler_stats_num_field('bytes allocated',
+          [(wordsize(32),  40000000, 10),
+           (wordsize(64), 795469104, 10)])
+      ],
+      compile,[''])
diff --git a/testsuite/tests/simplCore/should_compile/simpl015.hs b/testsuite/tests/perf/compiler/simpl015.hs
similarity index 100%
rename from testsuite/tests/simplCore/should_compile/simpl015.hs
rename to testsuite/tests/perf/compiler/simpl015.hs
diff --git a/testsuite/tests/simplCore/should_compile/all.T b/testsuite/tests/simplCore/should_compile/all.T
index 7239ffc..616b6cc 100644
--- a/testsuite/tests/simplCore/should_compile/all.T
+++ b/testsuite/tests/simplCore/should_compile/all.T
@@ -15,7 +15,6 @@ test('simpl011', normal, compile, [''])
 test('simpl012', normal, compile, [''])
 test('simpl013', normal, compile, [''])
 test('simpl014', normal, compile, [''])
-test('simpl015', only_ways(['optasm']), compile, [''])
 test('simpl016', normal, compile, ['-dsuppress-uniques'])
 test('simpl017', normal, compile_fail, [''])
 test('simpl018', normal, compile, [''])



More information about the ghc-commits mailing list