[commit: ghc] master: Fix a small Float-Out bug (ff23978)

Tue May 2 09:14:31 UTC 2017

Repository : ssh://git@git.haskell.org/ghc

On branch  : master
Link       : http://ghc.haskell.org/trac/ghc/changeset/ff239787f7170a93f1015bd0f5582772b7b87f0a/ghc

>---------------------------------------------------------------

commit ff239787f7170a93f1015bd0f5582772b7b87f0a
Author: Simon Peyton Jones <simonpj at microsoft.com>
Date:   Wed Apr 26 17:43:24 2017 +0100

    Fix a small Float-Out bug
    
    The float-out pass uses a heuristic based on strictness info.
    But it was getting the strictness info mis-aligned; I'd forgotten
    that strictness flags only apply to /value/ arguments.
    
    This patch fixes it.  It has some surprising effects!
    
    --------------------------------------------------------------------------------
            Program           Size    Allocs   Runtime   Elapsed  TotalMem
    --------------------------------------------------------------------------------
            integer          -0.1%     +9.9%     +0.2%     +0.2%     +0.0%
               lcss          +0.0%     +0.0%    -11.9%    -11.9%     +0.0%
             queens          -0.2%    +29.0%      0.02      0.02     +0.0%
             simple          -0.1%    -22.6%    -21.7%    -21.7%     -3.6%
           treejoin          +0.0%     +0.0%    -12.3%    -12.6%     +0.0%
    --------------------------------------------------------------------------------
                Min          -0.2%    -22.6%    -21.7%    -21.7%    -10.0%
                Max          +3.3%    +29.0%    +19.2%    +19.2%    +50.0%
     Geometric Mean          +0.0%     +0.1%     -2.1%     -2.1%     +0.2%
    
    The 'queens' and 'integer' allocation regressions are because, just
    before let-floatting, we get
        \v -> foldr k z (case x of I# y -> build ..y..)
    
    Becase of Note [Case MFEs] we don't float the build; so fusion
    happens.  This increases allocation in queens because the build
    isn't shared; but actaully runtime improves solidly.  Situation
    is similar in integer, although I think runtime gets a bit worse.
    
    The bug meant that, because of foldr's type arguments, the
    mis-aligned strictness info meant that the entire (case x ...)
    was floated, so fusion failed, but sharing happened.
    
    This is all very artificial-benchmark-ish so I'm not losing sleep
    over it.
    
    I did see some runtime numbers increasd, but I think it's noise;
    the differnce went away when I tried them one by one afterwards.


>---------------------------------------------------------------

ff239787f7170a93f1015bd0f5582772b7b87f0a
 compiler/simplCore/SetLevels.hs | 39 +++++++++++++++++++++++++++++++--------
 1 file changed, 31 insertions(+), 8 deletions(-)

diff --git a/compiler/simplCore/SetLevels.hs b/compiler/simplCore/SetLevels.hs
index afca7ae..2b533b7 100644
--- a/compiler/simplCore/SetLevels.hs
+++ b/compiler/simplCore/SetLevels.hs
@@ -81,7 +81,7 @@ import Var
 import VarSet
 import VarEnv
 import Literal          ( litIsTrivial )
-import Demand           ( StrictSig, isStrictDmd, splitStrictSig, increaseStrictSigArity )
+import Demand           ( StrictSig, Demand, isStrictDmd, splitStrictSig, increaseStrictSigArity )
 import Name             ( getOccName, mkSystemVarName )
 import OccName          ( occNameString )
 import Type             ( Type, mkLamTypes, splitTyConApp_maybe )
@@ -95,7 +95,7 @@ import FastString
 import UniqDFM
 import FV
 import Data.Maybe
-import Control.Monad    ( zipWithM )
+import MonadUtils       ( mapAccumLM )
 
 {-
 ************************************************************************
@@ -402,7 +402,7 @@ lvlApp env orig_expr ((_,AnnVar fn), args)
         ; return (foldl App lapp' rargs') }
 
   | otherwise
-  = do { args' <- zipWithM (lvlMFE env) stricts args
+  = do { (_, args') <- mapAccumLM lvl_arg stricts args
             -- Take account of argument strictness; see
             -- Note [Floating to the top]
        ; return (foldl App (lookupVar env fn) args') }
@@ -410,12 +410,12 @@ lvlApp env orig_expr ((_,AnnVar fn), args)
     n_val_args = count (isValArg . deAnnotate) args
     arity      = idArity fn
 
-    stricts :: [Bool]   -- True for strict argument
+    stricts :: [Demand]   -- True for strict /value/ arguments
     stricts = case splitStrictSig (idStrictness fn) of
-                (arg_ds, _) | not (arg_ds `lengthExceeds` n_val_args)
-                            -> map isStrictDmd arg_ds ++ repeat False
+                (arg_ds, _) | arg_ds `lengthExceeds` n_val_args
+                            -> []
                             | otherwise
-                            -> repeat False
+                            -> arg_ds
 
     -- Separate out the PAP that we are floating from the extra
     -- arguments, by traversing the spine until we have collected
@@ -428,6 +428,19 @@ lvlApp env orig_expr ((_,AnnVar fn), args)
        | otherwise               = left n     f (a:rargs)
     left _ _ _                   = panic "SetLevels.lvlExpr.left"
 
+    is_val_arg :: CoreExprWithFVs -> Bool
+    is_val_arg (_, AnnType {}) = False
+    is_val_arg _               = True
+
+    lvl_arg :: [Demand] -> CoreExprWithFVs -> LvlM ([Demand], LevelledExpr)
+    lvl_arg strs arg | (str1 : strs') <- strs
+                     , is_val_arg arg
+                     = do { arg' <- lvlMFE env (isStrictDmd str1) arg
+                          ; return (strs', arg') }
+                     | otherwise
+                     = do { arg' <- lvlMFE env False arg
+                          ; return (strs, arg') }
+
 lvlApp env _ (fun, args)
   =  -- No PAPs that we can float: just carry on with the
      -- arguments and the function.
@@ -893,7 +906,17 @@ in exchange we build a thunk, which is bad.  This case reduces allocation
 by 7% in spectral/puzzle (a rather strange benchmark) and 1.2% in real/fem.
 Doesn't change any other allocation at all.
 
-We will make a separate decision for the scrutinees and alternatives.
+We will make a separate decision for the scrutinee and alternatives.
+
+However this can have a knock-on effect for fusion: consider
+    \v -> foldr k z (case x of I# y -> build ..y..)
+Perhaps we can float the entire (case x of ...) out of the \v.  Then
+fusion will not happen, but we will get more sharing.  But if we don't
+float the case (as advocated here) we won't float the (build ...y..)
+either, so fusion will happen.  It can be a big effect, esp in some
+artificial benchmarks (e.g. integer, queens), but there is no perfect
+answer.
+
 -}
 
 annotateBotStr :: Id -> Arity -> Maybe (Arity, StrictSig) -> Id