[Git][ghc/ghc][wip/simplifier-tweaks] Wibble

Simon Peyton Jones (@simonpj) gitlab at gitlab.haskell.org
Fri Jul 7 15:55:00 UTC 2023



Simon Peyton Jones pushed to branch wip/simplifier-tweaks at Glasgow Haskell Compiler / GHC


Commits:
b3e1e33a by Simon Peyton Jones at 2023-07-07T16:54:46+01:00
Wibble

- - - - -


2 changed files:

- compiler/GHC/Core/Opt/Simplify/Iteration.hs
- compiler/GHC/Core/Opt/Simplify/Utils.hs


Changes:

=====================================
compiler/GHC/Core/Opt/Simplify/Iteration.hs
=====================================
@@ -1931,6 +1931,49 @@ things] in Simplify.Utils.  This certainly risks repeated simplification, but
 in practice seems to be a small win.
 
 
+Note [Avoiding exponential inlining]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+In #13253, and several related tickets, we got an exponential blowup
+in code size from a program that looks like this. NB:  see also
+Note [Do not add unfoldings to join points at birth]
+
+  let j1a x = case f y     of { True -> p;   False -> q }
+      j1b x = case f y     of { True -> q;   False -> p }
+      j2a x = case f (y+1) of { True -> j1a x; False -> j1b x}
+      j2b x = case f (y+1) of { True -> j1b x; False -> j1a x}
+      ...
+  in case f (y+10) of { True -> j10a 7; False -> j10b 8 }
+
+The first danger is this: in Simplifier iteration 1 postInlineUnconditionally
+inlines the last functions, j10a and j10b (they are both small).  Now we have
+two calls to j9a and two to j9b.  In the next Simplifer iteration,
+postInlineUnconditionally inlines all four of these calls, leaving four calls
+to j8a and j8b. Etc.  Yikes!  This is exponential!
+
+This probably /won't/ happen because the Simplifier works top down, so it'll
+inline j1a/j1b into j2a/j2b, which will make the latter bigger; so the process
+will stop.
+
+A related problem.  Suppose the RHSs are too big for postInlineUnconditionally,
+and the calls in the RHSs are not interesting enough to promote inlining.  But
+the calls in the body (j10a 7) etc, might be interesting enough. So j10a inlines
+and that might make j9a inline, and so on in an upward cascade.
+
+A possible plan: stop doing postInlineUnconditionally
+for some fixed, smallish number of branches, say 4. But that turned
+out to be bad: see Note [Inline small things to avoid creating a thunk].
+And, as it happened, the problem with #13253 was solved in a
+different way (Note [Duplicating StrictArg] in Simplify).
+
+So I just set an arbitrary, high limit of 100, to stop any
+totally exponential behaviour.
+
+This still leaves the nasty possibility that /ordinary/ inlining (not
+postInlineUnconditionally) might inline these join points, each of
+which is individually quiet small.  I'm still not sure what to do
+about this (e.g. see #15488).
+
+
 ************************************************************************
 *                                                                      *
                      Join points
@@ -2384,7 +2427,7 @@ Then given (f Int e1) we rewrite to
    (\x. x True) e1
 without simplifying e1.  Now we can inline x into its unique call site,
 and absorb the True into it all in the same pass.  If we simplified
-e1 first, we couldn't do that; see Note [Avoiding exponential behaviour].
+e1 first, we couldn't do that; see Note [Avoiding simplifying repeatedly].
 
 So we try to apply rules if either
   (a) no_more_args: we've run out of argument that the rules can "see"
@@ -3914,6 +3957,7 @@ mkDupableAlt _env case_bndr jfloats (Alt con alt_bndrs alt_rhs_in)
               --                  mkUnfolding uf_opts VanillaSrc False False join_rhs Nothing
               -- uf_opts   = seUnfoldingOpts env
               join_bndr_w_unf = join_bndr
+              -- See Note [Do not add unfoldings to join points at birth]
               join_call = mkApps (Var join_bndr) final_args
               alt'      = Alt con alt_bndrs join_call
 
@@ -3936,6 +3980,37 @@ ok_to_dup_alt case_bndr alt_bndrs alt_rhs
     bndr_set = mkVarSet (case_bndr : alt_bndrs)
 
 {-
+Note [Do not add unfoldings to join points at birth]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Consider this (#15360)
+
+   case (case (case (case ...))) of
+      Left x  -> e1
+      Right y -> e2
+
+We will make a join point for e1, e2, thus
+    $j1a x = e1
+    $j1b y = e2
+
+Now those join points count as "duplicable" , so we feel free to duplicate
+them into the loop nest.  And each of those calls are then subject to
+callSiteInline, which might inline them, if e1, e2 are reasonably small.  Now,
+if this applies recursive to the next `case` inwards, and so on, the net
+effect is that we can get an exponential number of calls to $j1a and $j1b, and
+an exponential number of inlinings (since each is done independently).
+
+This hit #15360 (not a complicated program) badly.  Out brutal solution is this:
+when a join point is born, we don't give it an unfolding.  So we end up with
+    $j1a x = e1
+    $j1b y = e2
+    $j2a x = ...$j1a ... $j1b...
+    $j2b x = ...$j1a ... $j1b...
+    ... and so on...
+
+Now we are into Note [Avoiding exponential inlining], which is still
+a challenge.  But at least we have a chance.  If we add inlinings at birth
+we never get that chance.
+
 Note [Duplicating alternatives]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 When should we duplicate an alternative, and when should we make a join point?
@@ -4329,7 +4404,7 @@ simplLetUnfolding env bind_cxt id new_rhs rhs_ty arity unf
     let !opts = seUnfoldingOpts env
     in mkLetUnfolding opts (bindContextLevel bind_cxt) VanillaSrc id new_rhs
   where
-    -- See Note [Exponential join point inlining]
+    -- ToDo: document this
     too_many_occs (ManyOccs {})             = True
     too_many_occs (OneOcc { occ_n_br = n }) = n > 10
     too_many_occs IAmDead                   = False


=====================================
compiler/GHC/Core/Opt/Simplify/Utils.hs
=====================================
@@ -1543,7 +1543,7 @@ postInlineUnconditionally env bind_cxt old_bndr bndr rhs
         -- See Note [Inline small things to avoid creating a thunk]
 
         | let not_inside_lam = in_lam == NotInsideLam
-        -> n_br < 100  -- See Note [Suppress exponential blowup]
+        -> n_br < 100  -- See Note [Avoiding exponential inlining] in Simplify.Iteration
 
            && (  (n_br == 1 && not_inside_lam)  -- See Note [Post-inline for single-use things]
               || smallEnoughToInline uf_opts unfolding)  -- Small enough to dup
@@ -1642,48 +1642,6 @@ But
 Alas!
 
 
-Note [Suppress exponential blowup]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-In #13253, and several related tickets, we got an exponential blowup
-in code size from postInlineUnconditionally.  The trouble comes when
-we have
-  let j1a = case f y     of { True -> p;   False -> q }
-      j1b = case f y     of { True -> q;   False -> p }
-      j2a = case f (y+1) of { True -> j1a; False -> j1b }
-      j2b = case f (y+1) of { True -> j1b; False -> j1a }
-      ...
-  in case f (y+10) of { True -> j10a; False -> j10b }
-
-when there are many branches. In pass 1, postInlineUnconditionally
-inlines j10a and j10b (they are both small).  Now we have two calls
-to j9a and two to j9b.  In pass 2, postInlineUnconditionally inlines
-all four of these calls, leaving four calls to j8a and j8b. Etc.
-Yikes!  This is exponential!
-
-A similar case 
-  let j1 x = ...
-      j2 x = ...jump j1 (x-1).....jump j1 (x-2)
-      ...
-  in case f (y+10) of { True -> jump j10 10; False -> j10 10 }
-
-In the RHS of j1..j10, no inlining happens because the calls don't look
-exciting enough.  But in the "in" part, the call-site inliner may inline
-j10 (since it is applied to 10).  That exposts 
-In each 
-A possible plan: stop doing postInlineUnconditionally
-for some fixed, smallish number of branches, say 4. But that turned
-out to be bad: see Note [Inline small things to avoid creating a thunk].
-And, as it happened, the problem with #13253 was solved in a
-different way (Note [Duplicating StrictArg] in Simplify).
-
-So I just set an arbitrary, high limit of 100, to stop any
-totally exponential behaviour.
-
-This still leaves the nasty possibility that /ordinary/ inlining (not
-postInlineUnconditionally) might inline these join points, each of
-which is individually quiet small.  I'm still not sure what to do
-about this (e.g. see #15488).
-
 Note [Top level and postInlineUnconditionally]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 We don't do postInlineUnconditionally for top-level things (even for



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/b3e1e33a38f0b88f701e5c7be5398acbe4b57a25

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/b3e1e33a38f0b88f701e5c7be5398acbe4b57a25
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20230707/664fdafb/attachment-0001.html>


More information about the ghc-commits mailing list