[GHC] #13316: Bad inlining cascade leads to slow optimisation
GHC
ghc-devs at haskell.org
Wed Feb 22 13:41:53 UTC 2017
#13316: Bad inlining cascade leads to slow optimisation
-------------------------------------+-------------------------------------
Reporter: simonpj | Owner: (none)
Type: bug | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 8.0.1
Keywords: | Operating System: Unknown/Multiple
Architecture: | Type of failure: None/Unknown
Unknown/Multiple |
Test Case: | Blocked By:
Blocking: | Related Tickets:
Differential Rev(s): | Wiki Page:
-------------------------------------+-------------------------------------
Consider this simple program
{{{
{-# INLINE [0] f #-}
f x y = case y of
True -> reverse x
False -> reverse (reverse x)
foo a b = let x = [a,a,a,a]
in f x b
}}}
If you compile with -DDEBUG you'll see
{{{
bash$ ghc-stage1 Foo.hs -O -fforce-recomp -c
WARNING: file compiler/simplCore/SimplCore.hs, line 670
Simplifier bailing out after 4 iterations [17, 1, 1, 1]
Size = {terms: 60, types: 49, coercions: 0, joins: 0/0}}}}
}}}
Yikes! One transformation per simplifier iteration! What is going on?
Before inlining `f` we have
{{{
(f [a,a,a,a] b)
}}}
Then, we inline, beta-reduce, and build let bindings thus
{{{
let x1 = a : []
x2 = a : x1
x3 = a : x2
x4 = a : x3
in case b of
True -> reverse x4
False -> reverse (reverse x4)
}}}
So far so good. But then, bizarrely, we do `postInlineUnconditionally` on
`x4` (see comments in that function). But not on `x3` because its occ-
info is "once inside lambda". Then in the next iteration we
`postInlineUnconditionally` `x3`, and so on.
Terrible, terrible. My thought: revisit these comments in
`postInlineUnconditionally`:
{{{
OneOcc { occ_in_lam = in_lam, occ_int_cxt = int_cxt }
-- OneOcc => no code-duplication issue
-> smallEnoughToInline dflags unfolding -- Small enough to
dup
-- ToDo: consider discount on smallEnoughToInline
if int_cxt is true
--
-- NB: Do NOT inline arbitrarily big things, even
if one_br is True
-- Reason: doing so risks exponential behaviour.
We simplify a big
-- expression, inline it, and simplify it
again. But if the
-- very same thing happens in the big
expression, we get
-- exponential cost!
-- PRINCIPLE: when we've already simplified an
expression once,
-- make sure that we only inline it if it's
reasonably small.
&& (not in_lam ||
-- Outside a lambda, we want to be reasonably
aggressive
-- about inlining into multiple branches of case
-- e.g. let x = <non-value>
-- in case y of { C1 -> ..x..; C2 -> ..x..;
C3 -> ... }
-- Inlining can be a big win if C3 is the hot-
spot, even if
-- the uses in C1, C2 are not 'interesting'
-- An example that gets worse if you add int_cxt
here is 'clausify'
}}}
What is particularly annoying in this case is that `x` is used in all code
paths, so all this inlining simply duplicates code while gaining nothing.
Moreover, if we had
{{{
let x = blah
in
case y of
A -> ...x...
B -> ..x..x...
C ->
}}}
then the multiple uses of `x` in the `B` branch would disable this entire
`preInlineUnconditionally` thing, even though it might be a good idea to
push the allocation of `x` into the `A` and `B` branches, to avoid the `C`
code path.
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/13316>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
More information about the ghc-tickets
mailing list