# Why isn't (\$) inlining when I want?

Simon Peyton Jones simonpj at microsoft.com
Thu Aug 28 10:22:50 UTC 2014

```Oh, now I understand.  In
loop g = sum . map g \$ [1..1000000]
GHC can share [1..100000] across all calls to loop, although that nixes fusion.  Because each call of loop may have a different g.

But in
loop' = sum . map (+1) \$ [1..1000000]
GHC can share (sum . map (+1) \$ [1..1000]) across all calls to loop’, so it can readily fuse the sum, map, and [1..n].

I hope that explains it.

Simon

From: Dan Doel [mailto:dan.doel at gmail.com]
Sent: 27 August 2014 22:57
To: Simon Peyton Jones
Cc: David Feuer; ghc-devs
Subject: Re: Why isn't (\$) inlining when I want?

I think talking about inlining of \$ may not be addressing the crux of the problem here.

The issue seems to be about functions like the one in the first message. For instance:
loop :: (Int -> Int) -> Int
loop g = sum . map g \$ [1..1000000]
Suppose for argument that we have a fusion framework that would handle this. The problem is that this does not actually turn into a loop over integers, because the constant [1..1000000] gets floated out. It instead builds a list/vector/whatever.
By contrast, if we write:
loop' :: Int
loop' = sum . map (+1) \$ [1..1000000]
this does turn into a loop over integers, with no intermediate list. Presumably this is due to there being no work to be saved ever by floating the list out. These are the examples people usually test fusion with.
And if loop is small enough to inline, it turns out that the actual code that gets run will be the same as loop', because everything will get inlined and fused. But it is also possible to make loop big enough to not inline, and then the floating will pessimize the overall code.
So the core issue is that constant floating blocks some fusion opportunities. It is trying to save the work of building the structure more than once, but fusion can cause the structure to not be built at all. And the floating happens before fusion can reasonably be expected to work.
I've verified that this kind of situation also affects vector. And it seems to be an issue even if loop is written:
loop g = sum (map g [1..1000000])
-- Dan

On Wed, Aug 27, 2014 at 3:38 PM, Simon Peyton Jones <simonpj at microsoft.com<mailto:simonpj at microsoft.com>> wrote:
You'll have to do more detective work! In your dump I see "Inactive unfolding \$".  So that's why it's not being inlined.  That message comes from CoreUnfold, line 941 or so.  The Boolean active_unfolding is passed in to callSiteInline from Simplify, line 1408 or so.  It is generated by the function activeUnfolding, defined in SimplUtils.

But you have probably change the "CompilerPhase" data type, so I can't guess what is happening.  But if you just follow it through I'm sure you'll find it.

Simon

| -----Original Message-----
| From: David Feuer [mailto:david.feuer at gmail.com<mailto:david.feuer at gmail.com>]
| Sent: 27 August 2014 17:22
| To: Simon Peyton Jones
| Cc: ghc-devs
| Subject: Re: Why isn't (\$) inlining when I want?
|
| I just ran that (results attached), and as far as I can tell, it
| doesn't even *consider* inlining (\$) until phase 2.
|
| On Wed, Aug 27, 2014 at 4:03 AM, Simon Peyton Jones
| <simonpj at microsoft.com<mailto:simonpj at microsoft.com>> wrote:
| > It's hard to tell since you are using a modified compiler.  Try running
| with -ddump-occur-anal -dverbose-core2core -ddump-inlinings.  That will
| show you every inlining, whether failed or successful. You can see the
| attempt to inline (\$) and there is some info with the output that may
| help to explain why it did or did not work.
| >
| > Try that
| >
| > Simon
| >
| > | -----Original Message-----
| > | From: ghc-devs [mailto:ghc-devs-bounces at haskell.org<mailto:ghc-devs-bounces at haskell.org>] On Behalf Of
| David
| > | Feuer
| > | Sent: 27 August 2014 04:50
| > | To: ghc-devs; Carter Schonwald
| > | Subject: Why isn't (\$) inlining when I want?
| > |
| > | tl;dr  I added a simplifier run with inlining enabled between
| > | specialization and floating out. It seems incapable of inlining
| > | saturated applications of (\$), and I can't figure out why. These are
| > | inlined later, when phase 2 runs. Am I running the simplifier wrong
| or
| > | something?
| > |
| > |
| > | I'm working on this simple little fusion pipeline:
| > |
| > | {-# INLINE takeWhile #-}
| > | takeWhile p xs = build builder
| > |   where
| > |     builder c n = foldr go n xs
| > |       where
| > |         go x r = if p x then x `c` r else n
| > |
| > | foo c n x = foldr c n . takeWhile (/= (1::Int)) \$ [-9..10]
| > |
| > | There are some issues with the enumFrom definition that break things.
| > | If I use a fusible unfoldr that produces some numbers instead, that
| > | issue goes away. Part of that problem (but not all of it) is that the
| > | simplifier doesn't run to apply rules between specialization and full
| > | laziness, so there's no opportunity for the specialization of
| > | enumFromTo to Int to trigger the rewrite to a build form and fusion
| > | with foldr before full laziness tears things apart. The other problem
| > | is that inlining doesn't happen at all before full laziness, so
| things
| > | defined using foldr and/or build aren't actually exposed as such
| until
| > | afterwards. Therefore I decided to try adding a simplifier run with
| > | inlining between specialization and floating out:
| > |
| > |         runWhen do_specialise CoreDoSpecialising,
| > |
| > |         runWhen full_laziness \$ CoreDoSimplify max_iter
| > |                        (base_mode { sm_phase = InitialPhase
| > |                                   , sm_names = ["PostGentle"]
| > |                                   , sm_rules = rules_on
| > |                                   , sm_inline = True
| > |                                   , sm_case_case = False }),
| > |
| > |         runWhen full_laziness \$
| > |            CoreDoFloatOutwards FloatOutSwitches {
| > |                                  floatOutLambdas   = Just 0,
| > |                                  floatOutConstants = True,
| > |                                  floatOutPartialApplications = False
| },
| > |
| > | The weird thing is that for some reason this doesn't inline (\$), even
| > | though it appears to be saturated. Using the modified thing with (my
| > | version of) unfoldr:
| > |
| > | foo c n x = (foldr c n . takeWhile (/= (1::Int))) \$ unfoldr (potato
| 10)
| > | (-9)
| > |
| > | potato :: Int -> Int -> Maybe (Int, Int)
| > | potato n m | m <= n = Just (m, m)
| > |            | otherwise = Nothing
| > |
| > |
| > | I get this out of the specializer:
| > |
| > | foo
| > | foo =
| > |   \ @ t_a1Za @ c_a1Zb c_a1HT n_a1HU _ ->
| > |     \$ (. (foldr c_a1HT n_a1HU)
| > |          (takeWhile
| > |             (let {
| > |                ds_s21z
| > |                ds_s21z = I# 1 } in
| > |              \ ds_d1Zw -> neInt ds_d1Zw ds_s21z)))
| > |       (let {
| > |          n_s21x
| > |          n_s21x = I# 10 } in
| > |        unfoldr
| > |          (\ m_a1U7 ->
| > |             case leInt m_a1U7 n_s21x of _ {
| > |               False -> Nothing;
| > |               True -> Just (m_a1U7, m_a1U7)
| > |             })
| > |          (\$fNumInt_\$cnegate (I# 9)))
| > |
| > |
| > | and then I get this out of my extra simplifier run:
| > |
| > | foo
| > | foo =
| > |   \ @ t_a1Za @ c_a1Zb c_a1HT n_a1HU _ ->
| > |     \$ (\ x_a20f ->
| > |          foldr
| > |            (\ x_a1HR r_a1HS ->
| > |               case case x_a1HR of _ { I# x_a20R ->
| > |                    tagToEnum#
| > |                      (case x_a20R of _ {
| > |                         __DEFAULT -> 1;
| > |                         1 -> 0
| > |                       })
| > |                    }
| > |               of _ {
| > |                 False -> n_a1HU;
| > |                 True -> c_a1HT x_a1HR r_a1HS
| > |               })
| > |            n_a1HU
| > |            x_a20f)
| > |       (let {
| > |          b'_a1ZS
| > |          b'_a1ZS = \$fNumInt_\$cnegate (I# 9) } in
| > |        \$ (build)
| > |          (\ @ b1_a1ZU c_a1ZV n_a1ZW ->
| > |             letrec {
| > |               go_a1ZX
| > |               go_a1ZX =
| > |                 \ b2_a1ZY ->
| > |                   case case case b2_a1ZY of _ { I# x_a218 ->
| > |                             tagToEnum# (<=# x_a218 10)
| > |                             }
| > |                        of _ {
| > |                          False -> Nothing;
| > |                          True -> Just (b2_a1ZY, b2_a1ZY)
| > |                        }
| > |                   of _ {
| > |                     Nothing -> n_a1ZW;
| > |                     Just ds_a203 ->
| > |                       case ds_a203 of _ { (a1_a207, new_b_a208) ->
| > |                       c_a1ZV a1_a207 (go_a1ZX new_b_a208)
| > |                       }
| > |                   }; } in
| > |             go_a1ZX b'_a1ZS))
| > |
| > |
| > | That is, neither the \$ in the code nor the \$ that was inserted when
| > | inlining unfoldr got inlined themselves, even though both appear to
| be
| > | saturated. As a result, foldr/build doesn't fire, and full laziness
| > | tears things apart. Later on, in simplifier phase 2, \$ gets inlined.
| > | What's preventing this from happening in the PostGentle phase I
| > |
| > | David Feuer
| > | _______________________________________________
| > | ghc-devs mailing list