[GHC] #14222: Simple text fusion example results in rather duplicative code

GHC ghc-devs at haskell.org
Fri Sep 15 13:33:24 UTC 2017


#14222: Simple text fusion example results in rather duplicative code
-------------------------------------+-------------------------------------
        Reporter:  bgamari           |                Owner:  (none)
            Type:  bug               |               Status:  new
        Priority:  normal            |            Milestone:
       Component:  Compiler          |              Version:  8.2.1
      Resolution:                    |             Keywords:  CSE
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  None/Unknown      |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by bgamari):

 Last night I looked further into why we end up duplicating the expession
 in question: There are two ways by which we end up duplicating the case
 analysis.

 == First Duplication ==

 First, relatively early in simplification we end up rewriting `isNumeric'`
 to (after inlining `(||)`),
 {{{#!hs
 isNumeric'_s2z6 :: Char -> Bool
 isNumeric'_s2z6
   = \ (eta_B1 :: Char) ->
       case eta_B1 of { GHC.Types.C# c2_a2sP ->
       case GHC.Prim.leChar# '0'# c2_a2sP of {
         __DEFAULT ->
           case c2_a2sP of {
             __DEFAULT -> GHC.Types.False;
             '+'# -> GHC.Types.True;
             '-'# -> GHC.Types.True;
             '.'# -> GHC.Types.True;
             'E'# -> GHC.Types.True;
             'e'# -> GHC.Types.True
           };
         1# ->
           case GHC.Prim.leChar# c2_a2sP '9'# of {
             __DEFAULT ->
               case c2_a2sP of {
                 __DEFAULT -> GHC.Types.False;
                 '+'# -> GHC.Types.True;
                 '-'# -> GHC.Types.True;
                 '.'# -> GHC.Types.True;
                 'E'# -> GHC.Types.True;
                 'e'# -> GHC.Types.True
               };
             1# -> GHC.Types.True
           }
       }
       }
 }}}
 Note the two instances of `case c2_a2sP of {...}`; this comes about via
 unconditional post-inlining since the scrutinee is trivial (being a
 `Var`).

 This makes me wonder whether there is any benefit for
 `postInlineUnconditional` to inline join points.

 == Second Duplication ==

 Then, later in simplification we duplicate `isNumeric'` three times
 (resulting in six instances of the `case` analysis). This happens when we
 start with this Core after worker/wrapper,
 {{{#!hs
                 join {
                   $w$j_s2Cy :: GHC.Prim.Char# -> Int -> Bool
                   $w$j_s2Cy (ww_s2Cw :: GHC.Prim.Char#) (w_s2Ct :: Int)
                     = let {
                         w_s2Cs :: Char
                         w_s2Cs = GHC.Types.C# ww_s2Cw } in
                       let {
                         x_a2vL :: Char
                         x_a2vL = w_s2Cs } in
                       let {
                         s'_a2vM :: Int
                         s'_a2vM = w_s2Ct } in
                       case x_a2vL of { GHC.Types.C# c2_a2t0 ->
                       join {
                         $j_s2yJ :: Bool
                         $j_s2yJ
                           = case c2_a2t0 of {
                               __DEFAULT -> GHC.Types.False;
                               '+'# -> jump loop_all_a2vd s'_a2vM;
                               '-'# -> jump loop_all_a2vd s'_a2vM;
                               '.'# -> jump loop_all_a2vd s'_a2vM;
                               'E'# -> jump loop_all_a2vd s'_a2vM;
                               'e'# -> jump loop_all_a2vd s'_a2vM
                             } } in
                       case GHC.Prim.leChar# '0'# c2_a2t0 of {
                         __DEFAULT -> jump $j_s2yJ;
                         1# ->
                           case GHC.Prim.leChar# c2_a2t0 '9'# of {
                             __DEFAULT -> jump $j_s2yJ;
                             1# -> jump loop_all_a2vd s'_a2vM
                           }
                       }
                       } } in
                 join {
                   -- the wrapper for the above worker
                   $j_s2At :: Char -> Int -> Bool
                   $j_s2At (w_s2Cs :: Char) (w_s2Ct :: Int)
                     = case w_s2Cs of ww_s2Cv { GHC.Types.C# ww_s2Cw ->
                       jump $w$j_s2Cy ww_s2Cw w_s2Ct
                       } } in
                 {- ... a large expression with three call-sites of $j_s2At
 -}
 }}}

 In the post-worker-wrapper simplification step we then immediately inline
 the wrappers into all of the call-sites,
 {{{
 Considering inlining: $j_s2At
   arg infos [ValueArg, ValueArg]
   interesting continuation BoringCtxt
   some_benefit True
   is exp: True
   is work-free: True
   guidance ALWAYS_IF(arity=2,unsat_ok=True,boring_ok=False)
   ANSWER = YES
 Inlining done: $j_s2At
     Inlined fn:  \ (w_s2Cs :: GHC.Types.Char)
                    (w_s2Ct :: GHC.Types.Int) ->
                    case w_s2Cs of { GHC.Types.C# ww_s2Cw -> jump $w$j
 ww_s2Cw w_s2Ct }
     Cont:   ApplyToVal nodup (GHC.Types.C#
                                 (GHC.Prim.chr# (GHC.Prim.word2Int# r#)))
             ApplyToVal nodup (GHC.Types.I# (GHC.Prim.+# ipv 1#))
             Stop[BoringCtxt] GHC.Types.Bool
 }}}
 and in each case inline the workers soon thereafter,
 {{{
 Considering inlining: $w$j_s2Cy
   arg infos [NonTrivArg, ValueArg]
   interesting continuation BoringCtxt
   some_benefit True
   is exp: True
   is work-free: True
   guidance IF_ARGS [126 120] 190 0
   discounted size = -35
   ANSWER = YES
 Inlining done: $w$j
     Inlined fn:  \ (ww_s2Cw :: GHC.Prim.Char#)
                    (w_s2Ct :: GHC.Types.Int) ->
                    join {
                      $j_s2yJ :: GHC.Types.Bool
                      $j_s2yJ
                        = case ww_s2Cw of {
                            __DEFAULT -> GHC.Types.False;
                            '+'# ->
                              case w_s2Ct of { GHC.Types.I# ww_X2Dh -> jump
 $wloop_all ww_X2Dh };
                            '-'# ->
                              case w_s2Ct of { GHC.Types.I# ww_X2Dh -> jump
 $wloop_all ww_X2Dh };
                            '.'# ->
                              case w_s2Ct of { GHC.Types.I# ww_X2Dh -> jump
 $wloop_all ww_X2Dh };
                            'E'# ->
                              case w_s2Ct of { GHC.Types.I# ww_X2Dh -> jump
 $wloop_all ww_X2Dh };
                            'e'# ->
                              case w_s2Ct of { GHC.Types.I# ww_X2Dh -> jump
 $wloop_all ww_X2Dh }
                          } } in
                    case GHC.Prim.leChar# '0'# ww_s2Cw of {
                      __DEFAULT -> jump $j_s2yJ;
                      1# ->
                        case GHC.Prim.leChar# ww_s2Cw '9'# of {
                          __DEFAULT -> jump $j_s2yJ;
                          1# ->
                            case w_s2Ct of { GHC.Types.I# ww_X2Dk -> jump
 $wloop_all ww_X2Dk }
                        }
                    }
     Cont:   ApplyToVal nodup ww_s2Cw
             ApplyToVal nodup w_s2Ct
             Stop[BoringCtxt] GHC.Types.Bool
 }}}

 We consequently end up with two identical and one nearly identical copies
 of the same rather large block of code. It's not immediately clear to me
 how we can determine whether this latter duplication is fruitful.

-- 
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/14222#comment:7>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler


More information about the ghc-tickets mailing list