[GHC] #14941: Switching direct type family application to EqPred (~) prevents inlining in code using vector (10x slowdown)

GHC ghc-devs at haskell.org
Fri Sep 14 16:18:22 UTC 2018


#14941: Switching direct type family application to EqPred (~) prevents inlining in
code using vector (10x slowdown)
-------------------------------------+-------------------------------------
        Reporter:  nh2               |                Owner:  davide
            Type:  bug               |               Status:  new
        Priority:  normal            |            Milestone:
       Component:  Compiler          |              Version:  8.2.2
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
 Type of failure:  Runtime           |  Unknown/Multiple
  performance bug                    |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by davide):

 I've created a simple case where this happens:
 {{{#!haskell
 {-# LANGUAGE ScopedTypeVariables #-}
 {-# LANGUAGE TypeFamilies #-}

 module F (f) where

 f :: Int -> Int                           -- FAST
 -- f :: forall a. (a ~ Int) => a -> a     -- SLOW
 f x = x + x
 {-# NOINLINE f #-}
 }}}

 Compiling with:
 {{{
 $ ghc-8.4.3 -O -ddump-simpl -dsuppress-coercions F.hs
 }}}
 I get this core (note worker-wrapper transformation):
 {{{
 -- RHS size: {terms: 4, types: 1, coercions: 0, joins: 0/0}
 F.$wf [InlPrag=NOINLINE] :: GHC.Prim.Int# -> GHC.Prim.Int#
 [GblId, Arity=1, Caf=NoCafRefs, Str=<S,U>]
 F.$wf = \ (ww_s1ay :: GHC.Prim.Int#) -> GHC.Prim.+# ww_s1ay ww_s1ay

 -- RHS size: {terms: 10, types: 4, coercions: 0, joins: 0/0}
 f [InlPrag=NOUSERINLINE[0]] :: Int -> Int
 [GblId,
  Arity=1,
  Caf=NoCafRefs,
  Str=<S(S),1*U(U)>m,
  Unf=Unf{Src=InlineStable, TopLvl=True, Value=True, ConLike=True,
          WorkFree=True, Expandable=True,
          Guidance=ALWAYS_IF(arity=1,unsat_ok=True,boring_ok=False)
          Tmpl= \ (w_s1av [Occ=Once!] :: Int) ->
                  case w_s1av of { GHC.Types.I# ww1_s1ay [Occ=Once] ->
                  case F.$wf ww1_s1ay of ww2_s1aC { __DEFAULT ->
                  GHC.Types.I# ww2_s1aC
                  }
                  }}]
 f = \ (w_s1av :: Int) ->
       case w_s1av of { GHC.Types.I# ww1_s1ay ->
       case F.$wf ww1_s1ay of ww2_s1aC { __DEFAULT ->
       GHC.Types.I# ww2_s1aC
       }
       }
 }}}
 Swapping to `f :: forall a. (a ~ Int) => a -> a` gives:
 {{{
 -- RHS size: {terms: 10, types: 21, coercions: 12, joins: 0/0}
 f [InlPrag=NOINLINE] :: forall a. ((a :: *) ~ (Int :: *)) => a -> a
 [GblId, Arity=2, Caf=NoCafRefs, Str=<S(S),1*U(1*U)><S,U>m]
 f = \ (@ a_a13U)
       ($d~_a13W :: (a_a13U :: *) ~ (Int :: *))
       (eta_B1 :: a_a13U) ->
       case GHC.Types.HEq_sc
              @ * @ * @ a_a13U @ Int ($d~_a13W `cast` <Co:5>)
       of co_a14c
       { __DEFAULT ->
       (GHC.Num.$fNumInt_$c+
          (eta_B1 `cast` <Co:2>) (eta_B1 `cast` <Co:2>))
       `cast` <Co:3>
       }
 }}}
 I ran a benchmark to confirm the performance difference:
 {{{
 benchmarking Int -> Int
 time                 12.47 ns   (12.43 ns .. 12.55 ns)
                      1.000 R²   (1.000 R² .. 1.000 R²)
 mean                 12.53 ns   (12.48 ns .. 12.59 ns)
 std dev              173.4 ps   (140.2 ps .. 239.2 ps)
 variance introduced by outliers: 17% (moderately inflated)

 benchmarking (a ~ Int) => a -> a
 time                 15.72 ns   (15.69 ns .. 15.76 ns)
                      1.000 R²   (1.000 R² .. 1.000 R²)
 mean                 15.80 ns   (15.74 ns .. 16.01 ns)
 std dev              327.8 ps   (135.0 ps .. 691.2 ps)
 variance introduced by outliers: 32% (moderately inflated)
 }}}

-- 
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/14941#comment:7>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler


More information about the ghc-tickets mailing list