[GHC] #14936: GHC 8.4 performance regressions when using newtypes

GHC ghc-devs at haskell.org
Sun Mar 18 18:49:25 UTC 2018


#14936: GHC 8.4 performance regressions when using newtypes
-------------------------------------+-------------------------------------
        Reporter:  danilo2           |                Owner:  (none)
            Type:  bug               |               Status:  new
        Priority:  highest           |            Milestone:
       Component:  Compiler          |              Version:  8.4.1
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  None/Unknown      |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by RyanGlScott):

 Here is a slighter smaller example to demonstrate the issue:

 {{{#!hs
 {-# LANGUAGE BangPatterns #-}
 module Bug2 where

 import Control.Monad.Trans.State.Strict

 newtype Foo a = Foo a

 slowGo :: Int -> StateT (Foo (Int, Int)) IO ()
 slowGo 0 = pure ()
 slowGo j = do
   Foo (!_, !off) <- get
   slowGo (j - 1)

 fastGo :: Int -> StateT (Int, Int) IO ()
 fastGo 0 = pure ()
 fastGo j = do
   (!_, !off) <- get
   fastGo (j - 1)
 }}}

 In GHC 8.2.2, if you compare the Core between these two functions (in the
 `_$s_$w` functions that perform most of the work):

 {{{
 $ /opt/ghc/8.2.2/bin/ghc Bug2.hs -O2 -fforce-recomp -ddump-simpl
 -dsuppress-idinfo -dsuppress-uniques -dsuppress-module-prefixes
 -dsuppress-coercions
 [1 of 1] Compiling Bug2             ( Bug2.hs, Bug2.o )

 ==================== Tidy Core ====================
 Result size of Tidy Core
   = {terms: 190, types: 298, coercions: 60, joins: 0/0}

 ...

 Rec {
 -- RHS size: {terms: 24, types: 23, coercions: 0, joins: 0/0}
 fastGo_$s$wfastGo
   :: State# RealWorld
      -> Int# -> Int# -> Int# -> (# State# RealWorld, ((), (Int, Int)) #)
 fastGo_$s$wfastGo
   = \ (sc :: State# RealWorld)
       (sc1 :: Int#)
       (sc2 :: Int#)
       (sc3 :: Int#) ->
       case sc3 of ds {
         __DEFAULT -> fastGo_$s$wfastGo sc sc1 sc2 (-# ds 1#);
         0# -> (# sc, ((), (I# sc1, I# sc2)) #)
       }
 end Rec }

 ...

 Rec {
 -- RHS size: {terms: 25, types: 37, coercions: 6, joins: 0/0}
 slowGo_$s$wslowGo
   :: State# RealWorld
      -> Int#
      -> Int#
      -> ((Int, Int) :: *) ~R# (Foo (Int, Int) :: *) =>
         Int# -> (# State# RealWorld, ((), Foo (Int, Int)) #)
 slowGo_$s$wslowGo
   = \ (sc :: State# RealWorld)
       (sc1 :: Int#)
       (sc2 :: Int#)
       (sg :: ((Int, Int) :: *) ~R# (Foo (Int, Int) :: *))
       (sc3 :: Int#) ->
       case sc3 of ds {
         __DEFAULT -> slowGo_$s$wslowGo sc sc1 sc2 @~ <Co:5> (-# ds 1#);
         0# -> (# sc, ((), (I# sc1, I# sc2) `cast` <Co:1>) #)
       }
 end Rec }
 }}}

 Then they are essentially identical (the `slowGo` one has an extra
 argument of type `((Int, Int) :: *) ~R# (Foo (Int, Int) :: *)`, but that
 is zero-width, so it shouldn't have any effect at runtime).

 On the other hand, in GHC 8.4.1:

 {{{
 $ ~/Software/ghc-8.4.1/bin/ghc Bug2.hs -O2 -fforce-recomp -ddump-simpl
 -dsuppress-idinfo -dsuppress-uniques -dsuppress-module-prefixes
 -dsuppress-coercions
 [1 of 1] Compiling Bug2             ( Bug2.hs, Bug2.o )

 ==================== Tidy Core ====================
 Result size of Tidy Core
   = {terms: 163, types: 231, coercions: 54, joins: 0/0}

 ...

 Rec {
 -- RHS size: {terms: 24, types: 23, coercions: 0, joins: 0/0}
 fastGo_$s$wfastGo
   :: State# RealWorld
      -> Int# -> Int# -> Int# -> (# State# RealWorld, ((), (Int, Int)) #)
 fastGo_$s$wfastGo
   = \ (sc :: State# RealWorld)
       (sc1 :: Int#)
       (sc2 :: Int#)
       (sc3 :: Int#) ->
       case sc3 of ds {
         __DEFAULT -> fastGo_$s$wfastGo sc sc1 sc2 (-# ds 1#);
         0# -> (# sc, ((), (I# sc1, I# sc2)) #)
       }
 end Rec }

 ...

 Rec {
 -- RHS size: {terms: 27, types: 34, coercions: 9, joins: 0/0}
 $wslowGo
   :: Int#
      -> Foo (Int, Int)
      -> State# RealWorld
      -> (# State# RealWorld, ((), Foo (Int, Int)) #)
 $wslowGo
   = \ (ww :: Int#) (w :: Foo (Int, Int)) (w1 :: State# RealWorld) ->
       case ww of ds {
         __DEFAULT ->
           case w `cast` <Co:4> of wild { (ds1, off) ->
           case ds1 of { I# ipv ->
           case off of { I# ipv1 ->
           $wslowGo (-# ds 1#) (wild `cast` <Co:5>) w1
           }
           }
           };
         0# -> (# w1, ((), w) #)
       }
 end Rec }
 }}}

 This time, `slowGo` doesn't have something akin to `slowGo_$s$wslowGo`.
 Instead, it performs the body of the loop in `$wslowGo`, which uses `Foo
 (Int, Int)` instead of two unboxed `Int#` arguments. I could imagine that
 this alone contributes to the slowdown.

-- 
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/14936#comment:2>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler


More information about the ghc-tickets mailing list