[Haskell-cafe] Re: GHC Optimization Issue

Brandon Moore brandonm at yahoo-inc.com
Thu Dec 22 10:39:53 EST 2005


I think what's happening here is that the rewrite rule SPECIALIZE is 
producing turns the call to swap16 into a call to the specialized 
verision before any inlining happens[1], and then the specialized 
version can't be inlined[2].

Manually writing out a copy of swap16 at type Word16 -> Word16, marking 
it INLINE and setting up a rewite rule for specialization gives you the 
more efficient code:

changing the definition to swap16 to this:

swap16 :: Bits a => a -> a
swap16 v = (v `shiftR` 8) + ((v .&. 0xFF) `shiftL` 8)

{- # RULES "swap16/Word16" forall (v :: Int16) . swap16 v = swap16Word16 
v #-}
{- # INLINE swap16Word16 #-}

swap16Word16 :: Word16 -> Word16
swap16Word16 v = (v `shiftR` 8) + ((v .&. 0xFF) `shiftL` 8)

gives this Core for foo:
Main.foo :: GHC.Word.Word16
[GlobalId]
[Str: DmdType]
Main.foo = case GHC.Word.Word16 GHC.Word.$wshift3 __word 3855 (-8)
            of ww1 { __DEFAULT ->
            case GHC.Word.Word16 GHC.Word.$wshift3 __word 15 8 of ww { 
__DEFAULT ->
            GHC.Word.W16# (GHC.Prim.narrow16Word# (GHC.Prim.plusWord# 
ww1 ww))
            }
            }

and for swapWord16:

Swap.swap16Word16 :: GHC.Word.Word16 -> GHC.Word.Word16
[GlobalId]
[Arity 1
  NoCafRefs
  Str: DmdType U(L)m]
Swap.swap16Word16 = __inline_me (\ (v :: GHC.Word.Word16) ->
                                    GHC.Word.+2
                                      (GHC.Word.$dmshiftR2 v 
(GHC.Base.I# 8))
                                      (GHC.Word.shift2 (GHC.Word..&.2 v 
Swap.lit) (GHC.Base.I# 8)))


Removing the INLINE directive for swap16Word16 leaves this code for foo, 
with a call to the worker function for swap16Word16:

Main.foo :: GHC.Word.Word16
[GlobalId]
[Str: DmdType]
Main.foo = case GHC.Word.Word16 Swap.$wswap16Word16 __word 3855
            of ww1 { __DEFAULT ->
            GHC.Word.W16# ww1
            }

and these definitions for swap16Word16:

Swap.$wswap16Word16 :: GHC.Prim.Word# -> GHC.Prim.Word#
[GlobalId]
[Arity 1
  NoCafRefs
  Str: DmdType L]
Swap.$wswap16Word16 = \ (ww :: GHC.Prim.Word#) ->
                         case GHC.Prim.Word# GHC.Word.$wshift3 ww (-8) 
of ww1 { __DEFAULT ->
                         case GHC.Prim.Word# GHC.Word.$wshift3 
(GHC.Prim.and# ww __word 255) 8
                         of ww11 { __DEFAULT ->
                         GHC.Prim.narrow16Word# (GHC.Prim.plusWord# ww1 
ww11)
                         }
                         }

Swap.swap16Word16 :: GHC.Word.Word16 -> GHC.Word.Word16
[GlobalId]
[Arity 1
  Worker Swap.$wswap16Word16
  NoCafRefs
  Str: DmdType U(L)m]
Swap.swap16Word16 = __inline_me (\ (w :: GHC.Word.Word16) ->
                                    case GHC.Word.Word16 w of w1 { 
GHC.Word.W16# ww ->
                                    case GHC.Word.Word16 
Swap.$wswap16Word16 ww of ww1 { __DEFAULT ->
                                    GHC.Word.W16# ww1
                                    }
                                    })

It looks like marking the function INLINE supresses the worker/wrapper 
split, and when there is not INLINE and the function get split, only the 
wrapper is eligible for inlining.

Brandon Moore

[1] (section 10.7.2 of the 6.4.1 User's guide says "In the earlier 
phases of compilation, GHC inlines nothing that appears on the LHS of a 
rule, because once you have substituted for something you can't match 
against it (given the simple minded matching).")

[2] Maybe because the worker isn't annotation with __inline_me in the 
Core? Maybe from the aside in 7.10.5 about and INLINE pragma preventing 
any inlining in the RHS of the thing marked inline?


More information about the Haskell-Cafe mailing list