[Haskell-cafe] Re: GHC Optimization Issue
Brandon Moore
brandonm at yahoo-inc.com
Thu Dec 22 10:39:53 EST 2005
I think what's happening here is that the rewrite rule SPECIALIZE is
producing turns the call to swap16 into a call to the specialized
verision before any inlining happens[1], and then the specialized
version can't be inlined[2].
Manually writing out a copy of swap16 at type Word16 -> Word16, marking
it INLINE and setting up a rewite rule for specialization gives you the
more efficient code:
changing the definition to swap16 to this:
swap16 :: Bits a => a -> a
swap16 v = (v `shiftR` 8) + ((v .&. 0xFF) `shiftL` 8)
{- # RULES "swap16/Word16" forall (v :: Int16) . swap16 v = swap16Word16
v #-}
{- # INLINE swap16Word16 #-}
swap16Word16 :: Word16 -> Word16
swap16Word16 v = (v `shiftR` 8) + ((v .&. 0xFF) `shiftL` 8)
gives this Core for foo:
Main.foo :: GHC.Word.Word16
[GlobalId]
[Str: DmdType]
Main.foo = case GHC.Word.Word16 GHC.Word.$wshift3 __word 3855 (-8)
of ww1 { __DEFAULT ->
case GHC.Word.Word16 GHC.Word.$wshift3 __word 15 8 of ww {
__DEFAULT ->
GHC.Word.W16# (GHC.Prim.narrow16Word# (GHC.Prim.plusWord#
ww1 ww))
}
}
and for swapWord16:
Swap.swap16Word16 :: GHC.Word.Word16 -> GHC.Word.Word16
[GlobalId]
[Arity 1
NoCafRefs
Str: DmdType U(L)m]
Swap.swap16Word16 = __inline_me (\ (v :: GHC.Word.Word16) ->
GHC.Word.+2
(GHC.Word.$dmshiftR2 v
(GHC.Base.I# 8))
(GHC.Word.shift2 (GHC.Word..&.2 v
Swap.lit) (GHC.Base.I# 8)))
Removing the INLINE directive for swap16Word16 leaves this code for foo,
with a call to the worker function for swap16Word16:
Main.foo :: GHC.Word.Word16
[GlobalId]
[Str: DmdType]
Main.foo = case GHC.Word.Word16 Swap.$wswap16Word16 __word 3855
of ww1 { __DEFAULT ->
GHC.Word.W16# ww1
}
and these definitions for swap16Word16:
Swap.$wswap16Word16 :: GHC.Prim.Word# -> GHC.Prim.Word#
[GlobalId]
[Arity 1
NoCafRefs
Str: DmdType L]
Swap.$wswap16Word16 = \ (ww :: GHC.Prim.Word#) ->
case GHC.Prim.Word# GHC.Word.$wshift3 ww (-8)
of ww1 { __DEFAULT ->
case GHC.Prim.Word# GHC.Word.$wshift3
(GHC.Prim.and# ww __word 255) 8
of ww11 { __DEFAULT ->
GHC.Prim.narrow16Word# (GHC.Prim.plusWord# ww1
ww11)
}
}
Swap.swap16Word16 :: GHC.Word.Word16 -> GHC.Word.Word16
[GlobalId]
[Arity 1
Worker Swap.$wswap16Word16
NoCafRefs
Str: DmdType U(L)m]
Swap.swap16Word16 = __inline_me (\ (w :: GHC.Word.Word16) ->
case GHC.Word.Word16 w of w1 {
GHC.Word.W16# ww ->
case GHC.Word.Word16
Swap.$wswap16Word16 ww of ww1 { __DEFAULT ->
GHC.Word.W16# ww1
}
})
It looks like marking the function INLINE supresses the worker/wrapper
split, and when there is not INLINE and the function get split, only the
wrapper is eligible for inlining.
Brandon Moore
[1] (section 10.7.2 of the 6.4.1 User's guide says "In the earlier
phases of compilation, GHC inlines nothing that appears on the LHS of a
rule, because once you have substituted for something you can't match
against it (given the simple minded matching).")
[2] Maybe because the worker isn't annotation with __inline_me in the
Core? Maybe from the aside in 7.10.5 about and INLINE pragma preventing
any inlining in the RHS of the thing marked inline?
More information about the Haskell-Cafe
mailing list