{-# INLINE me_harder #-}
Duncan Coutts
duncan.coutts at worc.ox.ac.uk
Wed Apr 18 03:21:58 EDT 2007
So I'm trying to program the GHC term rewriting system again (ie the
mighty simplifier) and it's not doing what I want it to do without me
using unnecessarily large hammers.
The crux is that I have a simple function that I want to be inlined
early so that my other rules can match on the thing it expands into,
however despite the {-# INLINE #-} pragma, it doesn't get inlined until
a much later phase when it's too late for the rule to match.
So currently I'm forced to use {-# RULES #-} to achieve the same effect,
ie expanding the definition in an early phase. This feel like a hack of
course. So either the INLINE heuristics could be tweaked to make these
cases work or perhaps we should consider adding some other pragma for
when we don't want to go by heuristics but are instead deliberately
trying to do term rewriting.
Anyway, here's the example. It's binary deserialisation.
We have a couple important primitives:
read :: Int -> (Ptr Word8 -> IO a) -> Get a
and the applicative combinators (<$>) (<*>) for Get:
fmapGet :: (a -> b) -> Get a -> Get b
apGet :: Get (a -> b) -> Get a -> Get b
So that we'll be able to match on these, we delay their inlining:
{-# INLINE [0] apGet #-}
{-# INLINE [0] fmapGet #-}
{-# INLINE [0] read #-}
The important rules linking these are:
{-# RULES
"fmap/read" forall f n a.
f `fmapGet` read n a = ...
"read/read" forall n m f x.
read n f `apGet` read m x = ...
#-}
and finally we have:
word8 :: Get Word8
word8 = read 1 peek
{-# INLINE word8 #-}
So when we write something like:
foo :: Get (Word8,Word8,Word8)
foo = (,,) <$> Get.word8 <*> Get.word8 <*> Get.word8
we want to expand word8 into it's definition in terms of read and then
have the rules fire (which shares the bounds checks).
Looking at the simplifier iterations we get to this step:
==================== Simplifier phase 2, iteration 1 out of 4 ====================
PutTest.foo :: Get.Get (GHC.Word.Word8, GHC.Word.Word8, GHC.Word.Word8)
[Exported]
[]
PutTest.foo =
Get.apGet
@ GHC.Word.Word8
@ (GHC.Word.Word8, GHC.Word.Word8, GHC.Word.Word8)
(Get.apGet
@ GHC.Word.Word8
@ (GHC.Word.Word8 -> (GHC.Word.Word8, GHC.Word.Word8, GHC.Word.Word8))
(Get.fmapGet
@ GHC.Word.Word8
@ (GHC.Word.Word8
-> GHC.Word.Word8
-> (GHC.Word.Word8, GHC.Word.Word8, GHC.Word.Word8))
(Data.Tuple.(,,) @ GHC.Word.Word8 @ GHC.Word.Word8 @ GHC.Word.Word8)
Get.word8)
Get.word8)
Get.word8
Here is where we really must inline word8 or we're going to miss our
opportunity once apGet and fmapGet get inlined.
Sadly it doesn't inline it here (in phase 2), or in the next phase (in
fact it does nothing in phase 1 at all). We get all the way to phase 0
and then go and inline everything (including word8).
The behaviour is the same for 6.6 and 6.7.recentish (one month old).
If we hit word8 with a bigger hammer:
{-# RULES "inline word8" word8 = read 1 peekWord8 #-}
then the whole thing works perfectly and we get really nice STG code at
the end.
I was under the impression that GHC considered a function more
'interesting' if it was mentioned in the LHS of a rule as is the case
here. What is the right thing for me to do in this case? Just use the
rule to do the inlining?
The full source is here:
http://haskell.org/~duncan/binary/
Get.hs and GetTest.hs
(In the same dir is an example showing the opposite problem, GHC
inlining a function when I asked explicitly for it not to. I reported
that problem previously. Though I now realise I never filed a bug in
trac for that one. I'll file bugs for both.)
On the good news side, I'm getting excellent performance results for the
serialisation and judging from the STG code for the deserialisation I
expect it'll be great too.
Duncan
More information about the Glasgow-haskell-users
mailing list