[Haskell] Expecting more inlining for bit shifting
Simon Peyton-Jones
simonpj at microsoft.com
Mon Oct 9 03:47:07 EDT 2006
[Redirecting to GHC users.]
Turns out that 'shift' is just too big to be inlined. (It's only called
once, but you have exported it too.)
You can see GHC's inlining decisions by saying -ddump-inlinings.
To make GHC keener to inline, use an INLINE pragma, or increase the
inlining size threshold e.g. -funfolding-threshold=12
Simon
| -----Original Message-----
| From: haskell-bounces at haskell.org [mailto:haskell-bounces at haskell.org]
On Behalf Of
| roconnor at theorem.ca
| Sent: 09 October 2006 00:41
| To: haskell at haskell.org
| Subject: [Haskell] Expecting more inlining for bit shifting
|
| Consider the following GHC code:
|
| module Main where
|
| import GHC.Word
| import GHC.Base
| import GHC.Prim
| import Random
|
| a `shiftRLT` b | b >=# 32# = int2Word# 0#
| | otherwise = a `uncheckedShiftRL#` b
|
| (W32# x#) `shift` (I# i#)
| | i# >=# 0# = W32# (narrow32Word# (x# `shiftL#`
i#))
| | otherwise = W32# (x# `shiftRLT` negateInt# i#)
|
| x `shiftR` i = x `shift` (-i)
|
| shift7 x = x `shiftR` 7
| shift6 (W32# x) = (W32# (x `uncheckedShiftRL#` 6#))
|
| main = do
| xs <- sequence (replicate 1000000
| (fmap (shift7 . fromIntegral) (randomIO::IO Int)))
| print (sum xs)
|
| I have copied the definition of `shiftR` for Word32 into this file.
|
| Suppose we want to shift a series of numbers by 7 bits. One would
expect
| GHC's inliner to notice that (-7) is indeed not greater than 0, and
| eliminate the branch in the definition of `shift`. Further one would
| expect GHC to notice that 7 is indeed not gtreater than 32, and
eliminate
| the branch in shiftRLT. Thus one would expect the code generated by
using
| shift7 to be identical to that being generated by shfit6 (with 7
replaced
| by 6).
|
| But this appears not to be the case. The code generated for shift7
(if I
| can read the C code correctly) is:
| Sp[-1] = (-0x7U);
| Sp[-2] = R1.p[1];
| *Sp = (W_)&s2za_info;
| Sp=Sp-2;
| JMP_((W_)&Main_zdwshift_info);
|
| while the code generated for shift6 is the lovely:
|
| Hp=Hp+2;
| if ((W_)Hp > (W_)HpLim) goto _c2Aa;
| _s2xq = (R1.p[1]) >> 0x6U;
| Hp[-1] = (W_)&GHCziWord_W32zh_con_info;
| *Hp = _s2xq;
| R1.p=Hp-1;
| Sp=Sp+1;
| JMP_(*Sp);
| _c2Aa:
| HpAlloc = 0x8U;
| JMP_(stg_gc_enter_1);
|
| My question is, why the discrepency?
|
| --
| Russell O'Connor <http://r6.ca/>
| ``All talk about `theft,''' the general counsel of the American
Graphophone
| Company wrote, ``is the merest claptrap, for there exists no property
in
| ideas musical, literary or artistic, except as defined by statute.''
| _______________________________________________
| Haskell mailing list
| Haskell at haskell.org
| http://www.haskell.org/mailman/listinfo/haskell
More information about the Glasgow-haskell-users
mailing list