# [Haskell] Expecting more inlining for bit shifting

Simon Peyton-Jones simonpj at microsoft.com
Mon Oct 9 03:47:07 EDT 2006

```[Redirecting to GHC users.]

Turns out that 'shift' is just too big to be inlined.  (It's only called
once, but you have exported it too.)

You can see GHC's inlining decisions by saying -ddump-inlinings.

To make GHC keener to inline, use an INLINE pragma, or increase the
inlining size threshold e.g. -funfolding-threshold=12

Simon

| -----Original Message-----
On Behalf Of
| roconnor at theorem.ca
| Sent: 09 October 2006 00:41
| Subject: [Haskell] Expecting more inlining for bit shifting
|
| Consider the following GHC code:
|
| module Main where
|
| import GHC.Word
| import GHC.Base
| import GHC.Prim
| import Random
|
| a `shiftRLT` b | b >=# 32# = int2Word# 0#
|                 | otherwise = a `uncheckedShiftRL#` b
|
| (W32# x#) `shift` (I# i#)
|          | i# >=# 0#            = W32# (narrow32Word# (x# `shiftL#`
i#))
|          | otherwise            = W32# (x# `shiftRLT` negateInt# i#)
|
| x `shiftR`  i = x `shift`  (-i)
|
| shift7 x = x `shiftR` 7
| shift6 (W32# x) = (W32# (x `uncheckedShiftRL#` 6#))
|
| main = do
|    xs <- sequence (replicate 1000000
|            (fmap (shift7 . fromIntegral) (randomIO::IO Int)))
|    print (sum xs)
|
| I have copied the definition of `shiftR` for Word32 into this file.
|
| Suppose we want to shift a series of numbers by 7 bits.  One would
expect
| GHC's inliner to notice that (-7) is indeed not greater than 0, and
| eliminate the branch in the definition of `shift`.  Further one would
| expect GHC to notice that 7 is indeed not gtreater than 32, and
eliminate
| the branch in shiftRLT.  Thus one would expect the code generated by
using
| shift7 to be identical to that being generated by shfit6 (with 7
replaced
| by 6).
|
| But this appears not to be the case.  The code generated for shift7
(if I
| can read the C code correctly) is:
| Sp[-1] = (-0x7U);
| Sp[-2] = R1.p[1];
| *Sp = (W_)&s2za_info;
| Sp=Sp-2;
| JMP_((W_)&Main_zdwshift_info);
|
| while the code generated for shift6 is the lovely:
|
| Hp=Hp+2;
| if ((W_)Hp > (W_)HpLim) goto _c2Aa;
| _s2xq = (R1.p[1]) >> 0x6U;
| Hp[-1] = (W_)&GHCziWord_W32zh_con_info;
| *Hp = _s2xq;
| R1.p=Hp-1;
| Sp=Sp+1;
| JMP_(*Sp);
| _c2Aa:
| HpAlloc = 0x8U;
| JMP_(stg_gc_enter_1);
|
| My question is, why the discrepency?
|
| --
| Russell O'Connor                                      <http://r6.ca/>
| ``All talk about `theft,''' the general counsel of the American
Graphophone
| Company wrote, ``is the merest claptrap, for there exists no property
in
| ideas musical, literary or artistic, except as defined by statute.''
| _______________________________________________