[Haskell] Expecting more inlining for bit shifting
Simon PeytonJones
simonpj at microsoft.com
Mon Oct 9 03:47:07 EDT 2006
[Redirecting to GHC users.]
Turns out that 'shift' is just too big to be inlined. (It's only called
once, but you have exported it too.)
You can see GHC's inlining decisions by saying ddumpinlinings.
To make GHC keener to inline, use an INLINE pragma, or increase the
inlining size threshold e.g. funfoldingthreshold=12
Simon
 Original Message
 From: haskellbounces at haskell.org [mailto:haskellbounces at haskell.org]
On Behalf Of
 roconnor at theorem.ca
 Sent: 09 October 2006 00:41
 To: haskell at haskell.org
 Subject: [Haskell] Expecting more inlining for bit shifting

 Consider the following GHC code:

 module Main where

 import GHC.Word
 import GHC.Base
 import GHC.Prim
 import Random

 a `shiftRLT` b  b >=# 32# = int2Word# 0#
  otherwise = a `uncheckedShiftRL#` b

 (W32# x#) `shift` (I# i#)
  i# >=# 0# = W32# (narrow32Word# (x# `shiftL#`
i#))
  otherwise = W32# (x# `shiftRLT` negateInt# i#)

 x `shiftR` i = x `shift` (i)

 shift7 x = x `shiftR` 7
 shift6 (W32# x) = (W32# (x `uncheckedShiftRL#` 6#))

 main = do
 xs < sequence (replicate 1000000
 (fmap (shift7 . fromIntegral) (randomIO::IO Int)))
 print (sum xs)

 I have copied the definition of `shiftR` for Word32 into this file.

 Suppose we want to shift a series of numbers by 7 bits. One would
expect
 GHC's inliner to notice that (7) is indeed not greater than 0, and
 eliminate the branch in the definition of `shift`. Further one would
 expect GHC to notice that 7 is indeed not gtreater than 32, and
eliminate
 the branch in shiftRLT. Thus one would expect the code generated by
using
 shift7 to be identical to that being generated by shfit6 (with 7
replaced
 by 6).

 But this appears not to be the case. The code generated for shift7
(if I
 can read the C code correctly) is:
 Sp[1] = (0x7U);
 Sp[2] = R1.p[1];
 *Sp = (W_)&s2za_info;
 Sp=Sp2;
 JMP_((W_)&Main_zdwshift_info);

 while the code generated for shift6 is the lovely:

 Hp=Hp+2;
 if ((W_)Hp > (W_)HpLim) goto _c2Aa;
 _s2xq = (R1.p[1]) >> 0x6U;
 Hp[1] = (W_)&GHCziWord_W32zh_con_info;
 *Hp = _s2xq;
 R1.p=Hp1;
 Sp=Sp+1;
 JMP_(*Sp);
 _c2Aa:
 HpAlloc = 0x8U;
 JMP_(stg_gc_enter_1);

 My question is, why the discrepency?

 
 Russell O'Connor <http://r6.ca/>
 ``All talk about `theft,''' the general counsel of the American
Graphophone
 Company wrote, ``is the merest claptrap, for there exists no property
in
 ideas musical, literary or artistic, except as defined by statute.''
 _______________________________________________
 Haskell mailing list
 Haskell at haskell.org
 http://www.haskell.org/mailman/listinfo/haskell
More information about the Glasgowhaskellusers
mailing list