[GHC] #2289: Needless reboxing of values when returning from a tight loop

GHC ghc-devs at haskell.org
Sat Feb 20 02:07:05 UTC 2016


#2289: Needless reboxing of values when returning from a tight loop
-------------------------------------+-------------------------------------
        Reporter:  dons              |                Owner:
            Type:  bug               |               Status:  new
        Priority:  lowest            |            Milestone:
       Component:  Compiler          |              Version:  6.8.2
      Resolution:                    |             Keywords:  boxing,
                                     |  loops, performance
Operating System:  Unknown/Multiple  |         Architecture:
 Type of failure:  Runtime           |  Unknown/Multiple
  performance bug                    |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:  #2387,#1600       |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by twhitehead):

 I was just going to open a ticket to report that I had discovered that the
 compiler can't unbox multiple return values.

 That is, anytime you create a non-inlined function (such as a loop) that
 returns multiple values, you must pay the boxing and unboxing penalty.

 My reading of these tickets (#1600, #2289, and #2387) though is that this
 is already known.  I'll just add a simple example here

 {{{
 #!haskell
 main :: IO ()
 main = case loop2 100 (10,10) of (au,ad) -> print (au - ad)

 loop2 :: Int -> (Int,Int) -> (Int,Int)
 loop2 n (au,ad) | n > 0     = loop2 (n-1) (au+1,ad-1)
                 | otherwise = (au,ad)
 }}}

 yielding (with ''-ddump-simpl -dsuppress-all -dno-suppress-type-
 signatures'')

 {{{
 #!haskell
 Rec {
 main_$s$wloop2 [Occ=LoopBreaker]
   :: Int# -> Int# -> Int# -> (# Int, Int #)
 [GblId, Arity=3, Caf=NoCafRefs, Str=DmdType <L,U><L,U><L,U>]
 main_$s$wloop2 =
   \ (sc_s3vn :: Int#) (sc1_s3vo :: Int#) (sc2_s3vp :: Int#) ->
     case tagToEnum# (># sc_s3vn 0) of _ [Occ=Dead] {
       False -> (# I# sc1_s3vo, I# sc2_s3vp #);
       True ->
         main_$s$wloop2 (-# sc_s3vn 1) (+# sc1_s3vo 1) (-# sc2_s3vp 1)
     }
 end Rec }

 main1 :: State# RealWorld -> (# State# RealWorld, () #)
 [GblId,
  Arity=1,
  Str=DmdType <L,U>,
  Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True,
          WorkFree=True, Expandable=True, Guidance=IF_ARGS [0] 171 0}]
 main1 =
   \ (eta_B1 [OS=OneShot] :: State# RealWorld) ->
     case main_$s$wloop2 100 10 10
     of _ [Occ=Dead] { (# ww1_s3uF, ww2_s3uG #) ->
     hPutStr2
       stdout
       (case ww1_s3uF of _ [Occ=Dead] { I# x_a1d4 ->
        case ww2_s3uG of _ [Occ=Dead] { I# y_a1d8 ->
        case $wshowSignedInt 0 (-# x_a1d4 y_a1d8) ([])
        of _ [Occ=Dead] { (# ww5_a1xQ, ww6_a1xR #) ->
        : ww5_a1xQ ww6_a1xR
        }
        }
        })
       True
       eta_B1
     }
 }}}

 Cheers!  -Tyson

 PS:  The perfect performance storm comes when you have a very fast inner
 loop returning multiple values to an outer loop that evaluates the inner
 loop many many times.

 * The inner loop can't be inlined because it is a loop.
 * The return values from the inner loop must be boxed and unboxed because
 GHC doesn't seem to be able to do otherwise when there are multiple return
 values.

 * The inner loop being very fast means the relative cost of boxing and
 unboxing is high.
 * The outer loop repeating the inner loop many many times turns the high
 relative cost into a high absolute cost.

 In my specific case I was processing around 10G worth of text records.
 The inner loop found the next field separator.  The outer loop built the
 data records.  It was a nasty hit.

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/2289#comment:44>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler


More information about the ghc-tickets mailing list