[GHC] #2289: Needless reboxing of values when returning from a tight loop
GHC
ghc-devs at haskell.org
Sat Feb 20 02:07:05 UTC 2016
#2289: Needless reboxing of values when returning from a tight loop
-------------------------------------+-------------------------------------
Reporter: dons | Owner:
Type: bug | Status: new
Priority: lowest | Milestone:
Component: Compiler | Version: 6.8.2
Resolution: | Keywords: boxing,
| loops, performance
Operating System: Unknown/Multiple | Architecture:
Type of failure: Runtime | Unknown/Multiple
performance bug | Test Case:
Blocked By: | Blocking:
Related Tickets: #2387,#1600 | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by twhitehead):
I was just going to open a ticket to report that I had discovered that the
compiler can't unbox multiple return values.
That is, anytime you create a non-inlined function (such as a loop) that
returns multiple values, you must pay the boxing and unboxing penalty.
My reading of these tickets (#1600, #2289, and #2387) though is that this
is already known. I'll just add a simple example here
{{{
#!haskell
main :: IO ()
main = case loop2 100 (10,10) of (au,ad) -> print (au - ad)
loop2 :: Int -> (Int,Int) -> (Int,Int)
loop2 n (au,ad) | n > 0 = loop2 (n-1) (au+1,ad-1)
| otherwise = (au,ad)
}}}
yielding (with ''-ddump-simpl -dsuppress-all -dno-suppress-type-
signatures'')
{{{
#!haskell
Rec {
main_$s$wloop2 [Occ=LoopBreaker]
:: Int# -> Int# -> Int# -> (# Int, Int #)
[GblId, Arity=3, Caf=NoCafRefs, Str=DmdType <L,U><L,U><L,U>]
main_$s$wloop2 =
\ (sc_s3vn :: Int#) (sc1_s3vo :: Int#) (sc2_s3vp :: Int#) ->
case tagToEnum# (># sc_s3vn 0) of _ [Occ=Dead] {
False -> (# I# sc1_s3vo, I# sc2_s3vp #);
True ->
main_$s$wloop2 (-# sc_s3vn 1) (+# sc1_s3vo 1) (-# sc2_s3vp 1)
}
end Rec }
main1 :: State# RealWorld -> (# State# RealWorld, () #)
[GblId,
Arity=1,
Str=DmdType <L,U>,
Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True,
WorkFree=True, Expandable=True, Guidance=IF_ARGS [0] 171 0}]
main1 =
\ (eta_B1 [OS=OneShot] :: State# RealWorld) ->
case main_$s$wloop2 100 10 10
of _ [Occ=Dead] { (# ww1_s3uF, ww2_s3uG #) ->
hPutStr2
stdout
(case ww1_s3uF of _ [Occ=Dead] { I# x_a1d4 ->
case ww2_s3uG of _ [Occ=Dead] { I# y_a1d8 ->
case $wshowSignedInt 0 (-# x_a1d4 y_a1d8) ([])
of _ [Occ=Dead] { (# ww5_a1xQ, ww6_a1xR #) ->
: ww5_a1xQ ww6_a1xR
}
}
})
True
eta_B1
}
}}}
Cheers! -Tyson
PS: The perfect performance storm comes when you have a very fast inner
loop returning multiple values to an outer loop that evaluates the inner
loop many many times.
* The inner loop can't be inlined because it is a loop.
* The return values from the inner loop must be boxed and unboxed because
GHC doesn't seem to be able to do otherwise when there are multiple return
values.
* The inner loop being very fast means the relative cost of boxing and
unboxing is high.
* The outer loop repeating the inner loop many many times turns the high
relative cost into a high absolute cost.
In my specific case I was processing around 10G worth of text records.
The inner loop found the next field separator. The outer loop built the
data records. It was a nasty hit.
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/2289#comment:44>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
More information about the ghc-tickets
mailing list