Fun with GHC's optimiser
Simon Peyton-Jones
simonpj@microsoft.com
Fri, 8 Dec 2000 05:57:55 -0800
PArrays.$w$snewPArray
= \ ww :: PrelGHC.Int# w :: PrelBase.Int ->
case PrelGHC.newIntArray# @ PrelGHC.RealWorld ww PrelGHC.realWorld#
of wild { (# s2#, mba# #) ->
let {
eta :: PrelBase.Int
__A 0 __C
eta
= PrelBase.$wI# ww
} in
case PrelGHC.-# ww 1 of sat { __DEFAULT ->
case PrelGHC.># 0 sat of wild1 {
PrelBase.True ->
PArrays.$wPArray
@ PrelBase.Int eta (__coerce PrelGHC.ByteArray# mba#);
PrelBase.False ->
case w of w1 { PrelBase.I# ww1 ->
case PrelGHC.writeIntArray# @ PrelGHC.RealWorld mba# 0 ww1
s2#
of s2#1 { __DEFAULT ->
case PrelGHC.-# ww 1 of wild2 {
0 ->
PArrays.$wPArray
@ PrelBase.Int eta (__coerce
PrelGHC.ByteArray# mba#);
__DEFAULT ->
let {
a4 :: (PArrays.PArray PrelBase.Int)
__A 0 __C
a4 =
PArrays.$wPArray
@ PrelBase.Int eta (__coerce
PrelGHC.ByteArray# mba#) } in
__letrec {
$wgo :: (PrelGHC.Int#
-> PrelGHC.State# PrelGHC.RealWorld
-> (PrelGHC.State#
PrelGHC.RealWorld,
PArrays.PArray PrelBase.Int))
__A 2 __C
$wgo
= \ w2 :: PrelGHC.Int# w3 :: (PrelGHC.State#
PrelGHC.RealWorld) ->
case PrelGHC.writeIntArray# @
PrelGHC.RealWorld mba# w2 ww1 w3
of s2#2 { __DEFAULT ->
case PrelGHC.-# ww 1 of sat1 { __DEFAULT
->
case PrelGHC.==# w2 sat1 of wild11 {
PrelBase.True -> (# s2#2, a4 #);
PrelBase.False ->
case PrelGHC.+# w2 1 of sat2 {
__DEFAULT ->
$wgo sat2 s2#2
}
}
}
};
} in case $wgo 1 s2#1 of wild3 { (# ds, r #) -> r
}
}
}
}
}
}
}
| However, we came across one problem (read, lack of
| optimisation on GHC's part), which leads to tedious
| duplication of a lot of code in our array library.
| Basically, GHC does not recognise for tail recursive
| functions when certain arguments (accumulators maintained in
| a loop) can be unboxed. This leads to massive overheads in
| our code. Currently, we circumvent the inefficiency by
| having manually specialised versions of the loops for
| different accumulator types and using RULES to select them
| where appropriate (based on the type information). I will
| send you some example code illustrating the problem soon.
Yes please.
Meanwhile, in the head, if you specify -O2 you'll get better
code for some of your loops, namely the ones that repeatedly
evaluate a free variable every time round the loop. The
loop is duplicated and everything is nice. Here's a tiny example:
g :: Int -> Int
g x = let h 0 = 0
h y = if x==0 then y else h (y+1)
in
h 88
This now compiles to:
LC.g
= \ x :: PrelBase.Int ->
case x of wild { PrelBase.I# x1 ->
case x1 of wild1 {
0 -> PrelBase.$wI# 88;
__DEFAULT ->
__letrec {
$wh :: (PrelGHC.Int# -> PrelBase.Int)
__A 1 __C
$wh
= \ ww :: PrelGHC.Int# ->
case ww of wild2 {
0 -> LC.lit;
__DEFAULT ->
case PrelGHC.+# wild2 1 of sat { __DEFAULT
-> $wh sat }
};
} in $wh 89
}
}
You mentioned something like this before.
The amount of code duplication is controlled by -fliberate-case-threshold20
(say)
The default is set pretty low. And it all only happens with -O2.
All in the HEAD. The pass is call simplCore/LiberateCase.
Simon