[GHC] #13397: Optimise calls to tagToEnum#
GHC
ghc-devs at haskell.org
Wed Mar 8 13:15:56 UTC 2017
#13397: Optimise calls to tagToEnum#
-------------------------------------+-------------------------------------
Reporter: simonpj                    | Owner: (none)
Type: bug                            | Status: new
Priority: normal                     | Milestone:
Component: Compiler                  | Version: 8.0.1
Resolution:                          | Keywords:
Operating System: Unknown/Multiple   | Architecture: Unknown/Multiple
Type of failure: None/Unknown        | Test Case:
Blocked By:                          | Blocking:
Related Tickets:                     | Differential Rev(s):
Wiki Page:                           |
-------------------------------------+-------------------------------------
Comment (by simonpj):
I have committed these patches to branch `wip/spj-T13397`:
{{{
commit 43540c8c6b9e914f302c71213a71ab5c780be2ac
Author: Simon Peyton Jones <simonpj at microsoft.com>
Date:   Wed Mar 8 11:05:53 2017 +0000

    Improve code generation for conditionals

    This patch is in preparation for the fix to Trac #13397

    The code generator has a special case for
        case tagToEnum (a>#b) of
          False -> e1
          True  -> e2

    but it was not doing nearly so well on
        case a>#b of
          DEFAULT -> e1
          1#      -> e2

    This patch arranges to behave essentially identically in
    both cases. In due course we can eliminate the special
    case for tagToEnum#, once we've completed Trac #13397.

    The changes are:

    * Make CmmSink swizzle the order of a conditional where necessary;
      see Note [Improving conditionals] in CmmSink

    * Hack the general case of StgCmmExpr.cgCase so that it uses
      NoGcInAlts for conditionals. This doesn't seem right, but it's
      the same choice as the tagToEnum version. Without it, code size
      increases a lot (more heap checks).

      There's a loose end here.

    * Add comments in CmmOpt.cmmMachOpFoldM

commit e49f3154a5ceb1894414f4635579aeb3aa84054f
Author: Simon Peyton Jones <simonpj at microsoft.com>
Date:   Wed Mar 8 10:26:47 2017 +0000

    Re-engineer caseRules to add tagToEnum/dataToTag

    See Note [Scrutinee Constant Folding] in SimplUtils

    * Add cases for tagToEnum and dataToTag. This is the main new
      bit. It allows the simplifier to remove the pervasive uses of
          case tagToEnum (a > b) of
            False -> e1
            True  -> e2
      and replace it by the simpler
          case a > b of
            DEFAULT -> e1
            1#      -> e2
      See Note [caseRules for tagToEnum]
      and Note [caseRules for dataToTag] in PrelRules.

    * This required some changes to the API of caseRules, and hence
      to code in SimplUtils. See Note [Scrutinee Constant Folding]
      in SimplUtils.

    * Avoid duplication of work in the (unusual) case of
          case BIG + 3# of b
            DEFAULT -> e1
            6#      -> e2
      Previously we got
          case BIG of
            DEFAULT -> let b = BIG + 3# in e1
            3#      -> let b = 6#       in e2
      Now we get
          case BIG of b'
            DEFAULT -> let b = b' + 3# in e1
            3#      -> let b = 6#      in e2

    * Avoid duplicated code in caseRules

    A knock-on refactoring:

    * Move Note [Word/Int underflow/overflow] to Literal, as
      documentation to accompany mkMachIntWrap etc; and get
      rid of PrelRules.intResult' in favour of mkMachIntWrap
}}}
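For readers who want to see the two rewrites at source level, here is a minimal sketch. It is not code from the patch: the function names (`viaTagToEnum`, `viaRawTag`, `addThenMatch`, `matchThenAdd`) are made up for illustration, and the strings `"e1"`/`"e2"` just stand in for the branch bodies. It only shows that the "before" and "after" shapes described in the commit messages agree on their results; it says nothing about the code GHC actually generates for them.

```haskell
{-# LANGUAGE MagicHash #-}
module Main where

import GHC.Exts (Int (I#), tagToEnum#, (+#), (>#))

-- Shape 1: convert the Int# comparison result to a Bool, then branch on it.
viaTagToEnum :: Int -> Int -> String
viaTagToEnum (I# a) (I# b) =
  case (tagToEnum# (a ># b) :: Bool) of
    False -> "e1"
    True  -> "e2"

-- Shape 2: branch directly on the Int# result (1# means the comparison held).
viaRawTag :: Int -> Int -> String
viaRawTag (I# a) (I# b) =
  case a ># b of
    1# -> "e2"
    _  -> "e1"

-- Scrutinee constant folding, "before": match on the result of the addition.
addThenMatch :: Int -> Int
addThenMatch (I# big) =
  case big +# 3# of
    6# -> 100
    b  -> I# b

-- "After": match on the operand itself (6# becomes 3#) and rebuild the sum
-- from the case binder b' in the default branch, so big is evaluated once.
matchThenAdd :: Int -> Int
matchThenAdd (I# big) =
  case big of
    3# -> 100
    b' -> I# (b' +# 3#)

main :: IO ()
main = do
  print (and [viaTagToEnum x y == viaRawTag x y | x <- [0 .. 3], y <- [0 .. 3]])
  print (and [addThenMatch n == matchThenAdd n | n <- [0 .. 5]])
```

Both lines should print `True`. To compare what the simplifier does with each shape, one option is to compile with `-O -ddump-simpl` and read the resulting Core.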
It's good stuff generally, so I'm quite keen to keep it. It does indeed
eliminate the annoying `tagToEnum#` stuff.
I get the `nofib` results below. There are some odd things happening,
which is why I have not committed to HEAD.
* I did not expect binary sizes to change, but they do wobble around a bit,
with a tiny net increase
* I did not expect allocations to change. I chased down the change in
`knights`: it was due to increased closure sizes. That in turn was due to
better CSE, which is a good thing (just made more live variables). So I
think I'm ok with that. Allocations sometimes go down too. Net zero.
* There are some troubling increases in execution time. Notably, `n-body`
really does run slower, repeatably. I think. I have no idea why. I think
the C-- code is the same... but perhaps we are somehow generating worse
assembly code.
{{{
Program               Size    Allocs   Runtime   Elapsed  TotalMem
--------------------------------------------------------------------------------
anna                 -0.0%     -0.7%      0.16      0.16     +0.0%
ansi                 +0.1%     +0.0%      0.00      0.00     +0.0%
atom                 +0.2%     +0.0%     -2.1%     -2.1%     +0.0%
awards               +0.2%     +0.0%      0.00      0.00     +0.0%
banner               -0.0%     +0.0%      0.00      0.00     +0.0%
bernouilli           +0.4%     -0.0%     -0.7%     -0.9%     +0.0%
binary-trees         +0.5%     -0.0%     +4.6%     +4.7%     +0.0%
boyer                +0.0%     +0.0%      0.06      0.06     +0.0%
boyer2               -0.0%     +0.0%      0.01      0.01     +0.0%
bspt                 +0.0%     +0.0%      0.01      0.01     +0.0%
cacheprof            +0.0%     -0.0%     +1.6%     +2.3%     -1.8%
calendar             +0.0%     +0.0%      0.00      0.00     +0.0%
cichelli             -0.0%     +0.0%      0.12      0.12     +0.0%
circsim              +0.1%     -0.0%     +0.7%     +0.7%     +0.0%
clausify             +0.2%     +0.0%      0.05      0.05     +0.0%
comp_lab_zift        -0.0%     +0.0%     +0.3%     +0.2%     +0.0%
compress             -0.0%     +0.0%     -0.7%     +0.4%     +0.0%
compress2            +0.3%     +0.0%     +2.3%     +2.4%     +0.0%
constraints          +0.0%     +0.0%     +3.2%     +3.2%     +0.0%
cryptarithm1         -0.0%     +0.0%     -9.0%     -9.0%     +0.0%
cryptarithm2         -0.0%     +0.0%      0.01      0.01     +0.0%
cse                  -0.0%     +0.0%      0.00      0.00     +0.0%
digits-of-e1         +0.4%     +0.0%     +2.9%     +2.9%     +0.0%
digits-of-e2         +0.3%     +0.0%     -2.1%     -2.1%     +0.0%
eliza                -0.0%     +0.0%      0.00      0.00     +0.0%
event                +0.0%     +0.0%     +0.3%     +0.3%     +0.0%
exp3_8               +0.2%     +0.0%     +0.7%     +0.7%     +0.0%
expert               +0.1%     +0.0%      0.00      0.00     +0.0%
fannkuch-redux       -0.0%     -0.0%     -1.1%     -1.1%     +0.0%
fasta                +0.0%     +0.0%     +0.5%     -0.2%     +0.0%
fem                  +0.4%     +0.0%      0.04      0.04     +0.0%
fft                  +0.2%     -0.4%      0.06      0.06     +0.0%
fft2                 +0.2%     -0.1%      0.08      0.08     +0.0%
fibheaps             +0.0%     +0.0%      0.03      0.03     +0.0%
fish                 -0.0%     +0.0%      0.02      0.02     +0.0%
fluid                +0.2%     +0.0%      0.01      0.01     +0.0%
fulsom               +0.1%     +0.0%     +0.1%     +0.0%     +0.0%
gamteb               +0.2%     +0.0%      0.07      0.07     +0.0%
gcd                  +0.3%     +0.0%      0.09      0.09     +0.0%
gen_regexps          -0.0%     +0.0%      0.00      0.00     +0.0%
genfft               -0.1%     -0.2%      0.06      0.06     +0.0%
gg                   +0.1%     +0.0%      0.02      0.02     +0.0%
grep                 -0.1%     +0.0%      0.00      0.00     +0.0%
hidden               +0.4%     +0.0%     +2.8%     +2.9%     +0.0%
hpg                  +0.1%     -0.0%     -1.9%     -2.1%     +0.0%
ida                  +0.0%     +0.0%      0.10      0.10     +0.0%
infer                -0.0%     +0.0%      0.10      0.10     +0.0%
integer              +0.5%     +0.0%     +1.6%     +1.6%     +0.0%
integrate            +0.2%     +0.0%     +4.8%     +5.0%     +0.0%
k-nucleotide         +0.1%     -0.1%     -1.5%     -1.6%     +0.0%
kahan                +0.2%     +0.0%     +1.6%     +1.6%     +0.0%
knights              +0.2%     +1.3%      0.01      0.01     +0.0%
lambda               +0.0%     +0.0%     +6.5%     +6.5%     +0.0%
last-piece           -0.1%     +0.3%     +2.4%     +2.5%     +0.0%
lcss                 +0.0%     +0.0%     +2.7%     +2.7%     +0.0%
life                 +0.1%     +0.0%     +0.8%     +1.0%     +0.0%
lift                 -0.0%     +0.0%      0.00      0.00     +0.0%
listcompr            -0.0%     +0.0%      0.18      0.18     +0.0%
listcopy             -0.0%     +0.0%      0.19      0.19     +0.0%
maillist             -0.0%     -0.0%      0.08      0.09     -5.3%
mandel               +0.5%     +0.0%      0.13      0.13     +0.0%
mandel2              -0.0%     +0.0%      0.01      0.01     +0.0%
minimax              -0.0%     +0.0%      0.01      0.01     +0.0%
mkhprog              -0.0%     +0.0%      0.00      0.00     +0.0%
multiplier           +0.0%     +0.0%      0.19      0.19     +0.0%
n-body               +0.2%     +0.0%    +14.6%    +14.6%     +0.0%
nucleic2             +0.2%     +0.0%      0.11      0.11     +0.0%
para                 -0.0%     +0.0%     -1.5%     -1.5%     +0.0%
paraffins            -0.1%     -0.1%      0.19      0.20     +0.0%
parser               -0.6%     +0.0%      0.04      0.04     +0.0%
parstof              -0.0%     +0.0%      0.01      0.01     +0.0%
pic                  -0.3%     +1.1%      0.01      0.01     +0.0%
pidigits             +0.3%     +0.0%     -0.0%     -0.0%     +0.0%
power                +0.2%     +0.0%     +2.0%     +2.2%     +0.0%
pretty               +0.3%     +0.0%      0.00      0.00     +0.0%
primes               +0.0%     +0.0%      0.11      0.11     +0.0%
primetest            +0.4%     +0.0%      0.13      0.13     +0.0%
prolog               +0.2%     +0.0%      0.00      0.00     +0.0%
puzzle               -0.0%     +0.0%      0.20      0.20     +0.0%
queens               +0.0%     +0.0%      0.02      0.02     +0.0%
reptile              -0.1%     +0.0%      0.02      0.02     +0.0%
reverse-complem      -0.0%     +0.0%     +2.4%     +2.4%     +0.0%
rewrite              +0.0%     +0.0%      0.03      0.03     +0.0%
rfib                 +0.5%     +0.0%      0.03      0.03     +0.0%
rsa                  +0.4%     +0.0%      0.03      0.03     +0.0%
scc                  -0.0%     +0.0%      0.00      0.00     +0.0%
sched                +0.0%     +0.0%      0.03      0.03     +0.0%
scs                  +0.0%     +0.8%     +7.6%     +7.5%     +0.0%
simple               +0.1%     +0.0%     +4.9%     +5.0%     +0.0%
solid                +0.2%     +0.0%      0.19      0.19     +0.0%
sorting              -0.0%     +0.0%      0.00      0.00     +0.0%
spectral-norm        +0.2%     +0.0%     +1.5%     +1.5%     +0.0%
sphere               +0.0%     +0.0%      0.08      0.08     +0.0%
symalg               +0.4%     +0.0%      0.01      0.01     +0.0%
tak                  +0.0%     +0.0%      0.02      0.02     +0.0%
transform            -0.0%     +0.0%     -4.5%     -4.5%     +0.0%
treejoin             -0.0%     +0.0%    +16.7%    +16.6%     +0.0%
typecheck            -0.0%     +0.0%     -1.4%     -1.3%     +0.0%
veritas              -0.1%     +0.0%      0.00      0.00     +0.0%
wang                 +0.2%     +0.0%      0.17      0.17     +0.0%
wave4main            +0.0%     +0.0%     +1.9%     +1.7%     +0.0%
wheel-sieve1         +0.0%     +0.0%     +1.5%     +1.5%     +0.0%
wheel-sieve2         +0.0%     +0.0%     +0.6%     +0.6%     +0.0%
x2n1                 +0.1%     +0.0%      0.01      0.01     +0.0%
--------------------------------------------------------------------------------
Min                  -0.6%     -0.7%     -9.0%     -9.0%     -5.3%
Max                  +0.5%     +1.3%    +16.7%    +16.6%     +0.0%
Geometric Mean       +0.1%     +0.0%     +1.6%     +1.6%     -0.1%
}}}
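As a reading aid for the `Geometric Mean` row, here is a small sketch of how a geometric mean of percentage changes can be computed. It assumes the summary row averages per-program ratios (new/old) rather than the raw percentages, and it does not try to reproduce the row above: which programs enter the mean (for example, whether rows showing absolute times such as `0.16` are skipped) is not stated here. The names `toRatio` and `geoMeanPct` are hypothetical, not part of any nofib tool.

```haskell
module Main where

-- Convert a percentage change such as +14.6 into a ratio such as 1.146.
toRatio :: Double -> Double
toRatio pct = 1 + pct / 100

-- Geometric mean of percentage changes, reported again as a percentage.
-- Computed as exp of the mean of the logs of the ratios.
-- Note: an empty list yields NaN, since the mean of no values is 0/0.
geoMeanPct :: [Double] -> Double
geoMeanPct pcts = (exp meanLog - 1) * 100
  where
    meanLog = sum (map (log . toRatio) pcts) / fromIntegral (length pcts)

main :: IO ()
main = do
  -- Three Runtime entries copied from the table: n-body, treejoin, cryptarithm1.
  let sample = [14.6, 16.7, -9.0]
  putStrLn ("geometric mean:  " ++ show (geoMeanPct sample) ++ "%")
  putStrLn ("arithmetic mean: " ++ show (sum sample / fromIntegral (length sample)) ++ "%")
```

The geometric mean treats a slowdown and the speedup that would undo it symmetrically, which is why it is the usual choice for summarising ratios; a few large outliers such as `n-body` and `treejoin` still pull it up noticeably.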
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/13397#comment:1>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler