Fun with GHC's optimiser
Simon Peyton-Jones
simonpj@microsoft.com
Thu, 2 Nov 2000 04:46:41 -0800
I can never resist messages like these, even when I'm meant
to be doing other things. It's very helpful when people offer
fairly precise performance-bug reports. Thanks!
| I am wondering whether there is a particular reason why the
| optimiser doesn't pull the
|
| (1) a = NO_CCS PArray! [wild1 mba#];
This one is a definite bug. It turns out that the head of the
before-ghci-branch doesn't have this bug, so I'm disinclined
to investigate it further.
| (2) case w of wild3 {
| I# e# ->
|
| As for (2), the loop would be nice and straight if that
| unboxing were outside of the loop - as it is, we break the
| pipeline once per iteration, it seems
This one is a bit harder. Basically we want to make a wrapper
for a recursive function if it's sure to evaluate its free variables.
In fact the 'liberate-case' pass (which isn't switched on in 4.08)
is meant to do just this. It's in simplCore/LiberateCase.lhs,
and it's not very complicated. I've just tried it and it doesn't seem
to have the desired effect, but I'm sure that's for a boring reason.
If anyone would like to fix it, go ahead!
(You can't just say '-fliberate-case' on the command line to make
it go; you have to add -fliberate-case at a sensible point to the
minusOflags in driver/Main.hs.)
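
To make the intended transformation concrete, here is a minimal hand-written sketch of the before and after shapes. It is an illustration only, not the output of the liberate-case pass: the names sumBefore, sumAfter and go are hypothetical, and the example assumes a GHC recent enough to support the MagicHash extension and the GHC.Exts module. In the first version the loop scrutinises its free variable w on every iteration; in the second a wrapper unboxes w once and the loop receives the unboxed value as an argument.

```haskell
{-# LANGUAGE MagicHash #-}
module Main where

import GHC.Exts (Int (I#), Int#)

-- Before: `w` is a free variable of the local loop `go`, and every
-- iteration starts by evaluating and unboxing it again.
sumBefore :: Int -> Int -> Int
sumBefore w n = go n 0
  where
    go :: Int -> Int -> Int
    go k acc = case w of
      I# e# ->
        if k <= 0
          then acc
          else go (k - 1) (acc + I# e#)

-- After (written by hand): the wrapper unboxes `w` exactly once, and
-- the loop takes the unboxed value `e#` as an ordinary argument.
-- This is safe here because `go` above always evaluates `w` first.
sumAfter :: Int -> Int -> Int
sumAfter w n = case w of
  I# e# -> go e# n 0
  where
    go :: Int# -> Int -> Int -> Int
    go e# k acc =
      if k <= 0
        then acc
        else go e# (k - 1) (acc + I# e#)

main :: IO ()
main = do
  -- Both versions add `w` to the accumulator `n` times.
  print (sumBefore 3 4)
  print (sumAfter 3 4)
  print (sumBefore 3 4 == sumAfter 3 4)
```

The two functions agree on all inputs because the original loop is certain to evaluate w before doing anything else, so moving the case outside the loop does not change strictness. If the loop could finish without touching w, the rewrite would have to be more careful.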
Incidentally, you'll find that -ddump-simpl gives you a dump that
is pretty close to STG and usually much more readable. Most
performance bugs show up there. -dverbose-simpl gives you more
clues about what is happening where.
| Also if somebody is looking at the attached source, I was
| wondering why, when I use the commented out code in
| `newPArray', I get a lot worse code (the STG code is in a
| comment at the end of the file). In particular, the lambda
| abstraction is not inlined, whereas `fill' gets inlined into
| the code of which the dump is above. Is it because the
| compiler has a lot harder time with explicit recursion than
| with fold/build? If so, the right RULES magic should allow
| me to do the same for my own recursively defined
| combinators, shouldn't it?
I couldn't figure out exactly what you meant. The only commented
out code is STG code. Maybe send a module with the actual
source you are bothered about.
S