[GHC] #8814: 7.8 optimizes attoparsec improperly
GHC
ghc-devs at haskell.org
Fri Feb 21 17:59:54 UTC 2014
#8814: 7.8 optimizes attoparsec improperly
--------------------------------------------+------------------------------
Reporter: joelteon | Owner:
Type: bug | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 7.8.1-rc1
Resolution: | Keywords:
Operating System: MacOS X | Architecture: x86_64 (amd64)
Type of failure: Runtime performance bug |
Test Case: | Difficulty: Unknown
Blocking: | Blocked By:
| Related Tickets:
--------------------------------------------+------------------------------
Comment (by simonpj):
I have not had any time to devote to this. I tried
{{{
ghc -O T8814.hs -ddump-simpl -o T8814
}}}
with and without `-fno-full-laziness`. Indeed I see the perf difference.
The Core from `-ddump-simpl` differs substantially between the two. Inside
`Main.$wa` you'll see a call to `runSTRep`, and the function to which
`runSTRep` is applied is where the two versions diverge:
* Without full laziness, it consists of a call to `newArray#` followed by
a couple of `memcpy` calls.
* With full laziness, it has a rather complicated local recursive
function that allocates a LOT of memory.
I have no idea why. I think it must be to do with optimisations being done
by RULES in the text library. If I add `-ddump-rule-firings` and grep for
`TEXT` in the rule names, I get
{{{
-- With full laziness
Rule fired: TEXT append -> fused
Rule fired: TEXT append -> fused
Rule fired: TEXT append -> fused
Rule fired: TEXT append -> fused
Rule fired: TEXT append -> unfused
Rule fired: TEXT tail -> unfused
Rule fired: TEXT tail -> unfused
-- Without full laziness
Rule fired: TEXT append -> fused
Rule fired: TEXT append -> fused
Rule fired: TEXT append -> fused
Rule fired: TEXT append -> fused
Rule fired: TEXT append -> unfused
Rule fired: TEXT append -> unfused
Rule fired: TEXT append -> unfused
Rule fired: TEXT tail -> unfused
Rule fired: TEXT tail -> unfused
Rule fired: TEXT append -> unfused
}}}
So there is clearly a difference. Should that difference have such a
massive performance impact? Ask the author of the text library! Why does
full laziness have this effect? Well, if you have
`(\x. map (f x) (map g ys))`, say, full laziness may float out the
`map g ys` (it does not mention `x`), and then the two `map` calls are no
longer adjacent, so the map/map fusion won't happen.
At this point I hope that someone else will take over debugging to find
out more.
Simon
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/8814#comment:8>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler