[GHC] #14737: Improve performance of Simplify.simplCast
GHC
ghc-devs at haskell.org
Tue Apr 3 10:59:57 UTC 2018
#14737: Improve performance of Simplify.simplCast
-------------------------------------+-------------------------------------
Reporter: tdammers | Owner: (none)
Type: bug | Status: patch
Priority: normal | Milestone:
Component: Compiler | Version: 8.2.2
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture: Unknown/Multiple
Type of failure: Compile-time performance bug | Test Case:
Blocked By: | Blocking:
Related Tickets: #11735 #14683 | Differential Rev(s): Phab:D4385
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by tdammers):
Replying to [comment:10 simonpj]:
> Try getting rid of the first equation for `pushCoTyArg`
> {{{
> pushCoTyArg co ty
> | tyL `eqType` tyR
> = Just (ty, mkRepReflCo (piResultTy tyR ty))
> }}}
> This is another big pile of type-equalities, rather like calling
> `isReflexiveCo` at the wrong moment.
>
> Claim: if it happens that `tyL` = `tyR`, but we go ahead with all that
> `mkCoherenceLeftCo` stuff anyway, then the coercion optimiser will get rid
> of it later. '''Richard''': will it?
>
> But try that change anyway. NO WAY should `pushCoTyArg` take 54% of
> compile time!
Simply removing that case branch saves another 4 seconds:
{{{
Tue Apr 3 11:09 2018 Time and Allocation Profiling Report  (Final)

ghc-stage2 +RTS -p -RTS -B/home/tobias/well-typed/devel/ghc/T14737/inplace/lib ./cases/Grammar.hs -o ./a -fforce-recomp

total time  = 7.86 secs (7864 ticks @ 1000 us, 1 processor)
total alloc = 10,150,661,432 bytes (excludes profiling overheads)

COST CENTRE      MODULE      SRC                                                 %time %alloc

mkInstCo         CoreOpt     compiler/coreSyn/CoreOpt.hs:982:33-84                31.7   40.6
tc_rn_src_decls  TcRnDriver  compiler/typecheck/TcRnDriver.hs:(494,4)-(556,7)     20.6   20.4
CoreTidy         HscMain     compiler/main/HscMain.hs:1253:27-67                   7.2    5.5
SimplTopBinds    SimplCore   compiler/simplCore/SimplCore.hs:770:39-74             6.6    4.6
simplCast        Simplify    compiler/simplCore/Simplify.hs:(1213,5)-(1215,37)     3.7    3.5
zonkTopDecls     TcRnDriver  compiler/typecheck/TcRnDriver.hs:(445,16)-(446,43)    3.5    3.1
deSugar          HscMain     compiler/main/HscMain.hs:511:7-44                     2.4    1.9
coercionKind     Coercion    compiler/types/Coercion.hs:1716:3-7                   1.9    4.6
isReflexiveCo    Simplify    compiler/simplCore/Simplify.hs:1260:40-55             1.8    1.4
Parser           HscMain     compiler/main/HscMain.hs:(316,5)-(384,20)             1.8    2.3
StgCmm           HscMain     compiler/main/HscMain.hs:(1428,13)-(1429,62)          1.6    0.7
}}}
I've added a few more SCCs to trace more deeply into `simplCast`, which
is why `simplCast` itself appears to have dropped to 3.7%. That figure is
misleading, because `mkInstCo` accounts for most of the rest of the
`simplCast` call.
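For readers unfamiliar with the mechanism, here is a minimal sketch of how a
manual SCC annotation splits the cost of an inner call out of its caller. The
functions `outer` and `expensiveStep` are hypothetical stand-ins, not GHC's
`simplCast` or `mkInstCo`; only the `{-# SCC #-}` pragma itself is real.

```haskell
module Main (main) where

-- Hypothetical stand-in for an expensive inner call.
-- The SCC pragma gives this expression its own cost centre, so its
-- time and allocation are attributed to "expensiveStep" in the profile.
expensiveStep :: Int -> Int
expensiveStep n = {-# SCC "expensiveStep" #-} sum [1 .. n]

-- Hypothetical stand-in for the caller. Once the inner call has its own
-- cost centre, "outer" is only charged for the work it does itself.
outer :: Int -> Int
outer n = {-# SCC "outer" #-} (expensiveStep n + 1)

main :: IO ()
main = print (outer 1000)  -- prints 500501
```

Built with `ghc -prof Main.hs` and run as `./Main +RTS -p`, this writes a
`Main.prof` report in which the two cost centres are listed separately, which
is the same effect that makes the outer function's individual share look
smaller than its inherited one.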
So I suggest committing the branch deletion (assuming that it won't break
anything).
From here, I'm not 100% sure which is more promising: digging into
`mkInstCo` to see if we can make it more efficient, or looking at
`simplCast` to see if we can make it call `mkInstCo` less often.
Also:
> Note that Phab:D4395 currently removes the piResultTy from that case,
> but it's quite possible that the eqType call is what's taking up the time.
The full profile from before the deletion (which, unfortunately, I no
longer have around) clearly shows that `eqType` is what consumes all that
time, not `piResultTy`.
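As a rough illustration of why an equality guard can be the expensive part,
here is a toy model. It is not GHC's `Type` or `eqType`; `Ty`, `eqTyCount` and
`chain` are hypothetical names. The point it sketches is that a structural
comparison has to walk both types all the way down exactly when they are
equal, so a guard that usually succeeds pays the full traversal each time.

```haskell
module Main (main) where

-- Toy type syntax; a stand-in for illustration, not GHC's Type.
data Ty
  = TyCon String
  | Fun Ty Ty
  deriving (Show)

-- Structural equality that also reports how many node pairs it inspected.
eqTyCount :: Ty -> Ty -> (Bool, Int)
eqTyCount (TyCon a) (TyCon b) = (a == b, 1)
eqTyCount (Fun a1 r1) (Fun a2 r2) =
  let (okA, nA) = eqTyCount a1 a2
  in if okA
       then let (okR, nR) = eqTyCount r1 r2 in (okR, 1 + nA + nR)
       else (False, 1 + nA)  -- stop early on the first mismatch
eqTyCount _ _ = (False, 1)

-- Build Int -> Int -> ... -> Int with n arrows.
chain :: Int -> Ty
chain 0 = TyCon "Int"
chain n = Fun (TyCon "Int") (chain (n - 1))

main :: IO ()
main = do
  print (eqTyCount (chain 1000) (chain 1000))    -- (True,2001)
  print (eqTyCount (chain 1000) (TyCon "Bool"))  -- (False,1)
```

In this model an unequal pair is usually rejected after a few nodes, while an
equal pair costs one visit per node, so the check is most expensive in exactly
the case where it succeeds.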
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/14737#comment:12>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler