[GHC] #13701: GHCi 2x slower without -keep-tmp-files
GHC
ghc-devs at haskell.org
Mon May 15 12:15:47 UTC 2017
#13701: GHCi 2x slower without -keep-tmp-files
-------------------------------------+-------------------------------------
Reporter: niteria | Owner: (none)
Type: task | Status: new
Priority: normal | Milestone:
Component: GHCi | Version: 8.3
Keywords: | Operating System: Unknown/Multiple
Architecture: Unknown/Multiple | Type of failure: Compile-time performance bug
Test Case: | Blocked By:
Blocking: | Related Tickets:
Differential Rev(s): | Wiki Page:
-------------------------------------+-------------------------------------
In D3562, I've observed that -keep-tmp-files makes :load 3x faster on my
test case.
I can't share my test case, but I've found a way to approximate it with
`MultiLayerModules` I just added in D3575.
Here are the steps:
{{{
# in ghc top dir
$ mkdir tmp
$ cd tmp
$ cp ../testsuite/tests/perf/compiler/genMultiLayerModules .
# edit genMultiLayerModules to say DEPTH=0, WIDTH=5000
$ ./genMultiLayerModules
$ echo ':load MultiLayerModules' | ../inplace/bin/ghc-stage2 --interactive +RTS -s
11,132,224,952 bytes allocated in the heap
1,004,238,408 bytes copied during GC
185,091,216 bytes maximum residency (14 sample(s))
2,813,504 bytes maximum slop
365 MB total memory in use (0 MB lost due to fragmentation)
                          Tot time (elapsed)  Avg pause  Max pause
Gen 0   706 colls, 0 par    0.907s   0.906s    0.0013s    0.0125s
Gen 1    14 colls, 0 par    0.607s   0.606s    0.0433s    0.2244s
TASKS: 4 (1 bound, 3 peak workers (3 total), using -N1)
SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)
INIT time 0.001s ( 0.000s elapsed)
MUT time 20.219s ( 20.493s elapsed)
GC time 1.514s ( 1.513s elapsed)
EXIT time 0.000s ( 0.005s elapsed)
Total time 21.733s ( 22.010s elapsed)
Alloc rate 550,585,275 bytes per MUT second
Productivity 93.0% of total user, 93.1% of total elapsed
$ echo ':load MultiLayerModules' | ../inplace/bin/ghc-stage2 --interactive -keep-tmp-files +RTS -s
4,603,831,672 bytes allocated in the heap
971,623,904 bytes copied during GC
184,019,808 bytes maximum residency (14 sample(s))
2,262,680 bytes maximum slop
365 MB total memory in use (0 MB lost due to fragmentation)
                          Tot time (elapsed)  Avg pause  Max pause
Gen 0   448 colls, 0 par    0.724s   0.723s    0.0016s    0.0321s
Gen 1    14 colls, 0 par    0.621s   0.620s    0.0443s    0.2242s
TASKS: 4 (1 bound, 3 peak workers (3 total), using -N1)
SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)
INIT time 0.001s ( 0.000s elapsed)
MUT time 7.966s ( 8.202s elapsed)
GC time 1.345s ( 1.344s elapsed)
EXIT time 0.000s ( 0.004s elapsed)
Total time 9.312s ( 9.550s elapsed)
Alloc rate 577,938,762 bytes per MUT second
Productivity 85.5% of total user, 85.9% of total elapsed
}}}
So without -keep-tmp-files, :load is about 2.3x slower (21.7s vs. 9.3s
total) and allocates about 2.4x more (11.1 GB vs. 4.6 GB).
Profiling pointed to
https://phabricator.haskell.org/diffusion/GHC/browse/master/compiler/main/SysTools.hs;8bf50d5026f92eb5a6768eb2ac38479802da1411$1074
We're creating `dont_delete_set` a lot.
Looks like this was improved in D3111 recently.
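To illustrate why building `dont_delete_set` repeatedly could be expensive, here is a minimal sketch. It is not the code from SysTools.hs: it assumes the set is built from a list of files to keep each time a cleanup runs, and every name other than `dont_delete_set` (written `dontDeleteSet` below) is hypothetical.

```haskell
module Main (main) where

import Data.List (partition)
import qualified Data.Set as Set

-- Hypothetical variant 1: the keep-list is converted to a Set on every
-- call. With k files to keep this costs O(k log k) per call, even when
-- there are only a handful of temporary files to inspect.
cleanExceptRebuild :: [FilePath] -> [FilePath] -> ([FilePath], [FilePath])
cleanExceptRebuild dontDelete files =
  partition (`Set.member` dontDeleteSet) files
  where
    dontDeleteSet = Set.fromList dontDelete

-- Hypothetical variant 2: the caller builds the Set once and passes it
-- in, so each call only pays for the membership tests.
cleanExceptShared :: Set.Set FilePath -> [FilePath] -> ([FilePath], [FilePath])
cleanExceptShared keep = partition (`Set.member` keep)

main :: IO ()
main = do
  -- 5000 kept files, mirroring WIDTH=5000 in the reproduction steps.
  let keepList = ["M" ++ show i ++ ".o" | i <- [1 .. 5000 :: Int]]
      keepSet  = Set.fromList keepList
      files    = ["M1.o", "tmp1.s", "M42.o", "tmp2.s"]
  -- Both variants split files into (kept, to delete) identically.
  print (cleanExceptRebuild keepList files)
  print (cleanExceptShared keepSet files)
  -- Each line prints: (["M1.o","M42.o"],["tmp1.s","tmp2.s"])
```

If a cleanup like the first variant ran once per module, the total work would grow roughly quadratically in the number of modules. That would be consistent with the extra allocation seen above, though this sketch does not establish that it is the actual cause.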
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/13701>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler