[GHC] #8900: unordered-containers 16% slower in HEAD vs 7.6.3
GHC
ghc-devs at haskell.org
Sat Mar 15 17:19:07 UTC 2014
#8900: unordered-containers 16% slower in HEAD vs 7.6.3
--------------------------------------------+------------------------------
Reporter: tibbe | Owner:
Type: bug | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 7.9
Resolution: | Keywords:
Operating System: MacOS X | Architecture: x86_64 (amd64)
Type of failure: Runtime performance bug |
Test Case: | Difficulty: Unknown
Blocking: | Blocked By:
| Related Tickets:
--------------------------------------------+------------------------------
Description changed by tibbe:
Old description:
> I ran a simple benchmark that exercises
> [https://github.com/tibbe/unordered-containers/blob/master/Data/HashMap/Base.hs#L303 Data.HashMap.Lazy.insert].
> It's 16% slower using HEAD compared to using 7.6.3. The generated Core is
> the same, but the new codegen generates substantially different Cmm.
>
> '''Steps to reproduce'''
>
> 1. Download the attached `HashMapInsert.hs` benchmark.
> 2. Install unordered-containers with both 7.6.3 and HEAD:
>
> {{{
> $ cabal install -w ghc-7.6.3 unordered-containers-0.2.3.3
> $ cabal install -w inplace/bin/ghc-stage2 unordered-containers-0.2.3.3
> }}}
>
> 3. Compile the benchmark with both compilers:
>
> {{{
> $ ghc-7.6.3 -O2 HashMapInsert.hs
> $ mv HashMapInsert HashMapInsertOld
> $ inplace/bin/ghc-stage2 -O2 HashMapInsert.hs
> $ mv HashMapInsert HashMapInsertNew
> }}}
>
> '''Results (best of 3 runs)'''
>
> 7.6.3
>
> {{{
> $ ./HashMapInsertOld +RTS -s
> 1,191,223,528 bytes allocated in the heap
> 141,978,520 bytes copied during GC
> 37,811,840 bytes maximum residency (8 sample(s))
> 22,378,432 bytes maximum slop
> 99 MB total memory in use (0 MB lost due to fragmentation)
>
>                                     Tot time (elapsed)  Avg pause  Max pause
>   Gen  0      2277 colls,     0 par    0.06s    0.06s     0.0000s    0.0002s
>   Gen  1         8 colls,     0 par    0.07s    0.10s     0.0127s    0.0479s
>
> INIT time 0.00s ( 0.00s elapsed)
> MUT time 0.24s ( 0.24s elapsed)
> GC time 0.13s ( 0.17s elapsed)
> EXIT time 0.00s ( 0.01s elapsed)
> Total time 0.37s ( 0.41s elapsed)
>
> %GC time 34.8% (40.3% elapsed)
>
> Alloc rate 4,923,204,681 bytes per MUT second
>
> Productivity 65.2% of total user, 59.0% of total elapsed
> }}}
>
> HEAD:
>
> {{{
> $ ./HashMapInsertNew +RTS -s
> 1,191,223,128 bytes allocated in the heap
> 231,158,688 bytes copied during GC
> 55,533,064 bytes maximum residency (13 sample(s))
> 22,378,488 bytes maximum slop
> 144 MB total memory in use (0 MB lost due to fragmentation)
>
>                                     Tot time (elapsed)  Avg pause  Max pause
>   Gen  0      2268 colls,     0 par    0.06s    0.07s     0.0000s    0.0003s
>   Gen  1        13 colls,     0 par    0.12s    0.16s     0.0127s    0.0468s
>
> INIT time 0.00s ( 0.00s elapsed)
> MUT time 0.25s ( 0.25s elapsed)
> GC time 0.18s ( 0.23s elapsed)
> EXIT time 0.00s ( 0.01s elapsed)
> Total time 0.43s ( 0.49s elapsed)
>
> %GC time 41.6% (47.5% elapsed)
>
> Alloc rate 4,738,791,249 bytes per MUT second
>
> Productivity 58.3% of total user, 51.9% of total elapsed
> }}}
>
> (Note that this is without the patches in #8885, so they're not the
> cause.)
>
> An interesting difference is that we spend more time in GC in HEAD. I
> don't know if that's related.
New description:
I ran a simple benchmark that exercises
[https://github.com/tibbe/unordered-containers/blob/master/Data/HashMap/Base.hs#L303 Data.HashMap.Lazy.insert].
It's 16% slower using HEAD compared to using 7.6.3. The generated Core is
a bit different and the generated Cmm is quite a bit different.
'''Steps to reproduce'''
1. Download the attached `HashMapInsert.hs` benchmark.
2. Install unordered-containers with both 7.6.3 and HEAD:
{{{
$ cabal install -w ghc-7.6.3 unordered-containers-0.2.3.3
$ cabal install -w inplace/bin/ghc-stage2 unordered-containers-0.2.3.3
}}}
3. Compile the benchmark with both compilers:
{{{
$ ghc-7.6.3 -O2 HashMapInsert.hs
$ mv HashMapInsert HashMapInsertOld
$ inplace/bin/ghc-stage2 -O2 HashMapInsert.hs
$ mv HashMapInsert HashMapInsertNew
}}}
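One way the best-of-3 figures could be collected is sketched here. It assumes the user "Total time" line of the `+RTS -s` summary is the number being compared; `total_time` and `best_of_3` are hypothetical helper names, not part of the attached benchmark.

```shell
# Extract the user "Total time" in seconds from `+RTS -s` output on stdin.
total_time() {
  awk '/Total +time/ { sub(/s$/, "", $3); print $3 }'
}

# Run the given program three times with `+RTS -s` and print the smallest
# Total time. The RTS summary goes to stderr, so stderr is piped to the
# parser while the program's own stdout is discarded.
best_of_3() {
  for i in 1 2 3; do
    "$@" +RTS -s 2>&1 >/dev/null | total_time
  done | sort -n | head -n 1
}
```

With the binaries built above, this would be invoked as `best_of_3 ./HashMapInsertOld` and `best_of_3 ./HashMapInsertNew`.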
'''Results (best of 3 runs)'''
7.6.3
{{{
$ ./HashMapInsertOld +RTS -s
1,191,223,528 bytes allocated in the heap
141,978,520 bytes copied during GC
37,811,840 bytes maximum residency (8 sample(s))
22,378,432 bytes maximum slop
99 MB total memory in use (0 MB lost due to fragmentation)
                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0      2277 colls,     0 par    0.06s    0.06s     0.0000s    0.0002s
  Gen  1         8 colls,     0 par    0.07s    0.10s     0.0127s    0.0479s
INIT time 0.00s ( 0.00s elapsed)
MUT time 0.24s ( 0.24s elapsed)
GC time 0.13s ( 0.17s elapsed)
EXIT time 0.00s ( 0.01s elapsed)
Total time 0.37s ( 0.41s elapsed)
%GC time 34.8% (40.3% elapsed)
Alloc rate 4,923,204,681 bytes per MUT second
Productivity 65.2% of total user, 59.0% of total elapsed
}}}
HEAD:
{{{
$ ./HashMapInsertNew +RTS -s
1,191,223,128 bytes allocated in the heap
231,158,688 bytes copied during GC
55,533,064 bytes maximum residency (13 sample(s))
22,378,488 bytes maximum slop
144 MB total memory in use (0 MB lost due to fragmentation)
                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0      2268 colls,     0 par    0.06s    0.07s     0.0000s    0.0003s
  Gen  1        13 colls,     0 par    0.12s    0.16s     0.0127s    0.0468s
INIT time 0.00s ( 0.00s elapsed)
MUT time 0.25s ( 0.25s elapsed)
GC time 0.18s ( 0.23s elapsed)
EXIT time 0.00s ( 0.01s elapsed)
Total time 0.43s ( 0.49s elapsed)
%GC time 41.6% (47.5% elapsed)
Alloc rate 4,738,791,249 bytes per MUT second
Productivity 58.3% of total user, 51.9% of total elapsed
}}}
(Note that this is without the patches in #8885, so they're not the
cause.)
An interesting difference is that we spend more time in GC in HEAD. I
don't know if that's related.
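For reference, the relative differences between the two runs can be worked out from the figures above. This is only a sketch of the arithmetic; `pct_increase` is a hypothetical helper and the inputs are copied from the two `+RTS -s` summaries.

```shell
# Print by how many percent `new` ($2) exceeds `old` ($1), to one decimal.
pct_increase() {
  awk -v old="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (new / old - 1) * 100 }'
}

pct_increase 0.37 0.43            # Total time (user)
pct_increase 0.24 0.25            # MUT time
pct_increase 0.13 0.18            # GC time
pct_increase 141978520 231158688  # bytes copied during GC
pct_increase 37811840 55533064    # maximum residency
```

The first line comes out at about 16%, which is the slowdown in the ticket title. Of the 0.06s difference in total time, 0.05s is GC time and 0.01s is MUT time.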
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/8900#comment:5>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler