[Haskell-cafe] Unbelievable parallel speedup
John D. Ramsdell
ramsdell0 at gmail.com
Fri Jun 3 14:10:55 CEST 2011
I've enjoyed reading Simon Marlow's new tutorial on parallel and
concurrent programming, and learned some surprisingly basic tricks. I
didn't know about the '-s' runtime option for printing statistics. I
decided to compute speedups for a program I wrote just as Simon did,
after running the program on an unloaded machine with four processors.
When I did, I found the speedup on two processors was 2.4, on three
it was 3.2, and on four it was 4.4! Am I living in a dream world?
I ran the test nine more times, and here is a table of the speedups.
2.35975 3.42595 4.39351
1.57458 2.18623 2.94045
1.83232 2.77858 3.41629
1.58011 2.37084 2.94913
2.36678 3.63694 4.42066
1.58199 2.29053 2.95165
1.57656 2.34844 2.94683
1.58143 2.3242 2.95098
2.36703 3.36802 4.41918
1.58341 2.30123 2.93933
That last line looks pretty reasonable to me, and is what I expected.
Let's look at a table of the elapse times.
415.67 176.15 121.33 94.61
277.52 176.25 126.94 94.38
321.37 175.39 115.66 94.07
277.72 175.76 117.14 94.17
415.63 175.61 114.28 94.02
277.75 175.57 121.26 94.10
277.68 176.13 118.24 94.23
277.51 175.48 119.40 94.04
415.58 175.57 123.39 94.04
277.62 175.33 120.64 94.45
Notice that the elapse times for two and four processors is pretty
consistent, and the one for three processors is a little inconsistent,
but the times for the single processor case are all over the map. Can
anyone explain all this variance?
I have enclosed the raw output from the runs and the script that was
run ten times to produce the output.
John
-------------- next part --------------
The Glorious Glasgow Haskell Compilation System, version 6.12.1
CPSA 2.2.4
Tue May 31 10:32:43 EDT 2011
cpsa -o /dev/null kelly64.lisp +RTS -s -N1
125,661,010,200 bytes allocated in the heap
11,585,760,256 bytes copied during GC
40,937,760 bytes maximum residency (170 sample(s))
1,003,736 bytes maximum slop
121 MB total memory in use (2 MB lost due to fragmentation)
Generation 0: 239519 collections, 0 parallel, 23.45s, 24.11s elapsed
Generation 1: 170 collections, 0 parallel, 11.05s, 11.04s elapsed
Parallel GC work balance: -nan (0 / 0, ideal 1)
MUT time (elapsed) GC time (elapsed)
Task 0 (worker) : 0.00s (380.48s) 0.00s ( 0.00s)
Task 1 (worker) : 0.00s (380.52s) 0.00s ( 0.00s)
Task 2 (worker) : 381.17s (380.52s) 34.50s ( 35.15s)
SPARKS: 1991 (0 converted, 1991 pruned)
INIT time 0.00s ( 0.00s elapsed)
MUT time 380.54s (380.52s elapsed)
GC time 34.50s ( 35.15s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 415.04s (415.67s elapsed)
%GC time 8.3% (8.5% elapsed)
Alloc rate 330,217,612 bytes per MUT second
Productivity 91.7% of total user, 91.5% of total elapsed
gc_alloc_block_sync: 0
whitehole_spin: 0
gen[0].steps[0].sync_large_objects: 0
gen[0].steps[1].sync_large_objects: 0
gen[1].steps[0].sync_large_objects: 0
cpsa -o /dev/null kelly64.lisp +RTS -s -N2
125,667,495,008 bytes allocated in the heap
12,255,885,312 bytes copied during GC
41,007,904 bytes maximum residency (175 sample(s))
1,376,952 bytes maximum slop
122 MB total memory in use (2 MB lost due to fragmentation)
Generation 0: 154004 collections, 154003 parallel, 28.81s, 15.99s elapsed
Generation 1: 175 collections, 175 parallel, 12.76s, 6.45s elapsed
Parallel GC work balance: 1.47 (1530010350 / 1040623855, ideal 2)
MUT time (elapsed) GC time (elapsed)
Task 0 (worker) : 160.52s (153.65s) 8.96s ( 4.72s)
Task 1 (worker) : 141.32s (153.71s) 30.85s ( 16.28s)
Task 2 (worker) : 7.42s (153.71s) 1.76s ( 1.44s)
Task 3 (worker) : 0.00s (153.71s) 0.00s ( 0.00s)
SPARKS: 1991 (1954 converted, 34 pruned)
INIT time 0.01s ( 0.00s elapsed)
MUT time 308.73s (153.71s elapsed)
GC time 41.57s ( 22.44s elapsed)
EXIT time 0.01s ( 0.00s elapsed)
Total time 350.32s (176.15s elapsed)
%GC time 11.9% (12.7% elapsed)
Alloc rate 407,020,226 bytes per MUT second
Productivity 88.1% of total user, 175.3% of total elapsed
gc_alloc_block_sync: 882643
whitehole_spin: 0
gen[0].steps[0].sync_large_objects: 0
gen[0].steps[1].sync_large_objects: 0
gen[1].steps[0].sync_large_objects: 116
cpsa -o /dev/null kelly64.lisp +RTS -s -N3
125,669,549,984 bytes allocated in the heap
12,389,102,416 bytes copied during GC
41,244,464 bytes maximum residency (187 sample(s))
1,492,544 bytes maximum slop
123 MB total memory in use (2 MB lost due to fragmentation)
Generation 0: 113548 collections, 113547 parallel, 34.73s, 13.01s elapsed
Generation 1: 187 collections, 187 parallel, 15.38s, 5.27s elapsed
Parallel GC work balance: 1.90 (1546573583 / 813008149, ideal 3)
MUT time (elapsed) GC time (elapsed)
Task 0 (worker) : 106.00s (103.03s) 8.89s ( 3.12s)
Task 1 (worker) : 92.45s (103.06s) 25.65s ( 9.26s)
Task 2 (worker) : 7.11s (103.06s) 3.61s ( 1.55s)
Task 3 (worker) : 0.00s (103.06s) 0.00s ( 0.00s)
Task 4 (worker) : 105.18s (103.06s) 11.96s ( 4.35s)
SPARKS: 1991 (1954 converted, 36 pruned)
INIT time 0.01s ( 0.00s elapsed)
MUT time 310.06s (103.06s elapsed)
GC time 50.11s ( 18.28s elapsed)
EXIT time 0.01s ( 0.00s elapsed)
Total time 360.19s (121.33s elapsed)
%GC time 13.9% (15.1% elapsed)
Alloc rate 405,281,056 bytes per MUT second
Productivity 86.1% of total user, 255.6% of total elapsed
gc_alloc_block_sync: 1790477
whitehole_spin: 0
gen[0].steps[0].sync_large_objects: 18
gen[0].steps[1].sync_large_objects: 16
gen[1].steps[0].sync_large_objects: 1427
cpsa -o /dev/null kelly64.lisp +RTS -s -N4
125,670,260,336 bytes allocated in the heap
12,483,982,000 bytes copied during GC
41,210,864 bytes maximum residency (193 sample(s))
1,639,208 bytes maximum slop
124 MB total memory in use (2 MB lost due to fragmentation)
Generation 0: 92336 collections, 92335 parallel, 40.90s, 11.55s elapsed
Generation 1: 193 collections, 193 parallel, 18.84s, 4.79s elapsed
Parallel GC work balance: 2.31 (1558312294 / 675211109, ideal 4)
MUT time (elapsed) GC time (elapsed)
Task 0 (worker) : 80.51s ( 77.93s) 6.17s ( 1.73s)
Task 1 (worker) : 68.11s ( 78.27s) 22.41s ( 5.91s)
Task 2 (worker) : 5.66s ( 78.27s) 4.45s ( 1.56s)
Task 3 (worker) : 0.00s ( 78.27s) 0.00s ( 0.00s)
Task 4 (worker) : 88.57s ( 78.27s) 4.77s ( 1.28s)
Task 5 (worker) : 69.51s ( 78.27s) 21.94s ( 5.84s)
SPARKS: 1991 (1953 converted, 38 pruned)
INIT time 0.01s ( 0.00s elapsed)
MUT time 311.87s ( 78.27s elapsed)
GC time 59.74s ( 16.34s elapsed)
EXIT time 0.01s ( 0.00s elapsed)
Total time 371.63s ( 94.61s elapsed)
%GC time 16.1% (17.3% elapsed)
Alloc rate 402,931,355 bytes per MUT second
Productivity 83.9% of total user, 329.7% of total elapsed
gc_alloc_block_sync: 2872016
whitehole_spin: 0
gen[0].steps[0].sync_large_objects: 69
gen[0].steps[1].sync_large_objects: 78
gen[1].steps[0].sync_large_objects: 2399
Tue May 31 10:53:06 EDT 2011
----------------------------------------------
The Glorious Glasgow Haskell Compilation System, version 6.12.1
CPSA 2.2.4
Thu Jun 2 20:55:11 EDT 2011
cpsa -o /dev/null kelly64.lisp +RTS -s -N1
125,661,010,200 bytes allocated in the heap
11,585,761,480 bytes copied during GC
40,937,760 bytes maximum residency (170 sample(s))
1,003,736 bytes maximum slop
121 MB total memory in use (2 MB lost due to fragmentation)
Generation 0: 239519 collections, 0 parallel, 15.91s, 16.03s elapsed
Generation 1: 170 collections, 0 parallel, 7.86s, 7.79s elapsed
Parallel GC work balance: -nan (0 / 0, ideal 1)
MUT time (elapsed) GC time (elapsed)
Task 0 (worker) : 0.00s (253.67s) 0.00s ( 0.00s)
Task 1 (worker) : 0.00s (253.70s) 0.00s ( 0.00s)
Task 2 (worker) : 253.75s (253.70s) 23.77s ( 23.82s)
SPARKS: 1991 (0 converted, 1991 pruned)
INIT time 0.00s ( 0.00s elapsed)
MUT time 253.35s (253.70s elapsed)
GC time 23.77s ( 23.82s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 277.12s (277.52s elapsed)
%GC time 8.6% (8.6% elapsed)
Alloc rate 495,997,671 bytes per MUT second
Productivity 91.4% of total user, 91.3% of total elapsed
gc_alloc_block_sync: 0
whitehole_spin: 0
gen[0].steps[0].sync_large_objects: 0
gen[0].steps[1].sync_large_objects: 0
gen[1].steps[0].sync_large_objects: 0
cpsa -o /dev/null kelly64.lisp +RTS -s -N2
125,667,655,624 bytes allocated in the heap
12,252,025,136 bytes copied during GC
41,117,352 bytes maximum residency (194 sample(s))
1,400,112 bytes maximum slop
122 MB total memory in use (2 MB lost due to fragmentation)
Generation 0: 154076 collections, 154075 parallel, 27.98s, 15.75s elapsed
Generation 1: 194 collections, 194 parallel, 12.91s, 6.52s elapsed
Parallel GC work balance: 1.47 (1529554452 / 1041065277, ideal 2)
MUT time (elapsed) GC time (elapsed)
Task 0 (worker) : 161.97s (153.97s) 10.85s ( 5.72s)
Task 1 (worker) : 141.59s (153.98s) 28.17s ( 15.08s)
Task 2 (worker) : 7.03s (153.98s) 1.87s ( 1.48s)
Task 3 (worker) : 0.00s (153.98s) 0.00s ( 0.00s)
SPARKS: 1991 (1953 converted, 38 pruned)
INIT time 0.01s ( 0.00s elapsed)
MUT time 310.10s (153.98s elapsed)
GC time 40.89s ( 22.27s elapsed)
EXIT time 0.01s ( 0.00s elapsed)
Total time 351.01s (176.25s elapsed)
%GC time 11.6% (12.6% elapsed)
Alloc rate 405,222,673 bytes per MUT second
Productivity 88.3% of total user, 175.9% of total elapsed
gc_alloc_block_sync: 855244
whitehole_spin: 0
gen[0].steps[0].sync_large_objects: 5
gen[0].steps[1].sync_large_objects: 0
gen[1].steps[0].sync_large_objects: 229
cpsa -o /dev/null kelly64.lisp +RTS -s -N3
125,669,907,832 bytes allocated in the heap
12,437,872,000 bytes copied during GC
41,026,336 bytes maximum residency (176 sample(s))
1,471,616 bytes maximum slop
123 MB total memory in use (2 MB lost due to fragmentation)
Generation 0: 115627 collections, 115626 parallel, 37.79s, 13.53s elapsed
Generation 1: 176 collections, 176 parallel, 16.43s, 5.55s elapsed
Parallel GC work balance: 1.86 (1552672384 / 836913872, ideal 3)
MUT time (elapsed) GC time (elapsed)
Task 0 (worker) : 91.48s (107.85s) 28.23s ( 9.62s)
Task 1 (worker) : 108.30s (107.86s) 15.85s ( 5.46s)
Task 2 (worker) : 5.56s (107.86s) 2.92s ( 1.37s)
Task 3 (worker) : 0.00s (107.86s) 0.00s ( 0.00s)
Task 4 (worker) : 117.93s (107.86s) 7.22s ( 2.63s)
SPARKS: 1991 (1954 converted, 37 pruned)
INIT time 0.01s ( 0.00s elapsed)
MUT time 322.79s (107.86s elapsed)
GC time 54.22s ( 19.08s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 377.02s (126.94s elapsed)
%GC time 14.4% (15.0% elapsed)
Alloc rate 389,311,982 bytes per MUT second
Productivity 85.6% of total user, 254.3% of total elapsed
gc_alloc_block_sync: 1587630
whitehole_spin: 1
gen[0].steps[0].sync_large_objects: 34
gen[0].steps[1].sync_large_objects: 39
gen[1].steps[0].sync_large_objects: 1360
cpsa -o /dev/null kelly64.lisp +RTS -s -N4
125,670,497,704 bytes allocated in the heap
12,503,248,968 bytes copied during GC
41,344,696 bytes maximum residency (182 sample(s))
1,636,464 bytes maximum slop
124 MB total memory in use (2 MB lost due to fragmentation)
Generation 0: 92471 collections, 92470 parallel, 41.24s, 11.34s elapsed
Generation 1: 182 collections, 182 parallel, 19.43s, 4.92s elapsed
Parallel GC work balance: 2.29 (1560718711 / 680490474, ideal 4)
MUT time (elapsed) GC time (elapsed)
Task 0 (worker) : 70.32s ( 78.10s) 21.45s ( 5.67s)
Task 1 (worker) : 79.49s ( 78.12s) 7.83s ( 2.00s)
Task 2 (worker) : 5.33s ( 78.12s) 4.08s ( 1.35s)
Task 3 (worker) : 0.00s ( 78.12s) 0.00s ( 0.00s)
Task 4 (worker) : 71.13s ( 78.12s) 21.36s ( 5.71s)
Task 5 (worker) : 84.89s ( 78.12s) 5.95s ( 1.52s)
SPARKS: 1991 (1955 converted, 36 pruned)
INIT time 0.01s ( 0.00s elapsed)
MUT time 310.59s ( 78.12s elapsed)
GC time 60.67s ( 16.26s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 371.27s ( 94.38s elapsed)
%GC time 16.3% (17.2% elapsed)
Alloc rate 404,605,594 bytes per MUT second
Productivity 83.7% of total user, 329.1% of total elapsed
gc_alloc_block_sync: 2787413
whitehole_spin: 1
gen[0].steps[0].sync_large_objects: 120
gen[0].steps[1].sync_large_objects: 188
gen[1].steps[0].sync_large_objects: 2182
Thu Jun 2 21:11:04 EDT 2011
----------------------------------------------
The Glorious Glasgow Haskell Compilation System, version 6.12.1
CPSA 2.2.4
Thu Jun 2 21:11:04 EDT 2011
cpsa -o /dev/null kelly64.lisp +RTS -s -N1
125,661,010,200 bytes allocated in the heap
11,585,761,896 bytes copied during GC
40,937,760 bytes maximum residency (170 sample(s))
1,003,736 bytes maximum slop
121 MB total memory in use (2 MB lost due to fragmentation)
Generation 0: 239519 collections, 0 parallel, 18.65s, 18.87s elapsed
Generation 1: 170 collections, 0 parallel, 8.86s, 8.97s elapsed
Parallel GC work balance: -nan (0 / 0, ideal 1)
MUT time (elapsed) GC time (elapsed)
Task 0 (worker) : 0.00s (293.48s) 0.00s ( 0.00s)
Task 1 (worker) : 0.00s (293.52s) 0.00s ( 0.00s)
Task 2 (worker) : 293.85s (293.52s) 27.51s ( 27.84s)
SPARKS: 1991 (0 converted, 1991 pruned)
INIT time 0.00s ( 0.00s elapsed)
MUT time 293.34s (293.52s elapsed)
GC time 27.51s ( 27.84s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 320.85s (321.37s elapsed)
%GC time 8.6% (8.7% elapsed)
Alloc rate 428,380,071 bytes per MUT second
Productivity 91.4% of total user, 91.3% of total elapsed
gc_alloc_block_sync: 0
whitehole_spin: 0
gen[0].steps[0].sync_large_objects: 0
gen[0].steps[1].sync_large_objects: 0
gen[1].steps[0].sync_large_objects: 0
cpsa -o /dev/null kelly64.lisp +RTS -s -N2
125,667,234,200 bytes allocated in the heap
12,255,926,040 bytes copied during GC
41,096,080 bytes maximum residency (175 sample(s))
1,354,936 bytes maximum slop
122 MB total memory in use (2 MB lost due to fragmentation)
Generation 0: 154354 collections, 154353 parallel, 28.81s, 15.76s elapsed
Generation 1: 175 collections, 175 parallel, 12.51s, 6.40s elapsed
Parallel GC work balance: 1.47 (1530007390 / 1038769111, ideal 2)
MUT time (elapsed) GC time (elapsed)
Task 0 (worker) : 158.79s (153.20s) 9.04s ( 4.71s)
Task 1 (worker) : 143.46s (153.23s) 28.80s ( 15.97s)
Task 2 (worker) : 5.84s (153.23s) 3.48s ( 1.48s)
Task 3 (worker) : 0.00s (153.23s) 0.00s ( 0.00s)
SPARKS: 1991 (1953 converted, 37 pruned)
INIT time 0.01s ( 0.00s elapsed)
MUT time 307.55s (153.23s elapsed)
GC time 41.32s ( 22.16s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 348.88s (175.39s elapsed)
%GC time 11.8% (12.6% elapsed)
Alloc rate 408,594,206 bytes per MUT second
Productivity 88.2% of total user, 175.4% of total elapsed
gc_alloc_block_sync: 879111
whitehole_spin: 0
gen[0].steps[0].sync_large_objects: 0
gen[0].steps[1].sync_large_objects: 0
gen[1].steps[0].sync_large_objects: 264
cpsa -o /dev/null kelly64.lisp +RTS -s -N3
125,670,613,312 bytes allocated in the heap
12,375,709,544 bytes copied during GC
41,096,136 bytes maximum residency (182 sample(s))
1,546,872 bytes maximum slop
123 MB total memory in use (2 MB lost due to fragmentation)
Generation 0: 113087 collections, 113086 parallel, 32.86s, 12.22s elapsed
Generation 1: 182 collections, 182 parallel, 14.96s, 5.11s elapsed
Parallel GC work balance: 1.90 (1544893913 / 811201917, ideal 3)
MUT time (elapsed) GC time (elapsed)
Task 0 (worker) : 103.07s ( 98.29s) 5.36s ( 2.12s)
Task 1 (worker) : 93.40s ( 98.32s) 18.51s ( 6.80s)
Task 2 (worker) : 5.32s ( 98.32s) 4.45s ( 1.53s)
Task 3 (worker) : 0.00s ( 98.32s) 0.00s ( 0.00s)
Task 4 (worker) : 94.19s ( 98.32s) 19.50s ( 6.88s)
SPARKS: 1991 (1954 converted, 34 pruned)
INIT time 0.01s ( 0.00s elapsed)
MUT time 295.39s ( 98.32s elapsed)
GC time 47.82s ( 17.34s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 343.22s (115.66s elapsed)
%GC time 13.9% (15.0% elapsed)
Alloc rate 425,425,231 bytes per MUT second
Productivity 86.1% of total user, 255.4% of total elapsed
gc_alloc_block_sync: 1921908
whitehole_spin: 2
gen[0].steps[0].sync_large_objects: 32
gen[0].steps[1].sync_large_objects: 11
gen[1].steps[0].sync_large_objects: 385
cpsa -o /dev/null kelly64.lisp +RTS -s -N4
125,669,557,120 bytes allocated in the heap
12,525,784,632 bytes copied during GC
41,302,400 bytes maximum residency (183 sample(s))
1,625,272 bytes maximum slop
124 MB total memory in use (2 MB lost due to fragmentation)
Generation 0: 93045 collections, 93044 parallel, 42.78s, 11.10s elapsed
Generation 1: 183 collections, 183 parallel, 19.99s, 5.08s elapsed
Parallel GC work balance: 2.23 (1563569121 / 701387257, ideal 4)
MUT time (elapsed) GC time (elapsed)
Task 0 (worker) : 66.18s ( 77.85s) 21.71s ( 5.57s)
Task 1 (worker) : 68.99s ( 77.88s) 22.59s ( 5.87s)
Task 2 (worker) : 2.71s ( 77.88s) 5.45s ( 1.31s)
Task 3 (worker) : 0.00s ( 77.88s) 0.00s ( 0.00s)
Task 4 (worker) : 86.10s ( 77.88s) 6.39s ( 1.61s)
Task 5 (worker) : 84.28s ( 77.88s) 6.63s ( 1.82s)
SPARKS: 1991 (1953 converted, 37 pruned)
INIT time 0.01s ( 0.00s elapsed)
MUT time 307.84s ( 77.88s elapsed)
GC time 62.77s ( 16.18s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 370.62s ( 94.07s elapsed)
%GC time 16.9% (17.2% elapsed)
Alloc rate 408,216,849 bytes per MUT second
Productivity 83.1% of total user, 327.3% of total elapsed
gc_alloc_block_sync: 2735957
whitehole_spin: 2
gen[0].steps[0].sync_large_objects: 134
gen[0].steps[1].sync_large_objects: 145
gen[1].steps[0].sync_large_objects: 2501
Thu Jun 2 21:27:59 EDT 2011
----------------------------------------------
The Glorious Glasgow Haskell Compilation System, version 6.12.1
CPSA 2.2.4
Thu Jun 2 21:57:10 EDT 2011
cpsa -o /dev/null kelly64.lisp +RTS -s -N1
125,661,010,200 bytes allocated in the heap
11,585,761,624 bytes copied during GC
40,937,760 bytes maximum residency (170 sample(s))
1,003,736 bytes maximum slop
121 MB total memory in use (2 MB lost due to fragmentation)
Generation 0: 239519 collections, 0 parallel, 16.07s, 16.21s elapsed
Generation 1: 170 collections, 0 parallel, 7.70s, 7.83s elapsed
Parallel GC work balance: -nan (0 / 0, ideal 1)
MUT time (elapsed) GC time (elapsed)
Task 0 (worker) : 0.00s (253.66s) 0.00s ( 0.00s)
Task 1 (worker) : 0.00s (253.68s) 0.00s ( 0.00s)
Task 2 (worker) : 253.95s (253.68s) 23.77s ( 24.04s)
SPARKS: 1991 (0 converted, 1991 pruned)
INIT time 0.00s ( 0.00s elapsed)
MUT time 253.56s (253.68s elapsed)
GC time 23.77s ( 24.04s elapsed)
EXIT time 0.01s ( 0.00s elapsed)
Total time 277.34s (277.72s elapsed)
%GC time 8.6% (8.7% elapsed)
Alloc rate 495,567,339 bytes per MUT second
Productivity 91.4% of total user, 91.3% of total elapsed
gc_alloc_block_sync: 0
whitehole_spin: 0
gen[0].steps[0].sync_large_objects: 0
gen[0].steps[1].sync_large_objects: 0
gen[1].steps[0].sync_large_objects: 0
cpsa -o /dev/null kelly64.lisp +RTS -s -N2
125,668,128,592 bytes allocated in the heap
12,254,558,304 bytes copied during GC
41,065,184 bytes maximum residency (191 sample(s))
1,402,080 bytes maximum slop
122 MB total memory in use (2 MB lost due to fragmentation)
Generation 0: 154246 collections, 154245 parallel, 28.00s, 15.90s elapsed
Generation 1: 191 collections, 191 parallel, 12.57s, 6.36s elapsed
Parallel GC work balance: 1.46 (1529837981 / 1046401625, ideal 2)
MUT time (elapsed) GC time (elapsed)
Task 0 (worker) : 143.44s (153.49s) 28.38s ( 15.02s)
Task 1 (worker) : 157.72s (153.50s) 10.53s ( 5.97s)
Task 2 (worker) : 8.46s (153.50s) 1.66s ( 1.26s)
Task 3 (worker) : 0.00s (153.50s) 0.00s ( 0.00s)
SPARKS: 1991 (1953 converted, 38 pruned)
INIT time 0.01s ( 0.00s elapsed)
MUT time 308.99s (153.50s elapsed)
GC time 40.57s ( 22.26s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 349.57s (175.76s elapsed)
%GC time 11.6% (12.7% elapsed)
Alloc rate 406,692,972 bytes per MUT second
Productivity 88.4% of total user, 175.8% of total elapsed
gc_alloc_block_sync: 890896
whitehole_spin: 0
gen[0].steps[0].sync_large_objects: 4
gen[0].steps[1].sync_large_objects: 0
gen[1].steps[0].sync_large_objects: 277
cpsa -o /dev/null kelly64.lisp +RTS -s -N3
125,670,120,400 bytes allocated in the heap
12,374,421,912 bytes copied during GC
41,017,256 bytes maximum residency (179 sample(s))
1,546,000 bytes maximum slop
123 MB total memory in use (2 MB lost due to fragmentation)
Generation 0: 113018 collections, 113017 parallel, 33.93s, 12.53s elapsed
Generation 1: 179 collections, 179 parallel, 15.64s, 5.23s elapsed
Parallel GC work balance: 1.89 (1544747811 / 816043792, ideal 3)
MUT time (elapsed) GC time (elapsed)
Task 0 (worker) : 95.82s ( 99.35s) 14.36s ( 4.90s)
Task 1 (worker) : 103.03s ( 99.38s) 10.31s ( 3.70s)
Task 2 (worker) : 6.97s ( 99.38s) 2.96s ( 1.38s)
Task 3 (worker) : 0.00s ( 99.38s) 0.00s ( 0.00s)
Task 4 (worker) : 92.98s ( 99.38s) 21.94s ( 7.78s)
SPARKS: 1991 (1954 converted, 35 pruned)
INIT time 0.01s ( 0.00s elapsed)
MUT time 298.25s ( 99.38s elapsed)
GC time 49.57s ( 17.76s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 347.83s (117.14s elapsed)
%GC time 14.3% (15.2% elapsed)
Alloc rate 421,344,197 bytes per MUT second
Productivity 85.7% of total user, 254.6% of total elapsed
gc_alloc_block_sync: 1901383
whitehole_spin: 0
gen[0].steps[0].sync_large_objects: 12
gen[0].steps[1].sync_large_objects: 23
gen[1].steps[0].sync_large_objects: 744
cpsa -o /dev/null kelly64.lisp +RTS -s -N4
125,669,648,808 bytes allocated in the heap
12,500,417,616 bytes copied during GC
41,268,944 bytes maximum residency (196 sample(s))
1,664,944 bytes maximum slop
124 MB total memory in use (2 MB lost due to fragmentation)
Generation 0: 92399 collections, 92398 parallel, 40.80s, 11.26s elapsed
Generation 1: 196 collections, 196 parallel, 19.18s, 4.86s elapsed
Parallel GC work balance: 2.29 (1560376264 / 681968454, ideal 4)
MUT time (elapsed) GC time (elapsed)
Task 0 (worker) : 82.75s ( 78.03s) 4.48s ( 1.30s)
Task 1 (worker) : 68.37s ( 78.04s) 23.40s ( 6.13s)
Task 2 (worker) : 5.24s ( 78.04s) 4.15s ( 1.39s)
Task 3 (worker) : 0.00s ( 78.04s) 0.00s ( 0.00s)
Task 4 (worker) : 67.16s ( 78.04s) 23.44s ( 6.05s)
Task 5 (worker) : 87.46s ( 78.04s) 4.51s ( 1.25s)
SPARKS: 1991 (1953 converted, 36 pruned)
INIT time 0.01s ( 0.00s elapsed)
MUT time 310.33s ( 78.04s elapsed)
GC time 59.98s ( 16.12s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 370.32s ( 94.17s elapsed)
%GC time 16.2% (17.1% elapsed)
Alloc rate 404,941,834 bytes per MUT second
Productivity 83.8% of total user, 329.6% of total elapsed
gc_alloc_block_sync: 2819027
whitehole_spin: 2
gen[0].steps[0].sync_large_objects: 139
gen[0].steps[1].sync_large_objects: 124
gen[1].steps[0].sync_large_objects: 2639
Thu Jun 2 22:12:51 EDT 2011
----------------------------------------------
The Glorious Glasgow Haskell Compilation System, version 6.12.1
CPSA 2.2.4
Thu Jun 2 22:12:51 EDT 2011
cpsa -o /dev/null kelly64.lisp +RTS -s -N1
125,661,010,200 bytes allocated in the heap
11,585,760,464 bytes copied during GC
40,937,760 bytes maximum residency (170 sample(s))
1,003,736 bytes maximum slop
121 MB total memory in use (2 MB lost due to fragmentation)
Generation 0: 239519 collections, 0 parallel, 23.90s, 24.04s elapsed
Generation 1: 170 collections, 0 parallel, 11.07s, 11.06s elapsed
Parallel GC work balance: -nan (0 / 0, ideal 1)
MUT time (elapsed) GC time (elapsed)
Task 0 (worker) : 0.00s (380.49s) 0.00s ( 0.00s)
Task 1 (worker) : 0.00s (380.53s) 0.00s ( 0.00s)
Task 2 (worker) : 380.64s (380.53s) 34.97s ( 35.10s)
SPARKS: 1991 (0 converted, 1991 pruned)
INIT time 0.00s ( 0.00s elapsed)
MUT time 380.12s (380.53s elapsed)
GC time 34.97s ( 35.10s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 415.09s (415.63s elapsed)
%GC time 8.4% (8.4% elapsed)
Alloc rate 330,582,474 bytes per MUT second
Productivity 91.6% of total user, 91.5% of total elapsed
gc_alloc_block_sync: 0
whitehole_spin: 0
gen[0].steps[0].sync_large_objects: 0
gen[0].steps[1].sync_large_objects: 0
gen[1].steps[0].sync_large_objects: 0
cpsa -o /dev/null kelly64.lisp +RTS -s -N2
125,668,253,136 bytes allocated in the heap
12,256,606,280 bytes copied during GC
40,876,392 bytes maximum residency (177 sample(s))
1,422,152 bytes maximum slop
122 MB total memory in use (2 MB lost due to fragmentation)
Generation 0: 154105 collections, 154104 parallel, 28.82s, 15.93s elapsed
Generation 1: 177 collections, 177 parallel, 12.61s, 6.39s elapsed
Parallel GC work balance: 1.47 (1530092450 / 1039667442, ideal 2)
MUT time (elapsed) GC time (elapsed)
Task 0 (worker) : 156.26s (153.27s) 9.06s ( 4.77s)
Task 1 (worker) : 143.74s (153.30s) 30.35s ( 16.01s)
Task 2 (worker) : 8.41s (153.30s) 2.02s ( 1.53s)
Task 3 (worker) : 0.00s (153.30s) 0.00s ( 0.00s)
SPARKS: 1991 (1953 converted, 37 pruned)
INIT time 0.01s ( 0.00s elapsed)
MUT time 307.89s (153.30s elapsed)
GC time 41.43s ( 22.31s elapsed)
EXIT time 0.01s ( 0.00s elapsed)
Total time 349.34s (175.61s elapsed)
%GC time 11.9% (12.7% elapsed)
Alloc rate 408,133,068 bytes per MUT second
Productivity 88.1% of total user, 175.3% of total elapsed
gc_alloc_block_sync: 908090
whitehole_spin: 0
gen[0].steps[0].sync_large_objects: 0
gen[0].steps[1].sync_large_objects: 0
gen[1].steps[0].sync_large_objects: 268
cpsa -o /dev/null kelly64.lisp +RTS -s -N3
125,669,804,536 bytes allocated in the heap
12,370,395,928 bytes copied during GC
41,148,688 bytes maximum residency (182 sample(s))
1,509,824 bytes maximum slop
123 MB total memory in use (2 MB lost due to fragmentation)
Generation 0: 112602 collections, 112601 parallel, 32.46s, 12.26s elapsed
Generation 1: 182 collections, 182 parallel, 14.89s, 5.11s elapsed
Parallel GC work balance: 1.92 (1544243075 / 804816337, ideal 3)
MUT time (elapsed) GC time (elapsed)
Task 0 (worker) : 102.59s ( 96.88s) 4.89s ( 1.61s)
Task 1 (worker) : 91.55s ( 96.91s) 19.36s ( 6.98s)
Task 2 (worker) : 7.42s ( 96.91s) 3.23s ( 1.46s)
Task 3 (worker) : 0.00s ( 96.91s) 0.00s ( 0.00s)
Task 4 (worker) : 91.09s ( 96.91s) 19.87s ( 7.32s)
SPARKS: 1991 (1954 converted, 35 pruned)
INIT time 0.00s ( 0.00s elapsed)
MUT time 292.21s ( 96.91s elapsed)
GC time 47.35s ( 17.37s elapsed)
EXIT time 0.01s ( 0.00s elapsed)
Total time 339.57s (114.28s elapsed)
%GC time 13.9% (15.2% elapsed)
Alloc rate 430,052,031 bytes per MUT second
Productivity 86.1% of total user, 255.7% of total elapsed
gc_alloc_block_sync: 1954078
whitehole_spin: 1
gen[0].steps[0].sync_large_objects: 19
gen[0].steps[1].sync_large_objects: 23
gen[1].steps[0].sync_large_objects: 277
cpsa -o /dev/null kelly64.lisp +RTS -s -N4
125,670,550,632 bytes allocated in the heap
12,515,148,704 bytes copied during GC
41,292,000 bytes maximum residency (200 sample(s))
1,617,712 bytes maximum slop
124 MB total memory in use (2 MB lost due to fragmentation)
Generation 0: 93200 collections, 93199 parallel, 42.05s, 11.13s elapsed
Generation 1: 200 collections, 200 parallel, 19.82s, 5.06s elapsed
Parallel GC work balance: 2.23 (1562231663 / 699553894, ideal 4)
MUT time (elapsed) GC time (elapsed)
Task 0 (worker) : 62.77s ( 77.80s) 26.71s ( 6.76s)
Task 1 (worker) : 67.83s ( 77.83s) 22.83s ( 5.96s)
Task 2 (worker) : 4.54s ( 77.83s) 4.40s ( 1.37s)
Task 3 (worker) : 0.00s ( 77.83s) 0.00s ( 0.00s)
Task 4 (worker) : 85.97s ( 77.83s) 3.93s ( 1.11s)
Task 5 (worker) : 87.95s ( 77.83s) 4.00s ( 0.99s)
SPARKS: 1991 (1955 converted, 34 pruned)
INIT time 0.01s ( 0.00s elapsed)
MUT time 308.49s ( 77.83s elapsed)
GC time 61.87s ( 16.19s elapsed)
EXIT time 0.01s ( 0.00s elapsed)
Total time 370.38s ( 94.02s elapsed)
%GC time 16.7% (17.2% elapsed)
Alloc rate 407,346,765 bytes per MUT second
Productivity 83.3% of total user, 328.1% of total elapsed
gc_alloc_block_sync: 2753170
whitehole_spin: 0
gen[0].steps[0].sync_large_objects: 103
gen[0].steps[1].sync_large_objects: 114
gen[1].steps[0].sync_large_objects: 2640
Thu Jun 2 22:32:04 EDT 2011
----------------------------------------------
The Glorious Glasgow Haskell Compilation System, version 6.12.1
CPSA 2.2.4
Thu Jun 2 22:32:04 EDT 2011
cpsa -o /dev/null kelly64.lisp +RTS -s -N1
125,661,010,200 bytes allocated in the heap
11,585,762,248 bytes copied during GC
40,937,760 bytes maximum residency (170 sample(s))
1,003,736 bytes maximum slop
121 MB total memory in use (2 MB lost due to fragmentation)
Generation 0: 239519 collections, 0 parallel, 15.40s, 16.27s elapsed
Generation 1: 170 collections, 0 parallel, 7.78s, 7.82s elapsed
Parallel GC work balance: -nan (0 / 0, ideal 1)
MUT time (elapsed) GC time (elapsed)
Task 0 (worker) : 0.00s (253.63s) 0.00s ( 0.00s)
Task 1 (worker) : 0.00s (253.66s) 0.00s ( 0.00s)
Task 2 (worker) : 254.57s (253.66s) 23.18s ( 24.09s)
SPARKS: 1991 (0 converted, 1991 pruned)
INIT time 0.00s ( 0.00s elapsed)
MUT time 254.13s (253.66s elapsed)
GC time 23.18s ( 24.09s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 277.31s (277.75s elapsed)
%GC time 8.4% (8.7% elapsed)
Alloc rate 494,475,308 bytes per MUT second
Productivity 91.6% of total user, 91.5% of total elapsed
gc_alloc_block_sync: 0
whitehole_spin: 0
gen[0].steps[0].sync_large_objects: 0
gen[0].steps[1].sync_large_objects: 0
gen[1].steps[0].sync_large_objects: 0
cpsa -o /dev/null kelly64.lisp +RTS -s -N2
125,668,068,856 bytes allocated in the heap
12,259,143,512 bytes copied during GC
41,143,704 bytes maximum residency (191 sample(s))
1,439,040 bytes maximum slop
122 MB total memory in use (2 MB lost due to fragmentation)
Generation 0: 154422 collections, 154421 parallel, 28.89s, 15.76s elapsed
Generation 1: 191 collections, 191 parallel, 12.26s, 6.34s elapsed
Parallel GC work balance: 1.46 (1530417355 / 1046288258, ideal 2)
MUT time (elapsed) GC time (elapsed)
Task 0 (worker) : 140.60s (153.46s) 31.29s ( 16.41s)
Task 1 (worker) : 161.26s (153.47s) 8.18s ( 4.31s)
Task 2 (worker) : 6.62s (153.47s) 1.68s ( 1.38s)
Task 3 (worker) : 0.00s (153.47s) 0.00s ( 0.00s)
SPARKS: 1991 (1953 converted, 36 pruned)
INIT time 0.01s ( 0.00s elapsed)
MUT time 307.89s (153.47s elapsed)
GC time 41.15s ( 22.10s elapsed)
EXIT time 0.01s ( 0.00s elapsed)
Total time 349.06s (175.57s elapsed)
%GC time 11.8% (12.6% elapsed)
Alloc rate 408,132,470 bytes per MUT second
Productivity 88.2% of total user, 175.4% of total elapsed
gc_alloc_block_sync: 875007
whitehole_spin: 0
gen[0].steps[0].sync_large_objects: 4
gen[0].steps[1].sync_large_objects: 0
gen[1].steps[0].sync_large_objects: 291
cpsa -o /dev/null kelly64.lisp +RTS -s -N3
125,669,479,488 bytes allocated in the heap
12,391,810,800 bytes copied during GC
41,093,624 bytes maximum residency (199 sample(s))
1,498,880 bytes maximum slop
123 MB total memory in use (2 MB lost due to fragmentation)
Generation 0: 113857 collections, 113856 parallel, 36.12s, 13.00s elapsed
Generation 1: 199 collections, 199 parallel, 15.98s, 5.39s elapsed
Parallel GC work balance: 1.87 (1546908855 / 828654178, ideal 3)
MUT time (elapsed) GC time (elapsed)
Task 0 (worker) : 99.00s (102.87s) 15.65s ( 5.30s)
Task 1 (worker) : 111.91s (102.88s) 5.18s ( 1.89s)
Task 2 (worker) : 6.20s (102.88s) 3.78s ( 1.56s)
Task 3 (worker) : 91.65s (102.88s) 27.49s ( 9.64s)
Task 4 (worker) : 0.00s (102.88s) 0.00s ( 0.00s)
SPARKS: 1991 (1953 converted, 38 pruned)
INIT time 0.01s ( 0.00s elapsed)
MUT time 308.24s (102.88s elapsed)
GC time 52.10s ( 18.38s elapsed)
EXIT time 0.01s ( 0.00s elapsed)
Total time 360.36s (121.26s elapsed)
%GC time 14.5% (15.2% elapsed)
Alloc rate 407,673,650 bytes per MUT second
Productivity 85.5% of total user, 254.2% of total elapsed
gc_alloc_block_sync: 1672681
whitehole_spin: 0
gen[0].steps[0].sync_large_objects: 26
gen[0].steps[1].sync_large_objects: 34
gen[1].steps[0].sync_large_objects: 1429
cpsa -o /dev/null kelly64.lisp +RTS -s -N4
125,669,843,984 bytes allocated in the heap
12,486,967,376 bytes copied during GC
41,223,912 bytes maximum residency (202 sample(s))
1,620,800 bytes maximum slop
124 MB total memory in use (2 MB lost due to fragmentation)
Generation 0: 92351 collections, 92350 parallel, 41.51s, 11.25s elapsed
Generation 1: 202 collections, 202 parallel, 19.76s, 4.99s elapsed
Parallel GC work balance: 2.26 (1558703741 / 689207458, ideal 4)
MUT time (elapsed) GC time (elapsed)
Task 0 (worker) : 82.08s ( 77.82s) 6.23s ( 1.66s)
Task 1 (worker) : 86.95s ( 77.86s) 4.68s ( 1.32s)
Task 2 (worker) : 4.02s ( 77.86s) 5.67s ( 1.52s)
Task 3 (worker) : 0.00s ( 77.86s) 0.00s ( 0.00s)
Task 4 (worker) : 68.05s ( 77.86s) 22.18s ( 5.87s)
Task 5 (worker) : 68.92s ( 77.86s) 22.51s ( 5.87s)
SPARKS: 1991 (1953 converted, 38 pruned)
INIT time 0.01s ( 0.00s elapsed)
MUT time 309.37s ( 77.86s elapsed)
GC time 61.27s ( 16.24s elapsed)
EXIT time 0.01s ( 0.00s elapsed)
Total time 370.66s ( 94.10s elapsed)
%GC time 16.5% (17.3% elapsed)
Alloc rate 406,185,862 bytes per MUT second
Productivity 83.5% of total user, 328.8% of total elapsed
gc_alloc_block_sync: 2745648
whitehole_spin: 1
gen[0].steps[0].sync_large_objects: 74
gen[0].steps[1].sync_large_objects: 143
gen[1].steps[0].sync_large_objects: 2805
Thu Jun 2 22:47:50 EDT 2011
----------------------------------------------
The Glorious Glasgow Haskell Compilation System, version 6.12.1
CPSA 2.2.4
Thu Jun 2 22:47:50 EDT 2011
cpsa -o /dev/null kelly64.lisp +RTS -s -N1
125,661,010,200 bytes allocated in the heap
11,585,762,168 bytes copied during GC
40,937,760 bytes maximum residency (170 sample(s))
1,003,736 bytes maximum slop
121 MB total memory in use (2 MB lost due to fragmentation)
Generation 0: 239519 collections, 0 parallel, 15.90s, 16.09s elapsed
Generation 1: 170 collections, 0 parallel, 7.82s, 7.89s elapsed
Parallel GC work balance: -nan (0 / 0, ideal 1)
MUT time (elapsed) GC time (elapsed)
Task 0 (worker) : 0.00s (253.67s) 0.00s ( 0.00s)
Task 1 (worker) : 0.00s (253.70s) 0.00s ( 0.00s)
Task 2 (worker) : 253.96s (253.70s) 23.72s ( 23.98s)
SPARKS: 1991 (0 converted, 1991 pruned)
INIT time 0.00s ( 0.00s elapsed)
MUT time 253.42s (253.70s elapsed)
GC time 23.72s ( 23.98s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 277.14s (277.68s elapsed)
%GC time 8.6% (8.6% elapsed)
Alloc rate 495,860,666 bytes per MUT second
Productivity 91.4% of total user, 91.3% of total elapsed
gc_alloc_block_sync: 0
whitehole_spin: 0
gen[0].steps[0].sync_large_objects: 0
gen[0].steps[1].sync_large_objects: 0
gen[1].steps[0].sync_large_objects: 0
cpsa -o /dev/null kelly64.lisp +RTS -s -N2
125,667,947,312 bytes allocated in the heap
12,253,035,976 bytes copied during GC
41,155,360 bytes maximum residency (193 sample(s))
1,374,952 bytes maximum slop
122 MB total memory in use (2 MB lost due to fragmentation)
Generation 0: 154214 collections, 154213 parallel, 28.40s, 15.91s elapsed
Generation 1: 193 collections, 193 parallel, 12.46s, 6.41s elapsed
Parallel GC work balance: 1.47 (1529654219 / 1039456343, ideal 2)
MUT time (elapsed) GC time (elapsed)
Task 0 (worker) : 142.55s (153.80s) 30.08s ( 16.07s)
Task 1 (worker) : 160.17s (153.81s) 9.04s ( 4.78s)
Task 2 (worker) : 7.30s (153.81s) 1.74s ( 1.47s)
Task 3 (worker) : 0.00s (153.81s) 0.00s ( 0.00s)
SPARKS: 1991 (1953 converted, 38 pruned)
INIT time 0.00s ( 0.00s elapsed)
MUT time 309.46s (153.81s elapsed)
GC time 40.86s ( 22.32s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 350.32s (176.13s elapsed)
%GC time 11.7% (12.7% elapsed)
Alloc rate 406,087,854 bytes per MUT second
Productivity 88.3% of total user, 175.7% of total elapsed
gc_alloc_block_sync: 898442
whitehole_spin: 1
gen[0].steps[0].sync_large_objects: 5
gen[0].steps[1].sync_large_objects: 0
gen[1].steps[0].sync_large_objects: 260
cpsa -o /dev/null kelly64.lisp +RTS -s -N3
125,670,146,208 bytes allocated in the heap
12,374,600,040 bytes copied during GC
41,302,872 bytes maximum residency (198 sample(s))
1,525,576 bytes maximum slop
123 MB total memory in use (2 MB lost due to fragmentation)
Generation 0: 113471 collections, 113470 parallel, 33.09s, 12.51s elapsed
Generation 1: 198 collections, 198 parallel, 15.43s, 5.25s elapsed
Parallel GC work balance: 1.90 (1544740476 / 813661323, ideal 3)
MUT time (elapsed) GC time (elapsed)
Task 0 (worker) : 101.90s (100.47s) 13.11s ( 4.71s)
Task 1 (worker) : 92.97s (100.48s) 23.13s ( 8.18s)
Task 2 (worker) : 6.24s (100.48s) 2.97s ( 1.46s)
Task 3 (worker) : 101.90s (100.48s) 9.31s ( 3.42s)
Task 4 (worker) : 0.00s (100.48s) 0.00s ( 0.00s)
SPARKS: 1991 (1953 converted, 37 pruned)
INIT time 0.01s ( 0.00s elapsed)
MUT time 302.56s (100.48s elapsed)
GC time 48.52s ( 17.77s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 351.09s (118.24s elapsed)
%GC time 13.8% (15.0% elapsed)
Alloc rate 415,342,387 bytes per MUT second
Productivity 86.2% of total user, 255.9% of total elapsed
gc_alloc_block_sync: 1799926
whitehole_spin: 2
gen[0].steps[0].sync_large_objects: 78
gen[0].steps[1].sync_large_objects: 34
gen[1].steps[0].sync_large_objects: 549
cpsa -o /dev/null kelly64.lisp +RTS -s -N4
125,669,729,144 bytes allocated in the heap
12,528,416,664 bytes copied during GC
41,287,728 bytes maximum residency (185 sample(s))
1,641,944 bytes maximum slop
124 MB total memory in use (2 MB lost due to fragmentation)
Generation 0: 93003 collections, 93002 parallel, 40.13s, 11.23s elapsed
Generation 1: 185 collections, 185 parallel, 19.97s, 5.08s elapsed
Parallel GC work balance: 2.23 (1563890983 / 701442243, ideal 4)
MUT time (elapsed) GC time (elapsed)
Task 0 (worker) : 65.58s ( 77.91s) 22.08s ( 5.79s)
Task 1 (worker) : 85.68s ( 77.92s) 6.57s ( 1.88s)
Task 2 (worker) : 4.00s ( 77.92s) 4.96s ( 1.49s)
Task 3 (worker) : 0.00s ( 77.92s) 0.00s ( 0.00s)
Task 4 (worker) : 71.12s ( 77.92s) 21.91s ( 5.87s)
Task 5 (worker) : 85.08s ( 77.92s) 4.58s ( 1.28s)
SPARKS: 1991 (1954 converted, 36 pruned)
INIT time 0.01s ( 0.00s elapsed)
MUT time 310.94s ( 77.92s elapsed)
GC time 60.10s ( 16.31s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 371.05s ( 94.23s elapsed)
%GC time 16.2% (17.3% elapsed)
Alloc rate 404,147,705 bytes per MUT second
Productivity 83.8% of total user, 330.0% of total elapsed
gc_alloc_block_sync: 2685598
whitehole_spin: 2
gen[0].steps[0].sync_large_objects: 98
gen[0].steps[1].sync_large_objects: 102
gen[1].steps[0].sync_large_objects: 2745
Thu Jun 2 23:03:34 EDT 2011
----------------------------------------------
The Glorious Glasgow Haskell Compilation System, version 6.12.1
CPSA 2.2.4
Thu Jun 2 23:03:34 EDT 2011
cpsa -o /dev/null kelly64.lisp +RTS -s -N1
125,661,010,200 bytes allocated in the heap
11,585,762,600 bytes copied during GC
40,937,760 bytes maximum residency (170 sample(s))
1,003,736 bytes maximum slop
121 MB total memory in use (2 MB lost due to fragmentation)
Generation 0: 239519 collections, 0 parallel, 16.31s, 16.05s elapsed
Generation 1: 170 collections, 0 parallel, 7.78s, 7.77s elapsed
Parallel GC work balance: -nan (0 / 0, ideal 1)
MUT time (elapsed) GC time (elapsed)
Task 0 (worker) : 0.00s (253.66s) 0.00s ( 0.00s)
Task 1 (worker) : 0.00s (253.68s) 0.00s ( 0.00s)
Task 2 (worker) : 253.41s (253.68s) 24.09s ( 23.82s)
SPARKS: 1991 (0 converted, 1991 pruned)
INIT time 0.00s ( 0.00s elapsed)
MUT time 252.99s (253.68s elapsed)
GC time 24.09s ( 23.82s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 277.08s (277.51s elapsed)
%GC time 8.7% (8.6% elapsed)
Alloc rate 496,703,467 bytes per MUT second
Productivity 91.3% of total user, 91.2% of total elapsed
gc_alloc_block_sync: 0
whitehole_spin: 0
gen[0].steps[0].sync_large_objects: 0
gen[0].steps[1].sync_large_objects: 0
gen[1].steps[0].sync_large_objects: 0
cpsa -o /dev/null kelly64.lisp +RTS -s -N2
125,667,010,656 bytes allocated in the heap
12,251,000,176 bytes copied during GC
40,940,416 bytes maximum residency (175 sample(s))
1,362,152 bytes maximum slop
122 MB total memory in use (2 MB lost due to fragmentation)
Generation 0: 154526 collections, 154525 parallel, 27.87s, 15.58s elapsed
Generation 1: 175 collections, 175 parallel, 12.60s, 6.42s elapsed
Parallel GC work balance: 1.47 (1529416903 / 1041901461, ideal 2)
MUT time (elapsed) GC time (elapsed)
Task 0 (worker) : 146.12s (153.47s) 23.62s ( 12.47s)
Task 1 (worker) : 156.41s (153.48s) 15.40s ( 8.31s)
Task 2 (worker) : 6.64s (153.48s) 1.45s ( 1.21s)
Task 3 (worker) : 0.00s (153.48s) 0.00s ( 0.00s)
SPARKS: 1991 (1953 converted, 38 pruned)
INIT time 0.01s ( 0.00s elapsed)
MUT time 308.64s (153.48s elapsed)
GC time 40.47s ( 22.00s elapsed)
EXIT time 0.01s ( 0.00s elapsed)
Total time 349.13s (175.48s elapsed)
%GC time 11.6% (12.5% elapsed)
Alloc rate 407,137,337 bytes per MUT second
Productivity 88.4% of total user, 175.9% of total elapsed
gc_alloc_block_sync: 869208
whitehole_spin: 7
gen[0].steps[0].sync_large_objects: 3
gen[0].steps[1].sync_large_objects: 0
gen[1].steps[0].sync_large_objects: 294
cpsa -o /dev/null kelly64.lisp +RTS -s -N3
125,669,661,136 bytes allocated in the heap
12,394,767,632 bytes copied during GC
41,003,136 bytes maximum residency (178 sample(s))
1,480,584 bytes maximum slop
123 MB total memory in use (2 MB lost due to fragmentation)
Generation 0: 113661 collections, 113660 parallel, 35.50s, 12.71s elapsed
Generation 1: 178 collections, 178 parallel, 15.85s, 5.34s elapsed
Parallel GC work balance: 1.88 (1547293519 / 824718800, ideal 3)
MUT time (elapsed) GC time (elapsed)
Task 0 (worker) : 96.55s (101.34s) 16.52s ( 5.55s)
Task 1 (worker) : 111.72s (101.35s) 5.02s ( 1.83s)
Task 2 (worker) : 6.75s (101.35s) 3.00s ( 1.37s)
Task 3 (worker) : 88.57s (101.35s) 26.81s ( 9.30s)
Task 4 (worker) : 0.00s (101.35s) 0.00s ( 0.00s)
SPARKS: 1991 (1953 converted, 37 pruned)
INIT time 0.01s ( 0.00s elapsed)
MUT time 302.95s (101.35s elapsed)
GC time 51.35s ( 18.05s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 354.31s (119.40s elapsed)
%GC time 14.5% (15.1% elapsed)
Alloc rate 414,806,116 bytes per MUT second
Productivity 85.5% of total user, 253.7% of total elapsed
gc_alloc_block_sync: 1754074
whitehole_spin: 1
gen[0].steps[0].sync_large_objects: 27
gen[0].steps[1].sync_large_objects: 47
gen[1].steps[0].sync_large_objects: 1323
cpsa -o /dev/null kelly64.lisp +RTS -s -N4
125,670,011,152 bytes allocated in the heap
12,519,788,936 bytes copied during GC
41,279,544 bytes maximum residency (197 sample(s))
1,663,624 bytes maximum slop
124 MB total memory in use (2 MB lost due to fragmentation)
Generation 0: 93023 collections, 93022 parallel, 41.03s, 11.18s elapsed
Generation 1: 197 collections, 197 parallel, 20.36s, 5.13s elapsed
Parallel GC work balance: 2.23 (1562821540 / 702327206, ideal 4)
MUT time (elapsed) GC time (elapsed)
Task 0 (worker) : 85.30s ( 77.71s) 6.91s ( 1.89s)
Task 1 (worker) : 68.26s ( 77.73s) 22.92s ( 5.86s)
Task 2 (worker) : 4.37s ( 77.73s) 4.06s ( 1.37s)
Task 3 (worker) : 0.00s ( 77.73s) 0.00s ( 0.00s)
Task 4 (worker) : 66.47s ( 77.73s) 20.76s ( 5.37s)
Task 5 (worker) : 85.60s ( 77.73s) 6.74s ( 1.82s)
SPARKS: 1991 (1953 converted, 38 pruned)
INIT time 0.01s ( 0.00s elapsed)
MUT time 309.58s ( 77.73s elapsed)
GC time 61.39s ( 16.31s elapsed)
EXIT time 0.01s ( 0.00s elapsed)
Total time 370.99s ( 94.04s elapsed)
%GC time 16.5% (17.3% elapsed)
Alloc rate 405,910,888 bytes per MUT second
Productivity 83.4% of total user, 329.2% of total elapsed
gc_alloc_block_sync: 2693504
whitehole_spin: 0
gen[0].steps[0].sync_large_objects: 148
gen[0].steps[1].sync_large_objects: 123
gen[1].steps[0].sync_large_objects: 2519
Thu Jun 2 23:19:17 EDT 2011
----------------------------------------------
The Glorious Glasgow Haskell Compilation System, version 6.12.1
CPSA 2.2.4
Thu Jun 2 23:19:17 EDT 2011
cpsa -o /dev/null kelly64.lisp +RTS -s -N1
125,661,010,200 bytes allocated in the heap
11,585,760,856 bytes copied during GC
40,937,760 bytes maximum residency (170 sample(s))
1,003,736 bytes maximum slop
121 MB total memory in use (2 MB lost due to fragmentation)
Generation 0: 239519 collections, 0 parallel, 22.92s, 24.03s elapsed
Generation 1: 170 collections, 0 parallel, 11.04s, 11.05s elapsed
Parallel GC work balance: -nan (0 / 0, ideal 1)
MUT time (elapsed) GC time (elapsed)
Task 0 (worker) : 0.00s (380.46s) 0.00s ( 0.00s)
Task 1 (worker) : 0.00s (380.50s) 0.00s ( 0.00s)
Task 2 (worker) : 381.61s (380.50s) 33.96s ( 35.08s)
SPARKS: 1991 (0 converted, 1991 pruned)
INIT time 0.00s ( 0.00s elapsed)
MUT time 380.97s (380.50s elapsed)
GC time 33.96s ( 35.08s elapsed)
EXIT time 0.01s ( 0.00s elapsed)
Total time 414.94s (415.58s elapsed)
%GC time 8.2% (8.4% elapsed)
Alloc rate 329,836,238 bytes per MUT second
Productivity 91.8% of total user, 91.7% of total elapsed
gc_alloc_block_sync: 0
whitehole_spin: 0
gen[0].steps[0].sync_large_objects: 0
gen[0].steps[1].sync_large_objects: 0
gen[1].steps[0].sync_large_objects: 0
cpsa -o /dev/null kelly64.lisp +RTS -s -N2
125,668,732,328 bytes allocated in the heap
12,255,768,936 bytes copied during GC
41,065,272 bytes maximum residency (181 sample(s))
1,420,480 bytes maximum slop
122 MB total memory in use (2 MB lost due to fragmentation)
Generation 0: 154091 collections, 154090 parallel, 28.60s, 15.87s elapsed
Generation 1: 181 collections, 181 parallel, 12.51s, 6.34s elapsed
Parallel GC work balance: 1.47 (1529980296 / 1042613677, ideal 2)
MUT time (elapsed) GC time (elapsed)
Task 0 (worker) : 158.23s (153.34s) 8.99s ( 4.72s)
Task 1 (worker) : 142.30s (153.36s) 30.64s ( 16.19s)
Task 2 (worker) : 8.20s (153.36s) 1.48s ( 1.30s)
Task 3 (worker) : 0.00s (153.36s) 0.00s ( 0.00s)
SPARKS: 1991 (1953 converted, 36 pruned)
INIT time 0.01s ( 0.00s elapsed)
MUT time 308.22s (153.36s elapsed)
GC time 41.11s ( 22.21s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 349.34s (175.57s elapsed)
%GC time 11.8% (12.6% elapsed)
Alloc rate 407,710,905 bytes per MUT second
Productivity 88.2% of total user, 175.6% of total elapsed
gc_alloc_block_sync: 913089
whitehole_spin: 0
gen[0].steps[0].sync_large_objects: 1
gen[0].steps[1].sync_large_objects: 0
gen[1].steps[0].sync_large_objects: 204
cpsa -o /dev/null kelly64.lisp +RTS -s -N3
125,669,925,240 bytes allocated in the heap
12,391,490,688 bytes copied during GC
41,248,216 bytes maximum residency (189 sample(s))
1,517,712 bytes maximum slop
123 MB total memory in use (1 MB lost due to fragmentation)
Generation 0: 114384 collections, 114383 parallel, 36.35s, 13.14s elapsed
Generation 1: 189 collections, 189 parallel, 16.07s, 5.43s elapsed
Parallel GC work balance: 1.88 (1546857855 / 823154735, ideal 3)
MUT time (elapsed) GC time (elapsed)
Task 0 (worker) : 115.29s (104.79s) 6.43s ( 2.23s)
Task 1 (worker) : 91.78s (104.81s) 29.47s ( 10.28s)
Task 2 (worker) : 6.28s (104.81s) 3.07s ( 1.36s)
Task 3 (worker) : 0.00s (104.81s) 0.00s ( 0.00s)
Task 4 (worker) : 101.17s (104.81s) 13.45s ( 4.70s)
SPARKS: 1991 (1954 converted, 35 pruned)
INIT time 0.00s ( 0.00s elapsed)
MUT time 313.88s (104.81s elapsed)
GC time 52.42s ( 18.57s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 366.30s (123.39s elapsed)
%GC time 14.3% (15.1% elapsed)
Alloc rate 400,375,701 bytes per MUT second
Productivity 85.7% of total user, 254.4% of total elapsed
gc_alloc_block_sync: 1717045
whitehole_spin: 3
gen[0].steps[0].sync_large_objects: 37
gen[0].steps[1].sync_large_objects: 22
gen[1].steps[0].sync_large_objects: 594
cpsa -o /dev/null kelly64.lisp +RTS -s -N4
125,669,925,304 bytes allocated in the heap
12,498,196,928 bytes copied during GC
41,260,864 bytes maximum residency (196 sample(s))
1,680,088 bytes maximum slop
124 MB total memory in use (2 MB lost due to fragmentation)
Generation 0: 92271 collections, 92270 parallel, 40.02s, 11.31s elapsed
Generation 1: 196 collections, 196 parallel, 19.63s, 4.96s elapsed
Parallel GC work balance: 2.27 (1560094449 / 685875640, ideal 4)
MUT time (elapsed) GC time (elapsed)
Task 0 (worker) : 80.31s ( 77.75s) 6.02s ( 1.62s)
Task 1 (worker) : 71.00s ( 77.78s) 21.00s ( 5.70s)
Task 2 (worker) : 6.27s ( 77.78s) 4.15s ( 1.49s)
Task 3 (worker) : 0.00s ( 77.78s) 0.00s ( 0.00s)
Task 4 (worker) : 70.78s ( 77.78s) 21.90s ( 5.73s)
Task 5 (worker) : 83.12s ( 77.78s) 6.58s ( 1.72s)
SPARKS: 1991 (1953 converted, 37 pruned)
INIT time 0.01s ( 0.00s elapsed)
MUT time 310.84s ( 77.78s elapsed)
GC time 59.65s ( 16.27s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 370.50s ( 94.04s elapsed)
%GC time 16.1% (17.3% elapsed)
Alloc rate 404,278,350 bytes per MUT second
Productivity 83.9% of total user, 330.5% of total elapsed
gc_alloc_block_sync: 2788981
whitehole_spin: 2
gen[0].steps[0].sync_large_objects: 107
gen[0].steps[1].sync_large_objects: 122
gen[1].steps[0].sync_large_objects: 2723
Thu Jun 2 23:39:41 EDT 2011
----------------------------------------------
The Glorious Glasgow Haskell Compilation System, version 6.12.1
CPSA 2.2.4
Thu Jun 2 23:39:41 EDT 2011
cpsa -o /dev/null kelly64.lisp +RTS -s -N1
125,661,010,200 bytes allocated in the heap
11,585,762,312 bytes copied during GC
40,937,760 bytes maximum residency (170 sample(s))
1,003,736 bytes maximum slop
121 MB total memory in use (2 MB lost due to fragmentation)
Generation 0: 239519 collections, 0 parallel, 16.33s, 16.10s elapsed
Generation 1: 170 collections, 0 parallel, 7.78s, 7.83s elapsed
Parallel GC work balance: -nan (0 / 0, ideal 1)
MUT time (elapsed) GC time (elapsed)
Task 0 (worker) : 0.00s (253.67s) 0.00s ( 0.00s)
Task 1 (worker) : 0.00s (253.70s) 0.00s ( 0.00s)
Task 2 (worker) : 253.51s (253.70s) 24.11s ( 23.93s)
SPARKS: 1991 (0 converted, 1991 pruned)
INIT time 0.00s ( 0.00s elapsed)
MUT time 253.08s (253.70s elapsed)
GC time 24.11s ( 23.93s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 277.19s (277.62s elapsed)
%GC time 8.7% (8.6% elapsed)
Alloc rate 496,526,830 bytes per MUT second
Productivity 91.3% of total user, 91.2% of total elapsed
gc_alloc_block_sync: 0
whitehole_spin: 0
gen[0].steps[0].sync_large_objects: 0
gen[0].steps[1].sync_large_objects: 0
gen[1].steps[0].sync_large_objects: 0
cpsa -o /dev/null kelly64.lisp +RTS -s -N2
125,667,658,512 bytes allocated in the heap
12,251,490,048 bytes copied during GC
41,056,336 bytes maximum residency (188 sample(s))
1,341,152 bytes maximum slop
122 MB total memory in use (2 MB lost due to fragmentation)
Generation 0: 154181 collections, 154180 parallel, 29.33s, 15.69s elapsed
Generation 1: 188 collections, 188 parallel, 12.42s, 6.32s elapsed
Parallel GC work balance: 1.46 (1529460672 / 1046214926, ideal 2)
MUT time (elapsed) GC time (elapsed)
Task 0 (worker) : 149.20s (153.31s) 21.30s ( 11.23s)
Task 1 (worker) : 151.66s (153.32s) 17.74s ( 9.33s)
Task 2 (worker) : 6.78s (153.32s) 2.71s ( 1.45s)
Task 3 (worker) : 0.00s (153.32s) 0.00s ( 0.00s)
SPARKS: 1991 (1953 converted, 38 pruned)
INIT time 0.01s ( 0.00s elapsed)
MUT time 307.01s (153.32s elapsed)
GC time 41.75s ( 22.01s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 348.77s (175.33s elapsed)
%GC time 12.0% (12.6% elapsed)
Alloc rate 409,314,241 bytes per MUT second
Productivity 88.0% of total user, 175.1% of total elapsed
gc_alloc_block_sync: 882949
whitehole_spin: 1
gen[0].steps[0].sync_large_objects: 6
gen[0].steps[1].sync_large_objects: 0
gen[1].steps[0].sync_large_objects: 310
cpsa -o /dev/null kelly64.lisp +RTS -s -N3
125,670,712,176 bytes allocated in the heap
12,384,798,416 bytes copied during GC
41,155,024 bytes maximum residency (204 sample(s))
1,502,328 bytes maximum slop
123 MB total memory in use (2 MB lost due to fragmentation)
Generation 0: 114128 collections, 114127 parallel, 35.63s, 12.75s elapsed
Generation 1: 204 collections, 204 parallel, 15.43s, 5.28s elapsed
Parallel GC work balance: 1.89 (1546040714 / 818809508, ideal 3)
MUT time (elapsed) GC time (elapsed)
Task 0 (worker) : 111.97s (102.59s) 7.39s ( 2.51s)
Task 1 (worker) : 97.34s (102.60s) 13.80s ( 4.76s)
Task 2 (worker) : 4.68s (102.60s) 4.82s ( 1.43s)
Task 3 (worker) : 93.43s (102.60s) 25.05s ( 9.33s)
Task 4 (worker) : 0.00s (102.60s) 0.00s ( 0.00s)
SPARKS: 1991 (1954 converted, 36 pruned)
INIT time 0.01s ( 0.00s elapsed)
MUT time 306.93s (102.60s elapsed)
GC time 51.06s ( 18.04s elapsed)
EXIT time 0.01s ( 0.00s elapsed)
Total time 358.01s (120.64s elapsed)
%GC time 14.3% (15.0% elapsed)
Alloc rate 409,417,534 bytes per MUT second
Productivity 85.7% of total user, 254.4% of total elapsed
gc_alloc_block_sync: 1768765
whitehole_spin: 4
gen[0].steps[0].sync_large_objects: 52
gen[0].steps[1].sync_large_objects: 77
gen[1].steps[0].sync_large_objects: 1025
cpsa -o /dev/null kelly64.lisp +RTS -s -N4
125,670,205,280 bytes allocated in the heap
12,519,881,352 bytes copied during GC
41,236,224 bytes maximum residency (179 sample(s))
1,615,944 bytes maximum slop
124 MB total memory in use (2 MB lost due to fragmentation)
Generation 0: 92999 collections, 92998 parallel, 44.05s, 11.33s elapsed
Generation 1: 179 collections, 179 parallel, 20.04s, 5.10s elapsed
Parallel GC work balance: 2.23 (1562831197 / 699552763, ideal 4)
MUT time (elapsed) GC time (elapsed)
Task 0 (worker) : 64.23s ( 77.98s) 24.51s ( 6.08s)
Task 1 (worker) : 84.55s ( 78.02s) 6.41s ( 1.62s)
Task 2 (worker) : 4.04s ( 78.02s) 4.87s ( 1.38s)
Task 3 (worker) : 0.00s ( 78.02s) 0.00s ( 0.00s)
Task 4 (worker) : 68.91s ( 78.02s) 24.03s ( 6.22s)
Task 5 (worker) : 86.53s ( 78.02s) 4.27s ( 1.14s)
SPARKS: 1991 (1954 converted, 35 pruned)
INIT time 0.01s ( 0.00s elapsed)
MUT time 307.69s ( 78.02s elapsed)
GC time 64.09s ( 16.43s elapsed)
EXIT time 0.01s ( 0.00s elapsed)
Total time 371.80s ( 94.45s elapsed)
%GC time 17.2% (17.4% elapsed)
Alloc rate 408,404,683 bytes per MUT second
Productivity 82.8% of total user, 325.8% of total elapsed
gc_alloc_block_sync: 2695907
whitehole_spin: 1
gen[0].steps[0].sync_large_objects: 118
gen[0].steps[1].sync_large_objects: 113
gen[1].steps[0].sync_large_objects: 2485
Thu Jun 2 23:55:26 EDT 2011
----------------------------------------------
-------------- next part --------------
#! /bin/sh
ghc --version
cpsa -v
date
## Swap out other stuff with this:
cpsa -o /dev/null kelly64.lisp
## Real measurement
for i in 1 2 3 4
do
cpsa -o /dev/null kelly64.lisp +RTS -s -N$i
done
date
echo
echo '----------------------------------------------'
echo
More information about the Haskell-Cafe
mailing list