[Haskell-cafe] Understanding GC time

Thiago Negri evohunz at gmail.com
Sat Mar 10 16:21:21 CET 2012


Hi all.

I wrote a very simple program to try out parallel Haskel and check how
it would look like to make use of more than one core in this language.

When I tried the program with RTS option -N1, total time shows it took
2.48 seconds to complete and around 65% of that time was taken by GC.

Then I tried the same program with RTS options -N2 and total time
decreased to 1.15 seconds as I expected a gain here. But what I didn't
expect is the GC time to drop to 0%.

I guess I'm having trouble to understand the output of the RTS option -s.
Can you enlighten me?


The source for the testing program:

> module Main where
>
> import Data.List (foldl1')
> import Control.Parallel (par, pseq)
> import Control.Arrow ((&&&))
>
> f `parApp` (a, b) = a `par` (b `pseq` (f a b))
> seqApp = uncurry
>
> main = print result
>   where result = (+) `parApp` minMax list
>         minMax = minlist &&& maxlist
>         minlist = foldl1' min
>         maxlist = foldl1' max
>         list = [1..19999999]


The results on a Windows 7 64bits with an Intel Core 2 Duo, compiled
with GHC from Haskell Platform:

c:\tmp\hs>par +RTS -s -N1
par +RTS -s -N1
20000000
     803,186,152 bytes allocated in the heap
     859,916,960 bytes copied during GC
     233,465,740 bytes maximum residency (10 sample(s))
      30,065,860 bytes maximum slop
             483 MB total memory in use (0 MB lost due to fragmentation)

  Generation 0:  1523 collections,     0 parallel,  0.80s,  0.75s elapsed
  Generation 1:    10 collections,     0 parallel,  0.83s,  0.99s elapsed

  Parallel GC work balance: nan (0 / 0, ideal 1)

                        MUT time (elapsed)       GC time  (elapsed)
  Task  0 (worker) :    0.00s    (  0.90s)       0.00s    (  0.06s)
  Task  1 (worker) :    0.00s    (  0.90s)       0.00s    (  0.00s)
  Task  2 (bound)  :    0.86s    (  0.90s)       1.62s    (  1.69s)

  SPARKS: 1 (0 converted, 0 pruned)

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    0.86s  (  0.90s elapsed)
  GC    time    1.62s  (  1.74s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    2.48s  (  2.65s elapsed)

  %GC time      65.4%  (65.9% elapsed)

  Alloc rate    936,110,032 bytes per MUT second

  Productivity  34.6% of total user, 32.4% of total elapsed

gc_alloc_block_sync: 0
whitehole_spin: 0
gen[0].sync_large_objects: 0
gen[1].sync_large_objects: 0


c:\tmp\hs>par +RTS -s -N2
par +RTS -s -N2
20000000
   1,606,279,644 bytes allocated in the heap
          74,924 bytes copied during GC
          28,340 bytes maximum residency (1 sample(s))
          29,004 bytes maximum slop
               2 MB total memory in use (0 MB lost due to fragmentation)

  Generation 0:  1566 collections,  1565 parallel,  0.00s,  0.01s elapsed
  Generation 1:     1 collections,     1 parallel,  0.00s,  0.00s elapsed

  Parallel GC work balance: 1.78 (15495 / 8703, ideal 2)

                        MUT time (elapsed)       GC time  (elapsed)
  Task  0 (worker) :    0.00s    (  0.59s)       0.00s    (  0.00s)
  Task  1 (worker) :    0.58s    (  0.59s)       0.00s    (  0.01s)
  Task  2 (bound)  :    0.58s    (  0.59s)       0.00s    (  0.00s)
  Task  3 (worker) :    0.00s    (  0.59s)       0.00s    (  0.00s)

  SPARKS: 1 (1 converted, 0 pruned)

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    1.15s  (  0.59s elapsed)
  GC    time    0.00s  (  0.01s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    1.15s  (  0.61s elapsed)

  %GC time       0.0%  (2.4% elapsed)

  Alloc rate    1,391,432,695 bytes per MUT second

  Productivity 100.0% of total user, 190.3% of total elapsed

gc_alloc_block_sync: 90
whitehole_spin: 0
gen[0].sync_large_objects: 0
gen[1].sync_large_objects: 0



More information about the Haskell-Cafe mailing list