[Haskell-cafe] Understanding GC time
Yves Parès
yves.pares at gmail.com
Sat Mar 10 21:23:52 CET 2012
I'm afraid without uniqueness or linear typing it would be hard to avoid.
2012/3/10 Thiago Negri <evohunz at gmail.com>
> I see. Thanks for the answers.
>
> Any data structure or source annotation that would prevent that?
>
> For example, if I try the same program to run on a
> [1..9999999999999999] list, I'll get an out of memory error for the
> single-threaded version. Any way to prevent it without declaring two
> different versions of "list"?
>
>
> 2012/3/10 Anthony Cowley <acowley at gmail.com>:
> > From that profiling data, I think you're just seeing a decrease in
> sharing. With one thread, you create the list structure in memory: the
> first fold could consume it in-place, but the second fold is still waiting
> for its turn. The list is built on the heap so the two folds can both
> refer to the same list.
> >
> > With two threads, GHC is being clever and inlining the definition you
> give for list, which is then optimized into two parallel loops. No list on
> the heap means there's not much for the GC to do.
> >
> > Sharing of index lists like this is a common source of problems. In
> particular, nested loops can make it even trickier to prevent sharing as
> there may not be an opportunity for parallel evaluation.
> >
> > Anthony
> >
> > On Mar 10, 2012, at 10:21 AM, Thiago Negri <evohunz at gmail.com> wrote:
> >
> >> Hi all.
> >>
> >> I wrote a very simple program to try out parallel Haskel and check how
> >> it would look like to make use of more than one core in this language.
> >>
> >> When I tried the program with RTS option -N1, total time shows it took
> >> 2.48 seconds to complete and around 65% of that time was taken by GC.
> >>
> >> Then I tried the same program with RTS options -N2 and total time
> >> decreased to 1.15 seconds as I expected a gain here. But what I didn't
> >> expect is the GC time to drop to 0%.
> >>
> >> I guess I'm having trouble to understand the output of the RTS option
> -s.
> >> Can you enlighten me?
> >>
> >>
> >> The source for the testing program:
> >>
> >>> module Main where
> >>>
> >>> import Data.List (foldl1')
> >>> import Control.Parallel (par, pseq)
> >>> import Control.Arrow ((&&&))
> >>>
> >>> f `parApp` (a, b) = a `par` (b `pseq` (f a b))
> >>> seqApp = uncurry
> >>>
> >>> main = print result
> >>> where result = (+) `parApp` minMax list
> >>> minMax = minlist &&& maxlist
> >>> minlist = foldl1' min
> >>> maxlist = foldl1' max
> >>> list = [1..19999999]
> >>
> >>
> >> The results on a Windows 7 64bits with an Intel Core 2 Duo, compiled
> >> with GHC from Haskell Platform:
> >>
> >> c:\tmp\hs>par +RTS -s -N1
> >> par +RTS -s -N1
> >> 20000000
> >> 803,186,152 bytes allocated in the heap
> >> 859,916,960 bytes copied during GC
> >> 233,465,740 bytes maximum residency (10 sample(s))
> >> 30,065,860 bytes maximum slop
> >> 483 MB total memory in use (0 MB lost due to fragmentation)
> >>
> >> Generation 0: 1523 collections, 0 parallel, 0.80s, 0.75s elapsed
> >> Generation 1: 10 collections, 0 parallel, 0.83s, 0.99s elapsed
> >>
> >> Parallel GC work balance: nan (0 / 0, ideal 1)
> >>
> >> MUT time (elapsed) GC time (elapsed)
> >> Task 0 (worker) : 0.00s ( 0.90s) 0.00s ( 0.06s)
> >> Task 1 (worker) : 0.00s ( 0.90s) 0.00s ( 0.00s)
> >> Task 2 (bound) : 0.86s ( 0.90s) 1.62s ( 1.69s)
> >>
> >> SPARKS: 1 (0 converted, 0 pruned)
> >>
> >> INIT time 0.00s ( 0.00s elapsed)
> >> MUT time 0.86s ( 0.90s elapsed)
> >> GC time 1.62s ( 1.74s elapsed)
> >> EXIT time 0.00s ( 0.00s elapsed)
> >> Total time 2.48s ( 2.65s elapsed)
> >>
> >> %GC time 65.4% (65.9% elapsed)
> >>
> >> Alloc rate 936,110,032 bytes per MUT second
> >>
> >> Productivity 34.6% of total user, 32.4% of total elapsed
> >>
> >> gc_alloc_block_sync: 0
> >> whitehole_spin: 0
> >> gen[0].sync_large_objects: 0
> >> gen[1].sync_large_objects: 0
> >>
> >>
> >> c:\tmp\hs>par +RTS -s -N2
> >> par +RTS -s -N2
> >> 20000000
> >> 1,606,279,644 bytes allocated in the heap
> >> 74,924 bytes copied during GC
> >> 28,340 bytes maximum residency (1 sample(s))
> >> 29,004 bytes maximum slop
> >> 2 MB total memory in use (0 MB lost due to fragmentation)
> >>
> >> Generation 0: 1566 collections, 1565 parallel, 0.00s, 0.01s elapsed
> >> Generation 1: 1 collections, 1 parallel, 0.00s, 0.00s elapsed
> >>
> >> Parallel GC work balance: 1.78 (15495 / 8703, ideal 2)
> >>
> >> MUT time (elapsed) GC time (elapsed)
> >> Task 0 (worker) : 0.00s ( 0.59s) 0.00s ( 0.00s)
> >> Task 1 (worker) : 0.58s ( 0.59s) 0.00s ( 0.01s)
> >> Task 2 (bound) : 0.58s ( 0.59s) 0.00s ( 0.00s)
> >> Task 3 (worker) : 0.00s ( 0.59s) 0.00s ( 0.00s)
> >>
> >> SPARKS: 1 (1 converted, 0 pruned)
> >>
> >> INIT time 0.00s ( 0.00s elapsed)
> >> MUT time 1.15s ( 0.59s elapsed)
> >> GC time 0.00s ( 0.01s elapsed)
> >> EXIT time 0.00s ( 0.00s elapsed)
> >> Total time 1.15s ( 0.61s elapsed)
> >>
> >> %GC time 0.0% (2.4% elapsed)
> >>
> >> Alloc rate 1,391,432,695 bytes per MUT second
> >>
> >> Productivity 100.0% of total user, 190.3% of total elapsed
> >>
> >> gc_alloc_block_sync: 90
> >> whitehole_spin: 0
> >> gen[0].sync_large_objects: 0
> >> gen[1].sync_large_objects: 0
> >>
> >> _______________________________________________
> >> Haskell-Cafe mailing list
> >> Haskell-Cafe at haskell.org
> >> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/haskell-cafe/attachments/20120310/6c0aa7aa/attachment.htm>
More information about the Haskell-Cafe
mailing list