[Haskell-cafe] Benchmarking and Garbage Collection

Neil Brown nccb2 at kent.ac.uk
Thu Mar 4 17:01:54 EST 2010


Jesper Louis Andersen wrote:
> On Thu, Mar 4, 2010 at 8:35 PM, Neil Brown <nccb2 at kent.ac.uk> wrote:
>   
>> CML is indeed the library that has the most markedly different behaviour.
>>  In Haskell, the CML package manages to produce timings like this for fairly
>> simple benchmarks:
>>
>>  %GC time      96.3%  (96.0% elapsed)
>>
>> I knew from reading the code that CML's implementation would do something
>> like this, although I do wonder if it triggers some pathological case in the
>> GC.
>>     
>
> That result is peculiar. What are you doing to the library, and what
> do you expect happens? Since I have some code invested on top of CML,
> I'd like to gain a little insight if possible.
>   

In trying to simplify my code, the added time has moved from GC time to 
EXIT time (and increased!).  This shift isn't too surprising -- I 
believe the time is really spent trying to kill lots of threads.  Here's 
my very simple benchmark; the main thread repeatedly chooses between 
receiving from two threads that are sending to it:

====
import Control.Concurrent
import Control.Concurrent.CML
import Control.Monad

main :: IO ()
main = do let numChoices = 2
          cs <- replicateM numChoices channel
          mapM_ forkIO [replicateM_ (100000 `div` numChoices) $ sync $ 
transmit c () | c <- cs]
          replicateM_ 100000 $ sync $ choose [receive c (const True) | c 
<- cs]
====

Compiling with -threaded, and running with +RTS -s, I get:

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    2.68s  (  3.56s elapsed)
  GC    time    1.84s  (  1.90s elapsed)
  EXIT  time   89.30s  ( 90.71s elapsed)
  Total time   93.82s  ( 96.15s elapsed)

I think the issue with the CML library is that it spawns a lot of 
threads (search the source for forkIO: 
http://hackage.haskell.org/packages/archive/cml/0.1.3/doc/html/src/Control-Concurrent-CML.html).  
Presumably the Haskell RTS isn't optimised for this approach (maybe the 
ML RTS was, from what you said?), and at the end of the program it 
spends a lot of time reaping the threads.  The problem isn't nearly as 
bad if you don't use choose, though:

====
import Control.Concurrent
import Control.Concurrent.CML
import Control.Monad

main :: IO ()
main = do c <- channel
          forkIO $ replicateM_ 100000 $ sync $ transmit c ()
          replicateM_ 100000 $ sync $ receive c (const True)
====

I get:

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    1.92s  (  2.65s elapsed)
  GC    time    0.92s  (  0.93s elapsed)
  EXIT  time    0.00s  (  0.02s elapsed)
  Total time    2.65s  (  3.59s elapsed)

  %GC time      34.6%  (25.8% elapsed)


Hope that helps,

Neil.


More information about the Haskell-Cafe mailing list