[GHC] #7606: Stride scheduling for Haskell threads with priorities

Tue Feb 5 03:06:21 CET 2013

#7606: Stride scheduling for Haskell threads with priorities
---------------------------------+------------------------------------------
    Reporter:  ezyang            |       Owner:  ezyang          
        Type:  feature request   |      Status:  new             
    Priority:  normal            |   Milestone:  7.8.1           
   Component:  Runtime System    |     Version:  7.7             
    Keywords:                    |          Os:  Unknown/Multiple
Architecture:  Unknown/Multiple  |     Failure:  None/Unknown    
  Difficulty:  Unknown           |    Testcase:                  
   Blockedby:                    |    Blocking:                  
     Related:                    |  
---------------------------------+------------------------------------------

Comment(by ezyang):

 OK, here's a new progress report.

 The red-black tree implementation is now behaving very consistently and
 with only 1-3% overhead. (Right now, it uses a malloc'd free list of red-
 black tree nodes; this ended up being quicker than using heap allocated
 stuff.)  Unfortunately, sleeper fairness (with the objective of giving
 sleepy threads a bit of a boost) makes all of the benchmarks perform
 worse; in part due to the fact that sleepy threads increase the overhead
 of managing processes, since they tend to have different priorities.  I
 don't at the moment have a good way of measuring latency change, see #7659
 for a bug related to that.  Additionally, the impact of scheduling is
 fairly hard on some test programs I’ve constructed unless you insert
 explicit yields into your programs. One more worse thing is that the the
 rb-trees are synthetically constructed to get a performance boost when all
 threads have the same priority; when threads have different priorities you
 will immediately see a slowdown because more rb-tree manipulations will
 become necessary.)

 The array heap implementation has lower overhead (0-2%) but is far more
 inconsistent; some benchmarks do much worse. Right now it’s still pretty
 unclear where all of this overhead is coming from.

 The end conclusion is as follows: if red-black trees end up being the
 dominant implementation, then there is still a chance that we can replace
 the current scheduler wholesale, since its runtime characteristics are
 very similar to the current scheduler. (Sleeper fairness will need to
 continue to be a configurable option, assuming we can prove that it
 improves latency.) However, if array heaps end up having much better
 properties on prioritized workloads (test-cases of which I have not
 constructed yet), I find it highly implausible that we’ll be able to
 finesse the implementation to work identically to the current scheduler,
 and we will need to support both. On the plus side, because I have made so
 many implementations I have a good sense refactor the API properly.

 I’ll attach some patches, but the comments still need some cleaning up.

 rbtrees:

 {{{
 --------------------------------------------------------------------------------
         Program           Size    Allocs   Runtime   Elapsed  TotalMem
 --------------------------------------------------------------------------------
     callback001          +0.2%     +0.0%     +2.2%     +1.9%     -2.0%
     callback002          +0.2%     +0.0%     +1.7%     +1.6%     +0.0%
            chan          +0.2%     +0.0%     +2.1%     +2.1%     +0.6%
           sieve          +0.2%     +0.0%     +2.5%     +2.6%     +0.0%
      threads001          +0.2%     +0.0%     +1.7%     +1.5%     +0.0%
      threads003          +0.2%     +0.0%     -8.9%     -8.8%    +12.7%
      threads006          +0.2%     +0.0%     +1.6%     +0.8%     -3.2%
      threads007          +0.2%     -0.0%     +2.9%     +3.0%     -0.0%
 --------------------------------------------------------------------------------
             Min          +0.2%     -0.0%     -8.9%     -8.8%     -3.2%
             Max          +0.2%     +0.0%     +2.9%     +3.0%    +12.7%
  Geometric Mean          +0.2%     +0.0%     +0.7%     +0.5%     +0.9%
 }}}

 heap arrays (using emulated ordering in order to recreate thread
 conditions)

 {{{
 --------------------------------------------------------------------------------
         Program           Size    Allocs   Runtime   Elapsed  TotalMem
 --------------------------------------------------------------------------------
     callback001          +0.1%     +0.0%     +2.1%     +2.0%     -2.0%
     callback002          +0.1%     +0.0%     +1.3%     +1.3%     +0.0%
            chan          +0.1%     +0.0%     +0.5%     +0.4%     +0.0%
           sieve          +0.1%     +0.0%     -0.7%     -0.7%     +0.0%
      threads001          +0.1%     +0.0%     -0.6%     -0.7%     +0.0%
      threads003          +0.1%     +0.0%    +62.0%    +62.4%     -3.2%
      threads006          +0.1%     +0.0%     +6.0%     +5.8%     -3.2%
      threads007          +0.1%     +0.0%     +6.7%     +6.8%     -8.3%
 --------------------------------------------------------------------------------
             Min          +0.1%     +0.0%     -0.7%     -0.7%     -8.3%
             Max          +0.1%     +0.0%    +62.0%    +62.4%     +0.0%
  Geometric Mean          +0.1%     +0.0%     +8.2%     +8.2%     -2.1%
 }}}

-- 
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/7606#comment:27>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler