[Haskell-cafe] minimizing jitter while maximizing throughput
Claude Heiland-Allen
claudiusmaximus at goto10.org
Sun Aug 1 02:23:32 EDT 2010
Greeting,
Jitter vs Throughput
====================
Scenario
--------
I have the following scenario:
CPU with [C] cores
concurrent program
the 1 main thread uses OpenGL for animated visual output
[W] worker threads uses FFI to lengthy numerical computations
with the following desires :
(J) minimize jitter : the 1 main thread needs to be responsive
(T) maximize throughput: idle CPU time is a waste of time
The problem is, I want both (J) and (T)!
Benchmarks
----------
Some rough benchmarks from my 'production' implementation[1] where:
jitter = stddev of actual period (target period = 40ms)
idle = C - (real time) / (user+sys time) (cores used elsewhere)
(but for example Xorg will use some time for the OpenGL output, etc)
C W jitter idle
2 0 1.5ms 168% threaded RTS
2 1 1.9ms 65% " "
2 2 9.4ms 46% " "
2 3 13.3ms 37% " "
For comparison, a (very) minimal GLUT program gives:
0.3ms threaded RTS
0.5ms non-threaded RTS
Picking (J) over (T), then setting W=1 when C=2 gives best jitter
Picking (T) over (J), then setting W=2 when C=2 gives best throughput
Question
--------
What is the best way to re-structure the program to have both low jitter
and high throughput?
Options I am considering:
1. worker threads estimate the time needed to complete each job,
and don't start a job if it is likely to break the deadline
(bonus points if just those worker Haskell threads running on
the same OS thread as the main Haskell thread need to pause)
2. change the foreign code to break jobs into smaller pieces,
for example perhaps something like:
worker :: IO (IO Bool) -> IO ()
worker getJob = forever $ do
job <- getJob
whileM_ job yield -- [2]
instead of
worker :: IO (IO ()) -> IO ()
worker = forever . join
3. re-implement the foreign code in Haskell instead of C and hope
that GHC makes the Haskell go as fast as GCC makes the C go
5. wait (for a long time) for a new RTS with:
full pre-emption (including interrupting/resuming foreign code)
user settable thread priorities
(1) is "a fun challenge" (there may be PhDs awarded for less, I imagine)
(2) isn't quite trivial, some scope for tuning subjob chunk size
(3) is boring translation but could lead to interesting benchmarks
even (especially?) if it fails to be as fast as C
Which would you pick?
Links
-----
[1] http://hackage.haskell.org/package/mandulia
[2]
http://hackage.haskell.org/packages/archive/monad-loops/latest/doc/html/Control-Monad-Loops.html#v:whileM_
Thanks for any insight and advice,
Claude
--
http://claudiusmaximus.goto10.org
More information about the Haskell-Cafe
mailing list