[Haskell-cafe] You are in a twisty maze of concurrency libraries, all different ...

Fri Dec 4 07:07:12 EST 2009

Patrick Caldon wrote:
>
> I'm looking for the "right" concurrency library/semantics for what 
> should be a reasonably simple problem.
>
> I have a little simulator:
>
> runWorldSim :: MTGen -> SimState -> IO SimState
>
> it takes about a second to run on a PC. It's functional except it 
> whacks the rng, which needs IO. I run 5-10 of these jobs, and then use:
>
> mergeWorld :: [SimState] -> SimState
>
> to pick the best features of the runs and build another possible world 
> (state).  Then I use this new world to run another 5-10 jobs and so 
> on.  I run this through ~20000 iterations.
>
> It's an obvious place for parallelism.
>
> I'm looking for a concurrency library with something like:
>
> forkSequence :: Int -> [IO a] -> IO [a]
>
> which I could call with something like this:
>
> forkSequence 4 (take 10 (repeat  (runWorldSim g ss)))
>
> this would construct 4 threads, then dispatch the 10 jobs onto the 
> threads, and pack up the
> results into a list I could run through my merger.
Why particularly do you want to run the 10 jobs on 4 threads?  Haskell's 
run-time is quite good at spreading out the lightweight threads onto all 
your cores, so the easiest thing to do is run the 10 jobs on 10 
(light-weight) threads and let the run-time sort out the rest.  So if 
what you want is a function:

runPar :: [IO a] -> IO [a]

you can easily construct this.  Shameless plug: my CHP library 
effectively has this function already, runParallel :: [CHP a] -> CHP [a] 
(CHP being a slight layer on top of IO).  But you can do it just as 
easily with, say, STM.  Here is a version where order doesn't matter 
(apologies for the point-free style):

import Control.Concurrent
import Control.Concurrent.STM
import Control.Monad

modifyTVar :: TVar a -> (a -> a) -> STM ()
modifyTVar tv f = readTVar tv >>= writeTVar tv . f

runPar :: [IO a] -> IO [a]
runPar ps
  = do resVar <- newTVarIO []
       mapM_ (forkIO . (>>= atomically . modifyTVar resVar . (:))) ps
       atomically $ do res <- readTVar resVar
                       when (length res < length ps) retry
                       return res

If order does matter, you can zip the results with an index, and sort by 
the index afterwards.  If efficiency matters, you can perform other 
tweaks.  But the principle is quite straightforward.  Or you can 
refactor your code to take the IO dependency out of your random number 
generation, and run the sets of pure code in parallel using the parallel 
library.  If all you are using IO for is random numbers, that's probably 
the nicest approach.

Thanks,

Neil.

P.S. take 10 . repeat is the same as replicate 10