[Haskell-cafe] You are in a twisty maze of concurrency libraries, all different ...

Fri Dec 4 07:42:14 EST 2009

On Fri, Dec 4, 2009 at 12:28 PM, Patrick Caldon <patc at pessce.net> wrote:

> Neil Brown wrote:
>
>> Patrick Caldon wrote:
>>
>>>
>>> I'm looking for the "right" concurrency library/semantics for what should
>>> be a reasonably simple problem.
>>>
>>> I have a little simulator:
>>>
>>> runWorldSim :: MTGen -> SimState -> IO SimState
>>>
>>> it takes about a second to run on a PC. It's functional except it whacks
>>> the rng, which needs IO. I run 5-10 of these jobs, and then use:
>>>
>>> mergeWorld :: [SimState] -> SimState
>>>
>>> to pick the best features of the runs and build another possible world
>>> (state).  Then I use this new world to run another 5-10 jobs and so on.  I
>>> run this through ~20000 iterations.
>>>
>>> It's an obvious place for parallelism.
>>>
>>> I'm looking for a concurrency library with something like:
>>>
>>> forkSequence :: Int -> [IO a] -> IO [a]
>>>
>>> which I could call with something like this:
>>>
>>> forkSequence 4 (take 10 (repeat  (runWorldSim g ss)))
>>>
>>> this would construct 4 threads, then dispatch the 10 jobs onto the
>>> threads, and pack up the
>>> results into a list I could run through my merger.
>>>
>> Why particularly do you want to run the 10 jobs on 4 threads?  Haskell's
>> run-time is quite good at spreading out the lightweight threads onto all
>> your cores, so the easiest thing to do is run the 10 jobs on 10
>> (light-weight) threads and let the run-time sort out the rest.
>>
>
> Thanks so much for that! I'll give it a go.
>
> Different threads is just because some of the jobs are memory hogs, and I
> want to minimize the number running simultaneously.  I'll see what happens
> with a runPar-like approach, and use a queue-based approach if it becomes a
> problem.
>
>  So if what you want is a function:
>>
>> runPar :: [IO a] -> IO [a]
>>
>> you can easily construct this.  Shameless plug: my CHP library effectively
>> has this function already, runParallel :: [CHP a] -> CHP [a] (CHP being a
>> slight layer on top of IO).  But you can do it just as easily with, say,
>> STM.  Here is a version where order doesn't matter (apologies for the
>> point-free style):
>>
>> import Control.Concurrent
>> import Control.Concurrent.STM
>> import Control.Monad
>>
>> modifyTVar :: TVar a -> (a -> a) -> STM ()
>> modifyTVar tv f = readTVar tv >>= writeTVar tv . f
>>
>> runPar :: [IO a] -> IO [a]
>> runPar ps
>>  = do resVar <- newTVarIO []
>>      mapM_ (forkIO . (>>= atomically . modifyTVar resVar . (:))) ps
>>      atomically $ do res <- readTVar resVar
>>                      when (length res < length ps) retry
>>                      return res
>>
>> If order does matter, you can zip the results with an index, and sort by
>> the index afterwards.  If efficiency matters, you can perform other tweaks.
>>  But the principle is quite straightforward.  Or you can refactor your code
>> to take the IO dependency out of your random number generation, and run the
>> sets of pure code in parallel using the parallel library.  If all you are
>> using IO for is random numbers, that's probably the nicest approach.
>>
>>  Good, fast random numbers are unfortunately necessary - I had a nice
> implementation using System.Random, but had to rewrite it because
> performance was poor :( .

Have you tried this, pure, library?
http://hackage.haskell.org/package/mersenne-random-pure64

<http://hackage.haskell.org/package/mersenne-random-pure64>

-- 
Sebastian Sylvan
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/haskell-cafe/attachments/20091204/95d41bbc/attachment.html