[Haskell-cafe] Re: Parallel combinator, performance advice

Neil Mitchell ndmitchell at gmail.com
Tue Apr 7 10:13:29 EDT 2009


> > The problem I'm trying to solve is running system commands in
> > parallel.

> "system commands" means execution of external commands or just system
> calls inside Haskell?

Calls to System.Cmd.system, i.e. running external console processes.
It's a make system I'm writing, so virtually all the time is spent in
calls to ghc etc.

To Bulat: I should have been clearer with the spec. The idea is that
multiple calls to paralell_ can execute, and a function executing
inside parallel_ can itself call parallel_. For this reason I need one
top-level thread pool, which requires unsafePerformIO. If I create a
thread pool every time, I end up with more threads than I want.

> Your parallel_ does not return until all operations are finished.
>> parallel_ (x:xs) = do
>>     ys <- mapM idempotent xs
>>     mapM_ addParallel ys
>>     sequence_ $ x : reverse ys
> By the way, there is no obvious reason to insert "reverse" there.

There is a reason :-)

Imagine I do parallel_ [a,b,c]

That's roughly doing (if b' is idempotent b):

enqueue b'
enqueue c'

If while executing a the thread pool starts on b', then after I've
finished a, I end up with both threads waiting for b', and nothing
doing c'. If I do a reverse, then the thread pool and me are starting
at different ends, so if we lock then I know it's something important
to me that the thread pool started first. It's still not idea, but it
happens less often.

> What I meant was something like:
>> para [] = return ()
>> para [x] = x
>> para xs = do
>>   q <- newQSemN 0
>>   let wrap x = finally x (signalQSemN q 1)
>>       go [y] n = wrap x >> waitQSemN q (succ n)
>>       go (y:ys) n = addParallel (wrap y) >> go ys $! succ n
>>   go xs 0
> This is nearly identical to your code, and avoid creating the MVar for each
> operation.  I use "finally" to ensure the count is correct, but if a worker
> threads dies then bas things will happen.  You can replace finally with (>>) if
> speed is important.

Consider a thread pool with 2 threads and the call parallel_ [parallel_ [b,c],a]

You get the sequence:
enqueue (parallel_ [b,c])
wait on parallel_ [b,c]

While you are executing a, a thread pool starts:
enqueue b
wait for b

Now you have all the threads waiting, and no one dealing with the
thread pool. This results in deadlock.

I guess the "nested calls to parallel_" bit is the part of the spec
that makes everything much harder!



More information about the Haskell-Cafe mailing list