[Haskell-cafe] Re: Parallel combinator, performance advice
Neil Mitchell
ndmitchell at gmail.com
Tue Apr 7 10:13:29 EDT 2009
Hi
> > The problem I'm trying to solve is running system commands in
> > parallel.
> "system commands" means execution of external commands or just system
> calls inside Haskell?
Calls to System.Cmd.system, i.e. running external console processes.
It's a make system I'm writing, so virtually all the time is spent in
calls to ghc etc.
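Roughly, each unit of work is just a call to system - something like
this (the file name and flags here are only for illustration):

import System.Cmd (system)
import System.Exit (ExitCode)

-- One unit of work: shell out to ghc and hand back its exit code
compileModule :: FilePath -> IO ExitCode
compileModule f = system ("ghc -c " ++ f)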
To Bulat: I should have been clearer with the spec. The idea is that
multiple calls to parallel_ can execute, and a function executing
inside parallel_ can itself call parallel_. For this reason I need one
top-level thread pool, which requires unsafePerformIO. If I create a
thread pool every time, I end up with more threads than I want.
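As a rough sketch of what I mean by one top-level pool (not my real
code - queue and startPool are made-up names; addParallel is the
enqueue operation used in the code quoted below):

import Control.Concurrent (Chan, newChan, readChan, writeChan, forkIO)
import Control.Monad (forever, join, replicateM_)
import System.IO.Unsafe (unsafePerformIO)

-- One global queue of pending jobs, shared by every call to parallel_.
-- NOINLINE so there really is only one queue.
{-# NOINLINE queue #-}
queue :: Chan (IO ())
queue = unsafePerformIO newChan

-- Enqueue a job for the pool
addParallel :: IO () -> IO ()
addParallel = writeChan queue

-- Spawn n worker threads, each repeatedly pulling a job and running it
startPool :: Int -> IO ()
startPool n = replicateM_ n $ forkIO $ forever $ join (readChan queue)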
> Your parallel_ does not return until all operations are finished.
>
>> parallel_ (x:xs) = do
>>     ys <- mapM idempotent xs
>>     mapM_ addParallel ys
>>     sequence_ $ x : reverse ys
>
> By the way, there is no obvious reason to insert "reverse" there.
There is a reason :-)
Imagine I do parallel_ [a,b,c]
That's roughly doing (where b' = idempotent b, and c' = idempotent c):
enqueue b'
enqueue c'
a
b'
c'
If while executing a the thread pool starts on b', then after I've
finished a, I end up blocked on b' (which the pool is already running),
and nothing doing c'. If I do a reverse, then the thread pool and me
are starting at different ends, so if we do collide, I know the thread
pool started first on something I actually need. It's still not ideal,
but it happens less often.
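For reference, idempotent is roughly this shape (a sketch, not my exact
code): the first caller runs the action, anyone else blocks until that
run has finished.

import Control.Concurrent.MVar
    (newMVar, newEmptyMVar, modifyMVar, putMVar, readMVar)

-- Wrap an action so it runs at most once; later callers wait for the
-- first run to complete
idempotent :: IO () -> IO (IO ())
idempotent act = do
    claimed <- newMVar False   -- has anyone started the action yet?
    done    <- newEmptyMVar    -- filled when the action has finished
    return $ do
        mine <- modifyMVar claimed $ \c -> return (True, not c)
        if mine
            then act >> putMVar done ()
            else readMVar done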
> What I meant was something like:
>>
>> para [] = return ()
>> para [x] = x
>> para xs = do
>>   q <- newQSemN 0
>>   let wrap x = finally x (signalQSemN q 1)
>>       go [y] n = wrap y >> waitQSemN q (succ n)
>>       go (y:ys) n = addParallel (wrap y) >> (go ys $! succ n)
>>   go xs 0
>
> This is nearly identical to your code, and avoids creating the MVar for each
> operation. I use "finally" to ensure the count is correct, but if a worker
> thread dies then bad things will happen. You can replace finally with (>>) if
> speed is important.
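For anyone trying that at home, here is the sketch made self-contained,
with the imports it needs and addParallel stubbed out as a bare forkIO
(which is not what a real bounded pool would do):

import Control.Concurrent (forkIO)
import Control.Concurrent.QSemN (newQSemN, signalQSemN, waitQSemN)
import Control.Exception (finally)

-- Stand-in for the pool's enqueue; a real pool reuses worker threads
addParallel :: IO () -> IO ()
addParallel act = forkIO act >> return ()

para :: [IO ()] -> IO ()
para []  = return ()
para [x] = x
para xs  = do
    q <- newQSemN 0
    let wrap x = finally x (signalQSemN q 1)
        go [y]    n = wrap y >> waitQSemN q (succ n)
        go (y:ys) n = addParallel (wrap y) >> (go ys $! succ n)
        go []     _ = return ()
    go xs 0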
Consider a thread pool with 2 threads and the call parallel_ [parallel_ [b,c],a]
You get the sequence:
enqueue (parallel_ [b,c])
a
wait on parallel_ [b,c]
While you are executing a, a thread from the pool starts:
enqueue b
c
wait for b
Now you have all the threads waiting, and no one left to run jobs from
the thread pool's queue. This results in deadlock.
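To spell out that call shape as code (a, b and c stand for build
actions; this is just the example above written down):

-- The problematic call shape:
--   caller: enqueue (parallel_ [b,c]); run a; wait on parallel_ [b,c]
--   pool thread: enqueue b; run c; wait for b
--   with a 2-thread pool everyone ends up waiting and b never runs
nested :: IO () -> IO () -> IO () -> IO ()
nested a b c = parallel_ [parallel_ [b, c], a]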
I guess the "nested calls to parallel_" bit is the part of the spec
that makes everything much harder!
Thanks
Neil