[Haskell-cafe] multithreading speedup
Stefan O'Rear
stefanor at cox.net
Fri Apr 13 19:41:46 EDT 2007
On Sat, Apr 14, 2007 at 01:31:58AM +0200, Fawzi Mohamed wrote:
>
> Il giorno Apr 14, 2007, alle ore 12:33 AM, Stefan O'Rear ha scritto:
>
> >On Sat, Apr 14, 2007 at 12:27:10AM +0200, Fawzi Mohamed wrote:
> >>I was trying to speed up a program that I wrote and so I thought
> >>about using multiple threads.
> >>I have a quite easy parallel program and I did the following
> >>
> >>do
> >> subRes <- MVar.newMVar []
> >> putStrLn "starting threads"
> >> subV <- flip mapM [0 .. (nThreads - 1)] $
> >> ( \i -> do
> >> subR <- MVar.newEmptyMVar
> >> let writeRes r = do { MVar.putMVar subR r }
> >> forkOS (do
Getting rid of that forkOS will help a LOT - forkOS is very expensive
and gives no benefit. It exists only for the benefit of FFI libs like
OpenGL that use thread local state.
Plain forkIO uses a thread pool in the runtime; it is even more
parallel than forkOS, and extremely cheap.
The fact that you are using forkOS is a dead giveaway of a
documentation bug somewhere. Please report it, so nobody else makes
this mistake!
> >> let r=eval (startData !! i)
> >> writeRes $! r
> >> putStr $ "writtenRes")
> >> return subR
> >> )
> >> putStrLn "started threads"
> >> toFold <- mapM MVar.takeMVar subV
> >> putStrLn "about to fold"
> >> return $ foldl1 mergeRes toFold
> >>
> >>I know that the threads really calculate what I want, and as soon as
> >>they are finished I get the result.
> >>The problem is that I have no speed up, the time is basically the sum
> >>of the time for the two threads.
> >>I thought that ghc now would take advantage of the two cpus if I
> >>compiled with -threaded.
> >>Am I wrong, do I need some special flag, a newer version of the
> >>compiler (I have 6.6.20070129), or it is just normal?
> >
> >./MyProgram +RTS -N2
> >
> >where N is your CPU count.
>
> thanks, that was it.
>
> >(that said, DO NOT USE THREADS IF AT ALL POSSIBLE, they are ugly and
> >cause heisenbugs, if you want paralelism `par` from Control.Parallel
> >is to be preferred if at all possible since it is deterministic)
>
> in theory yes, but I am quite used to programm with threads and even
> mpi.
> I have looked at Control.Parallel (and the nice article on it), but
> there is no easy way tell (at least that I saw) to leave the whole
> calculation to a sub thread.
> Actually my code should be equivalent to parMap rwhnf, but I get the
> following results:
>
> parMap
> 3.63user 0.02system 0:01.97elapsed 185%CPU (0avgtext+0avgdata
> 0maxresident)k
> 0inputs+0outputs (0major+1039minor)pagefaults 0swaps
>
> threads
> 3.14user 0.02system 0:01.68elapsed 187%CPU (0avgtext+0avgdata
> 0maxresident)k
> 0inputs+0outputs (0major+1041minor)pagefaults 0swaps
Those look equally parallel to me :)
> I suppose that it is because I have a thread for each element in the
> list plus a main thread vs just one thread per element in the list,
> but I am not sure, if someone has some ideas...
>
> With threads (now the I managed to have some speedup) I can use a
> workers/queue approach and have a better load balancing.
Threads are supposed to be cheap. If implementing a thread pool by
hand ever gives a speedup, it's a bug.
> I had looked at strategies, but I need first a breath first traversal
> of a graph generated on the fly (to be parallel) and then a depth
> first traversal (to avoid space leaks), and I found no easy way to do
> it with strategies, so I did it by hand.
>
> by the way is there a way to know how many processors are available
> to the program (to make the strategy or thread control depend on it)?
Stefan
More information about the Haskell-Cafe
mailing list