Advice wanted on parallel processing

Colin Paul Adams colin at colina.demon.co.uk
Thu Mar 19 02:46:06 EDT 2009


>>>>> "Brandon" == Brandon S Allbery KF8NH <allbery at ece.cmu.edu> writes:

    Brandon> On 2009 Mar 18, at 16:59, Colin Paul Adams wrote:
    >>>>>>> "Brandon" == Brandon S Allbery KF8NH <allbery at ece.cmu.edu>
    >>>>>>> writes:
    >> 
    >>>> The array has 12 elements.
    >> 
    Brandon> How many times do you call it?  Perhaps the real
    Brandon> optimization you need is to memoize.
    >> 
    >> Very many times indeed. But it is a different array on most
    >> occasions (I am not updating in place).
    >> 
    >> It represents one rank of the board on which the game is
    >> played. The move generator produces a new board for each move.


    Brandon> It might be helpful for you to show more of the program.

I'm not sure which bits to show that might be revealing.
The whole thing can be seen by 

darcs pull http://code.haskell.org/srv/code/chu-shogi/

if you are particularly interested. Board.hs is where rank_value is to
be found. You would also need

darcs pull http://code.haskell.org/srv/code/game-tree/

    Brandon> One thing to keep in mind about parallelization is that
    Brandon> the level at which happens matters: if individual
    Brandon> calculations across the list are fast, all you gain by

Profiling shows that nearly all the time is spent in evaluating the
position.
I'm doing that (at the moment) by evaluating each square in isolation
(that might change in the future), then summing each rank, and then
summing the ranks.

    Brandon> parallelizing it is MP synchronization/locking overhead.
    Brandon> On the other hand, it's possible that if the caller can
    Brandon> be rearranged so that rank_value is computed on another
    Brandon> CPU while it's doing something else, you could gain quite
    Brandon> a bit.  Or maybe the caller is itself at the right level
    Brandon> to parallelize.

So at first glance it looked as if the calculation for each rank (and
indeed each cell) should be able to proceed completly independently of
each other.

But the profile shows no time allocated to the children of rank_value
(although cell_value, which is called by it, uses a lot of time). So
my intution seems defeated. perhaps by code-rewriting, although I
can't imagine how.
-- 
Colin Adams
Preston Lancashire


More information about the Glasgow-haskell-users mailing list