DPH, granularity, and GPUs
Vivian McPhail
haskell.vivian.mcphail at gmail.com
Wed Sep 29 05:30:10 EDT 2010
Hello,
DPH seems to build parallel vectors at the level of scalar elements
(doubles, say). Is this a design decision aimed at targeting GPUs? If I
am filtering an hour's worth of multichannel data (an array of (Vector
Double)), then off the top of my head I would expect the best
efficiency on n CPU cores with each core filtering one channel,
rather than anything fancy with processing the vectors themselves in
parallel.
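To make the coarse-grained alternative concrete, here is a minimal sketch of one-channel-per-core filtering using `parMap` from the parallel package. The `filterChannel` body (a simple first-order smoother) is a hypothetical stand-in for whatever sequential filter each core would actually run:

```haskell
import Control.Parallel.Strategies (parMap, rdeepseq)
import qualified Data.Vector.Unboxed as V

-- Hypothetical per-channel filter: a first-order smoother standing in
-- for the real sequential filter each core would run.
filterChannel :: V.Vector Double -> V.Vector Double
filterChannel v = V.imap smooth v
  where
    smooth i x
      | i == 0    = x
      | otherwise = 0.5 * x + 0.5 * (v V.! (i - 1))

-- One spark per channel: coarse-grained parallelism, so with as many
-- channels as cores each core processes one whole channel sequentially.
filterAll :: [V.Vector Double] -> [V.Vector Double]
filterAll = parMap rdeepseq filterChannel
```

This is the granularity the post argues for: the unit of parallel work is a whole channel, not an element.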
I say this because filters (which could be assembled from arrow structures)
feed back across (regions of) a vector. Do GPUs have some sort of optimised
shift operation? In other words, if I have a (constant) matrix A, my
filter, and a data stream x, where x_i(t+1) = x_{i-1}(t), can a GPU compute
Ax in O(length(x))?
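For a time-invariant filter the shift structure does buy you this on any architecture, GPU or not: if each row of A is the previous row shifted (a banded Toeplitz matrix), then Ax collapses to a sliding dot product, costing O(length(x) * k) for k taps rather than a full matrix product. A minimal sketch, with hypothetical taps standing in for the rows of A:

```haskell
import qualified Data.Vector.Unboxed as V

-- Hypothetical filter taps: each row of the constant Toeplitz matrix A
-- is this vector, shifted one position along.
taps :: V.Vector Double
taps = V.fromList [0.25, 0.5, 0.25]

-- Because of the shift structure, y = A x reduces to a sliding dot
-- product over x: O(length x * length taps), linear in the stream for
-- a fixed number of taps.
applyFilter :: V.Vector Double -> V.Vector Double
applyFilter x = V.generate n out
  where
    k     = V.length taps
    n     = V.length x - k + 1
    out i = V.sum (V.zipWith (*) taps (V.slice i k x))
```

Whether a GPU exploits that structure automatically is another question; the point is only that the shift makes a linear-time evaluation possible at all.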
Otherwise, given the cost of moving data to and from the GPU, I would guess
that one sequential algorithm per core (Concurrent Haskell) is faster, and
that there is a granularity barrier.
Cheers,
Vivian
More information about the Glasgow-haskell-users mailing list