DPH, granularity, and GPUs

Wed Sep 29 05:30:10 EDT 2010

Hello,

DPH seems to build parallel vectors at the level of scalar elements
(doubles, say).  Is this a design decision aimed at targeting GPUs?  If I
am filtering an hour's worth of multichannel data (an array of Vector
Double), then off the top of my head I would expect the best efficiency on
n CPU cores to come from each core filtering one channel, rather than from
trying to do anything fancy with processing vectors in parallel.
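
As a minimal sketch of the one-channel-per-core approach I have in mind
(not DPH code): plain lists stand in for Vector Double, a simple FIR filter
stands in for the real filter, and fir and filterChannels are hypothetical
names chosen for illustration.  It uses only the base library.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (evaluate)

-- Hypothetical FIR filter: y[n] = sum_k h[k] * x[n-k], with x[m] = 0 for m < 0.
-- The window holds the most recent samples, newest first.
fir :: [Double] -> [Double] -> [Double]
fir h = go (replicate (length h) 0)
  where
    go _ [] = []
    go window (x : rest) =
      let window' = x : init window          -- shift in the new sample
          y       = sum (zipWith (*) h window')
      in  y : go window' rest

-- Run one sequential filter per channel, each in its own Haskell thread.
-- Results are returned in the same order as the input channels.
filterChannels :: [Double] -> [[Double]] -> IO [[Double]]
filterChannels h channels = do
  vars <- mapM spawn channels
  mapM takeMVar vars
  where
    spawn ch = do
      var <- newEmptyMVar
      _ <- forkIO $ do
        let ys = fir h ch
        _ <- evaluate (sum ys)               -- force every element in the worker
        putMVar var ys
      return var
```

For example, filterChannels [0.5, 0.5] [[1,2,3,4], [10,20,30,40]] filters
two channels with a two-tap average.  The threads only spread across cores
if the program is built with -threaded and run with +RTS -N.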

I say this because filters (which could be assembled from arrow structures)
feed back across (regions of) a vector.  Do GPUs have some sort of shift
operation optimisation?  In other words, if I have a constant matrix A (my
filter) and a data stream x, where x_i(t+1) = x_{i-1}(t), can a GPU perform
Ax in O(length(x))?
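
To spell out the shift structure in matrix form (my own notation, used only
to state the question: S is the down-shift matrix, u(t+1) is the newest
sample entering at position 1, and e_1 is the first unit vector):

```latex
x(t+1) = S\,x(t) + u(t+1)\,e_1, \qquad S_{ij} = \delta_{i,j+1}
\quad\Longrightarrow\quad
A\,x(t+1) = A S\,x(t) + u(t+1)\,A e_1
```

So successive products Ax differ only by a shift plus one new column's
contribution, and the question is whether a GPU can exploit that.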

Otherwise, given the cost of moving data to and from the GPU, I would guess
that one sequential algorithm per core (Concurrent Haskell) is faster, and
that there is a granularity barrier below which finer-grained parallelism
does not pay off.

Cheers,

Vivian