[Haskell-cafe] Naive matrix multiplication with Accelerate

Clark Gaebel cgaebel at uwaterloo.ca
Mon Dec 3 07:07:49 CET 2012


Hello cafe,

I've recently started learning about cuda and hetrogenous programming, and
have been using accelerate [1] to help me out. Right now, I'm running into
trouble in that I can't call parallel code from sequential code. Turns out
GPUs aren't exactly like Repa =P.

Here's what I have so far:

import qualified Data.Array.Accelerate as A
import Data.Array.Accelerate ( (:.)(..)
                             , Acc
                             , Vector
                             , Scalar
                             , Elt
                             , fold
                             , slice
                             , constant
                             , Array
                             , Z(..), DIM1, DIM2
                             , fromList
                             , All(..)
                             , generate
                             , lift, unlift
                             , shape
                             )
import Data.Array.Accelerate.Interpreter ( run )

dotP :: (Num a, Elt a) => Acc (Vector a) -> Acc (Vector a) -> Acc (Scalar a)
dotP xs ys = fold (+) 0 $ A.zipWith (*) xs ys

type Matrix a = Array DIM2 a

getRow :: Elt a => Int -> Acc (Matrix a) -> Acc (Vector a)
getRow n mat = slice mat . constant $ Z :. n :. All

-- Naive matrix multiplication:
--
-- index (i, j) is equal to the ith row of 'a' `dot` the jth row of 'b'
matMul :: A.Acc (Matrix Double) -> A.Acc (Matrix Double) -> A.Acc (Matrix
Double)
matMul a b' = A.generate (constant $ Z :. nrows :. ncols) $
                \ix ->
                  let (Z :. i :. j) = unlift ix
                   in getRow i a `dotP` getRow j b
    where
        b = A.transpose b' -- I assume row indexing is faster than column
indexing...
        (Z :. nrows :.   _  ) = unlift $ shape a
        (Z :.   _   :. ncols) = unlift $ shape b


This, of course, gives me errors right now because I'm calling getRow and
dotP from within the generation function, which expects Exp[ression]s, not
Acc[elerated computation]s.

So maybe I need to replace that line with an inner for loop? Is there an
easy way to do that with Accelerate?

Thanks for your help,
  - Clark

[1] http://hackage.haskell.org/package/accelerate
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/haskell-cafe/attachments/20121203/302859d3/attachment.htm>


More information about the Haskell-Cafe mailing list