[Haskell-cafe] Naive matrix multiplication with Accelerate

Mon Dec 3 08:06:05 CET 2012

Hi Clark,

The trick is that most accelerate operations work over multidimensional arrays, so you can still get around the fact that we are limited to flat data-parallelism only.

Here is matrix multiplication in Accelerate, lifted from the first Repa paper [1].

import Data.Array.Accelerate as A

type Matrix a = Array DIM2 a

matMul :: (IsNum e, Elt e) => Acc (Matrix e) -> Acc (Matrix e) -> Acc (Matrix e)
matMul arr brr
  = A.fold (+) 0
  $ A.zipWith (*) arrRepl brrRepl
  where
    Z :. rowsA :. _     = unlift (shape arr)    :: Z :. Exp Int :. Exp Int
    Z :. _     :. colsB = unlift (shape brr)    :: Z :. Exp Int :. Exp Int

    arrRepl             = A.replicate (lift $ Z :. All   :. colsB :. All) arr
    brrRepl             = A.replicate (lift $ Z :. rowsA :. All   :. All) (A.transpose brr)

If you use github sources rather than the hackage package, those intermediate replicates will get fused away.

Cheers,
-Trev

 [1] http://www.cse.unsw.edu.au/~chak/papers/KCLPL10.html

On 03/12/2012, at 5:07 PM, Clark Gaebel <cgaebel at uwaterloo.ca> wrote:

> Hello cafe,
> 
> I've recently started learning about cuda and hetrogenous programming, and have been using accelerate [1] to help me out. Right now, I'm running into trouble in that I can't call parallel code from sequential code. Turns out GPUs aren't exactly like Repa =P.
> 
> Here's what I have so far:
> 
> import qualified Data.Array.Accelerate as A
> import Data.Array.Accelerate ( (:.)(..)
>                              , Acc
>                              , Vector
>                              , Scalar
>                              , Elt
>                              , fold
>                              , slice
>                              , constant
>                              , Array
>                              , Z(..), DIM1, DIM2
>                              , fromList
>                              , All(..)
>                              , generate
>                              , lift, unlift
>                              , shape
>                              )
> import Data.Array.Accelerate.Interpreter ( run )
> 
> dotP :: (Num a, Elt a) => Acc (Vector a) -> Acc (Vector a) -> Acc (Scalar a)
> dotP xs ys = fold (+) 0 $ A.zipWith (*) xs ys
> 
> type Matrix a = Array DIM2 a
> 
> getRow :: Elt a => Int -> Acc (Matrix a) -> Acc (Vector a)
> getRow n mat = slice mat . constant $ Z :. n :. All
> 
> -- Naive matrix multiplication:
> --
> -- index (i, j) is equal to the ith row of 'a' `dot` the jth row of 'b'
> matMul :: A.Acc (Matrix Double) -> A.Acc (Matrix Double) -> A.Acc (Matrix Double)
> matMul a b' = A.generate (constant $ Z :. nrows :. ncols) $
>                 \ix ->
>                   let (Z :. i :. j) = unlift ix
>                    in getRow i a `dotP` getRow j b
>     where
>         b = A.transpose b' -- I assume row indexing is faster than column indexing...
>         (Z :. nrows :.   _  ) = unlift $ shape a
>         (Z :.   _   :. ncols) = unlift $ shape b
> 
> 
> This, of course, gives me errors right now because I'm calling getRow and dotP from within the generation function, which expects Exp[ression]s, not Acc[elerated computation]s.
> 
> So maybe I need to replace that line with an inner for loop? Is there an easy way to do that with Accelerate?
> 
> Thanks for your help,
>   - Clark
> 
> [1] http://hackage.haskell.org/package/accelerate
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/haskell-cafe/attachments/20121203/182de299/attachment-0001.htm>