[Haskell-cafe] Naive matrix multiplication with Accelerate

Wed Dec 5 01:08:47 CET 2012

I don't mean to be blunt, but have you guys taken a course in linear
algebra?

On Mon, Dec 3, 2012 at 9:21 PM, Trevor L. McDonell <
tmcdonell at cse.unsw.edu.au> wrote:

> As far as I am aware, the only description is in the Repa paper. I you are
> right, it really should be explained properly somewhere…
>
> At a simpler example, here is the outer product of two vectors [1].
>
> vvProd :: (IsNum e, Elt e) => Acc (Vector e) -> Acc (Vector e) -> Acc
> (Matrix e)
> vvProd xs ys = A.zipWith (*) xsRepl ysRepl
>   where
>     n           = A.size xs
>     m           = A.size ys
>
>     xsRepl      = A.replicate (lift (Z :. All :. m  )) xs
>     ysRepl      = A.replicate (lift (Z :. n   :. All)) ys
>
> If we then `A.fold (+) 0` the matrix, it would reduce along each row
> producing a vector. So the first element of that vector is going to be
> calculated as (xs[0] * ys[0] + xs[0] * ys[1] +  … xs[0] * ys[m-1]). That's
> the idea we want for our matrix multiplication … but I agree, it is
> difficult for me to visualise as well.
>
> I do the same sort of trick with the n-body demo to get all n^2 particle
> interactions.
>
> -Trev
>
>
>  [1]: http://en.wikipedia.org/wiki/Outer_product#Vector_multiplication
>
>
>
> On 04/12/2012, at 3:41 AM, Clark Gaebel <cgaebel at uwaterloo.ca> wrote:
>
> Ah. I see now. Silly Haskell making inefficient algorithms hard to write
> and efficient ones easy. It's actually kind of annoying when learning, but
> probably for the best.
>
> Is there a good write-up of the algorithm you're using somewhere? The Repa
> paper was very brief in its explaination, and I'm having trouble
> visualizing the mapping of the 2D matricies into 3 dimensions.
>
>   - Clark
>
>
> On Mon, Dec 3, 2012 at 2:06 AM, Trevor L. McDonell <
> tmcdonell at cse.unsw.edu.au> wrote:
>
>> Hi Clark,
>>
>> The trick is that most accelerate operations work over multidimensional
>> arrays, so you can still get around the fact that we are limited to flat
>> data-parallelism only.
>>
>> Here is matrix multiplication in Accelerate, lifted from the first Repa
>> paper [1].
>>
>>
>> import Data.Array.Accelerate as A
>>
>> type Matrix a = Array DIM2 a
>>
>> matMul :: (IsNum e, Elt e) => Acc (Matrix e) -> Acc (Matrix e) -> Acc
>> (Matrix e)
>> matMul arr brr
>>   = A.fold (+) 0
>>   $ A.zipWith (*) arrRepl brrRepl
>>   where
>>     Z :. rowsA :. _     = unlift (shape arr)    :: Z :. Exp Int :. Exp Int
>>     Z :. _     :. colsB = unlift (shape brr)    :: Z :. Exp Int :. Exp Int
>>
>>     arrRepl             = A.replicate (lift $ Z :. All   :. colsB :. All)
>> arr
>>     brrRepl             = A.replicate (lift $ Z :. rowsA :. All   :. All)
>> (A.transpose brr)
>>
>>
>> If you use github sources rather than the hackage package, those
>> intermediate replicates will get fused away.
>>
>>
>> Cheers,
>> -Trev
>>
>>  [1] http://www.cse.unsw.edu.au/~chak/papers/KCLPL10.html
>>
>>
>>
>>
>> On 03/12/2012, at 5:07 PM, Clark Gaebel <cgaebel at uwaterloo.ca> wrote:
>>
>> Hello cafe,
>>
>> I've recently started learning about cuda and hetrogenous programming,
>> and have been using accelerate [1] to help me out. Right now, I'm running
>> into trouble in that I can't call parallel code from sequential code. Turns
>> out GPUs aren't exactly like Repa =P.
>>
>> Here's what I have so far:
>>
>> import qualified Data.Array.Accelerate as A
>> import Data.Array.Accelerate ( (:.)(..)
>>                              , Acc
>>                              , Vector
>>                              , Scalar
>>                              , Elt
>>                              , fold
>>                              , slice
>>                              , constant
>>                              , Array
>>                              , Z(..), DIM1, DIM2
>>                              , fromList
>>                              , All(..)
>>                              , generate
>>                              , lift, unlift
>>                              , shape
>>                              )
>> import Data.Array.Accelerate.Interpreter ( run )
>>
>> dotP :: (Num a, Elt a) => Acc (Vector a) -> Acc (Vector a) -> Acc (Scalar
>> a)
>> dotP xs ys = fold (+) 0 $ A.zipWith (*) xs ys
>>
>> type Matrix a = Array DIM2 a
>>
>> getRow :: Elt a => Int -> Acc (Matrix a) -> Acc (Vector a)
>> getRow n mat = slice mat . constant $ Z :. n :. All
>>
>> -- Naive matrix multiplication:
>> --
>> -- index (i, j) is equal to the ith row of 'a' `dot` the jth row of 'b'
>> matMul :: A.Acc (Matrix Double) -> A.Acc (Matrix Double) -> A.Acc (Matrix
>> Double)
>> matMul a b' = A.generate (constant $ Z :. nrows :. ncols) $
>>                 \ix ->
>>                   let (Z :. i :. j) = unlift ix
>>                    in getRow i a `dotP` getRow j b
>>     where
>>         b = A.transpose b' -- I assume row indexing is faster than column
>> indexing...
>>         (Z :. nrows :.   _  ) = unlift $ shape a
>>         (Z :.   _   :. ncols) = unlift $ shape b
>>
>>
>> This, of course, gives me errors right now because I'm calling getRow and
>> dotP from within the generation function, which expects Exp[ression]s, not
>> Acc[elerated computation]s.
>>
>> So maybe I need to replace that line with an inner for loop? Is there an
>> easy way to do that with Accelerate?
>>
>> Thanks for your help,
>>   - Clark
>>
>> [1] http://hackage.haskell.org/package/accelerate
>> _______________________________________________
>> Haskell-Cafe mailing list
>> Haskell-Cafe at haskell.org
>> http://www.haskell.org/mailman/listinfo/haskell-cafe
>>
>>
>>
>> _______________________________________________
>> Haskell-Cafe mailing list
>> Haskell-Cafe at haskell.org
>> http://www.haskell.org/mailman/listinfo/haskell-cafe
>>
>>
>
>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/haskell-cafe/attachments/20121204/1361a12b/attachment.htm>