Parallel Haskell: 2-year project to push real world use
Christian Höner zu Siederdissen
choener at tbi.univie.ac.at
Mon May 3 21:25:27 EDT 2010
* Roman Leshchinskiy <rl at cse.unsw.edu.au> [04.05.2010 02:32]:
> On 04/05/2010, at 09:21, Christian Höner zu Siederdissen wrote:
> > Hi,
> > on that topic, consider this (rather trivial) array:
> > a = array (1,10) [ (i,f i) | i <-[1..10]] where
> > f 1 = 1
> > f 2 = 1
> > f i = a!(i-1) + a!(i-2)
> > (aah, school ;)
> > Right now, I am abusing vector in ST by doing this:
> > a <- new
> > a' <- freeze a
> > forM_ [3..10] $ \i -> do
> > write a (a'!(i-1) + a!(i-2))
> > Let's say I wanted to do something like this in dph (or repa), does that
> > work? We are actually using this for RNA folding algorithms that are at
> > least O(n^3) time. For some of the more advanced stuff, it would be
> > really nice if we could "just" parallelize.
> Do you really just need a prefix sum? These are easily parallelisable if the operator is associative. For instance, you could implement the Fibonacci sequence as:
> mapP fst $ scanP (\(a,b) _ -> (a+b,a)) (1,0) $ replicateP n (0,0)
> and DPH would parallelise it. That's how I would write the above with vector as well.
That is, kind of, the fun part: you have
(1) a number of vectors whose values depend on each other (bad!)
(2) in-place update (bad!)
(3) rather trivial calculations for each element (mostly:
sum, minimum, fold, map, backpermute) (good!), we have simple semiring
> > To summarise: I need arrays that allow in-place updates.
> In-place updates + parallelism = bad! That's oversimplifying, of course. But the transformations underlying DPH, for instance, simply don't work in the presence of side effects.
The thing is, you can write the algorithm in a way such that each
operation on index "k" (whatever dimension k has) only requires access
to values <=k and those values will never change again. The problem is
that more than one vector is involved making it less fun to write code
like your fibonacci example.
> > Otherwise, most libraries that do heavy stuff (O(n^3) or worse) are
> > using vector right now. On a single core, it performs really great --
> > even compared to C-code that has been optimized a lot.
> That's great to know! Do you (or anyone else) by any chance have any benchmarks you could share? At the moment, I'm only benchmarking vector with a couple of rather simplistic algorithms which is a bit of a problem.
I can make my libraries available under GPLv3, they just need a bit of
love. This gives you a moderately complex algorithm for which there is,
too, a highly optimized C version (RNAfold -d2, in the vienna rna
I am giving a talk wednesday, after that I'll prepare the libraries for
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Size: 198 bytes
Desc: not available
Url : http://www.haskell.org/pipermail/glasgow-haskell-users/attachments/20100503/4ef60779/attachment.bin
More information about the Glasgow-haskell-users