[Haskell-cafe] Memoizing in parallel
Pierre-Etienne Meunier
pierreetienne.meunier at gmail.com
Sun Nov 8 09:20:52 EST 2009
In fact, I could simply write a fibonacci function like this :
fibonacci n=
fst $ fib n Map.empty
where
fib 0 m=(0,m)
fib 1 m=(1,m)
fib n m=
case Map.lookup n m of
Just x->(x,m)
Nothing->
let (n',m')=memoMap (n-2) m
(n'',m'')=memoMap (n-1) m'
in
(n'+n'',Map.insert n (n'+n'') m'')
And I could do it even with the implementation of maps in Data.IntMap.
I've tried your solution, and looked at your source code; There is a
small performance difference between your solution and this one :
since you rely on laziness, and my structure is really sparse (a
O(\sqrt(n)) is used in the best cases), it performs slightly better
than an array, but this is still too slow : the tree does not
rebalance itself, contrarily to what the trees in Data.IntMap do.
But my question is mainly a question about Haskell's parallelism :
imagine I'm solving the problem of attributing m pairs of skis to n
skiers, while trying to minimize the sum of the differences between
the size of the skiers and the size of their skis. With an array, I'd
write :
import GHC.Arr
import Control.Parallel
n=1000 --skiers
m=1500 --skis
skis=listArray (1,m) [150..]
skiers=listArray (1,n) [139..]
costs=listArray ((1,1), (n,m)) $ map cost $ range ((1,1),(n,m))
cost (i,j)
| j<i = 1/0
| i==1 = abs $ skis!j-skiers!i
| otherwise = let a=costs!(i,j-1)
b=(abs $ skis!j-skiers!i)+costs!(i-1,j-1)
in
(a`par`b)`seq`(min a b)
main=print $ costs!(n,m)
I'd first like to know if the difference in execution time that I can
measure when profiling really comes from thunks in the array being
executed in parallel, or from my imagination (the difference is quite
small with these numbers of skis and skiers).
Now imagine the same problem with at most 1% of the items used. If
your computer does not have infinite memory (or you don't want to
spend 90% allocating an array in GHC's runtime), how do you write
that, with the memoization table shared, of course ?
Pierre-Etienne
El 08-nov-09, a las 14:08, Luke Palmer escribió:
> On Sun, Nov 8, 2009 at 2:51 AM, Pierre-Etienne Meunier
> <pierreetienne.meunier at gmail.com> wrote:
>> Hi,
>>
>> I'm designing an algorithm that uses dynamic programming. I've
>> written it
>> with an array, and it works, but it is still very slow and needs
>> way too
>> much memory.
>>
>> Then I realized that the array was very sparse (at most a
>> O(\sqrt(n)) of its
>> size is actually used). Now I want to rewrite it with a Data.Map,
>> but since
>> I do not know a priori what the keys are, I need a mutable ref
>> somewhere.
>
> I don't know how you drew that conclusion.
>
> In fact, no mutable ref is necessary. Your keys are (or can be mapped
> to) integers, since you used an array. A solution is to use a trie of
> integers. You could, for example, store the values at the nodes of
> an infinite tree that looks like:
>
> 0
> 1 2
> 3 4 5 6
> 7 8 9 10 11 12 13 14
> ...
>
> There are various implementations of this around. For a quick
> solution, though, you can try the data-memocombinators package:
>
> import qualified Data.MemoCombinators as Memo
>
> let f = Memo.integral go
> where
> go = ... f ...
>
> See how that performs. It has asymptotically better space performance
> for sparse usage, but the devil can be in the constant factors.
>
> Luke
More information about the Haskell-Cafe
mailing list