[Haskell-cafe] Re: Parallelism and expensive calculations that may
not be needed
Simon Marlow
simonmarhaskell at gmail.com
Fri Jun 8 05:21:41 EDT 2007
Felipe Almeida Lessa wrote:
> (Sorry if this is a newbie question, couldn't find the answer anywhere)
>
> Suppose I have an expensive function (such that even to be reduced to WHNF
> it takes a long processing time)
>
>> expensive :: Foo -> Maybe Bar
>
> and I want to calculate it on multiple processors, like in
>
>> calculate = parMap rnf expensive
>
> Well, fine. But suppose that then I give the result to a function like
>
>> final xs = reverse $ final' xs [] where
>> final' [] r = r
>> final' (x:xs) r = case x of Nothing -> []
>> Just x -> final' xs (x:r)
>
> Given list :: [Foo], a sequential calculation as in
>
>> resultS = final $ map expensive list
>
> would not call expensive after finding a Nothing. But
>
>> resultP = final $ calculate list
>
> will keep calculating all the other expensive's because they were sparked?
Right. This is an example of speculation: you don't know in advance how much
work you really have to do, so any work you do in parallel may or may not be
necessary.
If we fire off the whole list in parallel, we might end up doing a lot of work
on the elements at the end of the list, which are less likely to be required
than the elements at the front of the list. That's a bad strategy: we need to
prioritise the elements near the front. Fortunately GHC's built-in strategy for
picking sparks to evaluate picks the oldest ones first, so this shouldn't be a
problem.
The problem occurs when you've found the Nothing, but the rest of the list has
already been sparked. You really want to throw away all those sparks, but
there's no way to do that currently. One way you could improve the situation
though is to only spark the next N elements in the list, limiting the amount of
speculation.
Hope this helps...
Cheers,
Simon
More information about the Haskell-Cafe
mailing list