[Haskell-cafe] Waiting for garbage collection can kill parallelism?
Janek S.
fremenzone at poczta.onet.pl
Fri Nov 9 16:52:37 CET 2012
Today I was reading "Parallel Performance Tuning for Haskell" by Jones, Marlow and Singh and
wanted to replicate the results for their first case study. The code goes like this:
module Main where
import Control.Parallel
main :: IO ()
main = print . parSumFibEuler 38 $ 5300
parSumFibEuler :: Int -> Int -> Int
parSumFibEuler a b = f `par` (e `pseq` (e + f))
where f = fib a
e = sumEuler b
fib :: Int -> Int
fib 0 = 0
fib 1 = 1
fib n = fib (n - 1) + fib (n - 2)
mkList :: Int -> [Int]
mkList n = [1..n-1]
relprime :: Int -> Int -> Bool
relprime x y = gcd x y == 1
euler :: Int -> Int
euler n = length (filter (relprime n) (mkList n))
sumEuler :: Int -> Int
sumEuler = sum . (map euler) . mkList
sumFibEuler :: Int -> Int -> Int
sumFibEuler a b = fib a + sumEuler b
This is the version shown on page 3 of the paper, after adding the pseq combinator to enforce
correct evaluation order. I compile and run it with:
ghc -O2 -rtsopts -threaded -eventlog parallel.hs
./parallel +RTS -s -ls -N2
In the paper authors show that this version does in fact perform computation in parallel and that
good speedup is achieved. However, when I run the code what happens is that HEC 1 blocks very
quickly requesting GC. HEC 0 (if I am correct the one calculating sumEuler) does not interrupt
but instead continues the computations until they are finished. Then the GC is performed and the
HEC 1 resumes computation. In this way there is no parallelism, because first HEC 0 does all its
computations and after first GC HEC 1 does its computation.
My question is why this might be happening? I don't expect the results of the paper to be fully
reproducible, because the paper is 3 years old and GHC has developed a lot since then. This
however looks like a regression of some sort. I would appreciate if anyone could explain why
this.
Janek
More information about the Haskell-Cafe
mailing list