[Haskell-beginners] missing parallelism
Grzegorz Milka
grzegorzmilka at gmail.com
Sat Apr 25 08:25:29 UTC 2015
As far as I can tell there's nothing wrong with your code. My hypothesis
is that Haskell optimizes call to sumEuler 5000 by calling it only once
in one thread. Here's why I think so:
The program I used for debugging this is:
import Control.Concurrent.Async (async, wait)
import System.IO.Unsafe (unsafePerformIO)
sumEuler :: Int -> Int
sumEuler = sum . map euler . mkList
where
mkList n = seq (unsafePerformIO (putStrLn
"Calculating")) [1..n-1]
euler n = length (filter (relprime n) [1..n-1])
where
relprime x y = gcd x y == 1
p :: Int -> IO Int
p b = return $! sumEuler b
p5000 :: IO Int
p5000 = return $ sumEuler 5000
main :: IO ()
main = do
a <- async $ p5000
b <- async $ p5000
av <- wait a
bv <- wait b
print (av,bv)
The two main modification are adding a debugging message "Calculating"
when a list [1..(n-1)] is evaluated. The second one is making p into a
function. Notice that it uses strict application ($! - check how it
works with simple $). I will use that function in further examples.
Running this program with "time ./Test 2 +RTS -ls -N2" gives me:
Calculating
(7598457,7598457)
real 0m3.752s
user 0m3.833s
sys 0m0.211s
Just to be sure I have almost the same time when doing only one
computation with:
main :: IO ()
main = do
a <- async $ p5000
av <- wait a
print av
So it seems like the value returned by p5000 is computed only once. GHC
might be noticing that p5000 will always return the same value and might
try cache it or memoize it. If this hypothesis is right then calling
sumEuler with two different values should run in two different threads.
And indeed it is so:
main :: IO ()
main = do
a <- async $ p 5000
b <- async $ p 4999
av <- wait a
bv <- wait b
print (av,bv)
Gives:
Calculating
Calculating
(7598457,7593459)
real 0m3.758s
user 0m7.414s
sys 0m0.064s
So it runs in two threads just as expected. The strict application ($!)
here is important. Otherwise it seems that the async thread returns a
thunk and the evaluation happens in print (av, bv) which is evaluated in
a single thread. Also the fact that p5000 is a top level binding is
important. When I do:
main :: IO ()
main = do
a <- async $ p 5000
b <- async $ p 5000
av <- wait a
bv <- wait b
print (av,bv)
I get no optimization (GHC 7.6.3).
Best,
Greg
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/beginners/attachments/20150425/7c2293b6/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: OpenPGP digital signature
URL: <http://mail.haskell.org/pipermail/beginners/attachments/20150425/7c2293b6/attachment.sig>
More information about the Beginners
mailing list