[Haskell-beginners] missing parallelism

Grzegorz Milka grzegorzmilka at gmail.com
Sat Apr 25 08:25:29 UTC 2015


As far as I can tell there's nothing wrong with your code. My hypothesis
is that Haskell optimizes call to sumEuler 5000 by calling it only once
in one thread. Here's why I think so:

The program I used for debugging this is:

    import Control.Concurrent.Async (async, wait)
    import System.IO.Unsafe (unsafePerformIO)

    sumEuler :: Int -> Int
    sumEuler =  sum . map euler . mkList
               where
                 mkList n = seq (unsafePerformIO (putStrLn
    "Calculating")) [1..n-1]
                 euler n = length (filter (relprime n) [1..n-1])
                   where
                     relprime x y = gcd x y == 1

    p :: Int -> IO Int
    p b = return $! sumEuler b

    p5000 :: IO Int
    p5000 = return $ sumEuler 5000

    main :: IO ()
    main = do
      a <- async $ p5000
      b <- async $ p5000
      av <- wait a
      bv <- wait b
      print (av,bv)

The two main modification are adding a debugging message "Calculating"
when a list [1..(n-1)] is evaluated. The second one is making p into a
function. Notice that it uses strict application ($! - check how it
works with simple $). I will use that function in further examples.

Running this program with "time ./Test 2 +RTS -ls -N2" gives me:

    Calculating
    (7598457,7598457)

    real    0m3.752s
    user    0m3.833s
    sys    0m0.211s

Just to be sure I have almost the same time when doing only one
computation with:

    main :: IO ()
    main = do
      a <- async $ p5000
      av <- wait a
      print av

So it seems like the value returned by p5000 is computed only once. GHC
might be noticing that p5000 will always return the same value and might
try cache it or memoize it. If this hypothesis is right then calling
sumEuler with two different values should run in two different threads.
And indeed it is so:

    main :: IO ()
    main = do
      a <- async $ p 5000
      b <- async $ p 4999
      av <- wait a
      bv <- wait b
      print (av,bv)

Gives:

    Calculating
    Calculating
    (7598457,7593459)

    real    0m3.758s
    user    0m7.414s
    sys    0m0.064s

So it runs in two threads just as expected. The strict application ($!)
here is important. Otherwise it seems that the async thread returns a
thunk and the evaluation happens in print (av, bv) which is evaluated in
a single thread.  Also the fact that p5000 is a top level binding is
important. When I do:

    main :: IO ()
    main = do
      a <- async $ p 5000
      b <- async $ p 5000
      av <- wait a
      bv <- wait b
      print (av,bv)

I get no optimization (GHC 7.6.3).

Best,
Greg
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/beginners/attachments/20150425/7c2293b6/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: OpenPGP digital signature
URL: <http://mail.haskell.org/pipermail/beginners/attachments/20150425/7c2293b6/attachment.sig>


More information about the Beginners mailing list