[Haskell-beginners] missing parallelism

haskell at erebe.eu haskell at erebe.eu
Sat Apr 25 10:25:25 UTC 2015


Again, does you program compute something if you remove the print ? no ? 
So you have a stricness/laziness problem.

Let's see together what does mean p :: IO Int
By its signature *p* assure you that *when evaluate* it will return you 
an *IO Int*.
By Its signature *IO Int* assure you that*when evaluated* it will return 
you an *Int*.
By Its signature *Int* assure you that *when evaluated* it will return 
you an *Int* but this time, a *strict one* as there is *nothing left to 
evaluate*

ok so now, what does mean *async p* ? Async will *force the evaluation* 
of the *IO* but that is all.
so : async p = async $ return $ sumEuler 5000 :: IO Int
so : a <- async p = a <- async $ return $ sumEuler 5000 :: IO Int -> Int
so : a = sumEuler 5000 :: Int

So *a* promise you an Int (but a lazy one) so where has it been 
evaluated ? Nowhere, the evaluation of *a *never happens anywhere. So 
you are missing a final step, the last evaluation of *a* (which does 
happen after, due to the print)

How to solve that ? Force the evaluation of sumEuler 5000 inside the IO 
monad, so it will be evaluated at the same time that the async will 
evaluate the IO. (In order to evaluate something you have to force a 
data depedency)

p :: IO int
p = return $! sumEuler 5000

or
p :: IO int
p = let x = sumEuler in x `seq` return x

This should solve your problem of everything running on a single 
thread/core, because now you force the evaluation of sumEuler inside p 
and not inside the print.

Now, be aware that as sumEuler is pure and that 5000 is a static value 
(also pure), sumEuler 5000 needs to be computed only once. The program 
is free to optimize away further calls to sumEuler 5000 by caching the 
value.
But at least you shoud see this evaluation happening on multiples cores.


Regards,
Romain

On 25/04/2015 03:40, Maurizio Vitale wrote:
> Not even with this simplified version, I can get two async running on
> separate threads.
> I must be missing something really basic:
>
> {-# LANGUAGE UnicodeSyntax #-}
> {-# LANGUAGE TupleSections #-}
>
> import Control.Concurrent.Async (async, waitBoth)
>
> sumEuler :: Int → Int
> sumEuler = sum . map euler . mkList
>             where
>               mkList n = [1..n-1]
>               euler n = length (filter (relprime n) [1..n-1])
>                 where
>                   relprime x y = gcd x y == 1
>
> p ∷ IO Int
> p = return $ sumEuler 5000
>
> main ∷ IO()
> main = do
>    a ← async p
>    b ← async p
>    (av, bv) ← waitBoth a b
>    print (av,bv)
>
> On Fri, Apr 24, 2015 at 7:06 PM, Maurizio Vitale <mrz.vtl at gmail.com> wrote:
>> You're right, without the show, no work is done.
>> But I'm puzzled. I thought waitAny would have caused one task to be executed.
>> If that doesn't wait for the async to compute a value (and not some
>> thunk) I don't see how to use asyncs, so I'm obviously missing
>> something.
>> How can I force deeper evaluation?
>> [in the end p would be a full parser, so whatever it is needed to
>> cause the parsing needs to be done outside of it]
>>
>> On Fri, Apr 24, 2015 at 10:44 AM,  <haskell at erebe.eu> wrote:
>>> On 24/04/2015 14:21, Maurizio Vitale wrote:
>>>> G'day,
>>>>     I have a test code that I compile with ghc -threaded -eventlog
>>>> -rtsopts --make parallel.hs and run with ./parallel 2 +RTS -ls -N4 on
>>>> a laptop with 4 physical cores. I would expect activities in two
>>>> threads, but threadscope shows only one active thread.
>>>> Can somebody explain me the program's behaviour?
>>>>
>>>> Thanks a lot
>>>>
>>>>      Maurizio
>>>>
>>>> {-# LANGUAGE UnicodeSyntax #-}
>>>> {-# LANGUAGE TupleSections #-}
>>>>
>>>> import Control.Applicative
>>>> import Control.Concurrent.Async (async, waitAny, Async)
>>>> import Data.List (delete, sortBy)
>>>> import Data.Ord (comparing)
>>>> import System.CPUTime
>>>> import System.Environment
>>>> import GHC.Conc (numCapabilities)
>>>>
>>>> concurrentlyLimited :: Int -> [IO a] -> IO [a]
>>>> concurrentlyLimited n tasks = concurrentlyLimited' n (zip [0..] tasks) []
>>>> []
>>>>
>>>> concurrentlyLimited' ∷ Int               -- ^ number of concurrent
>>>> evaluations
>>>>                        → [(Int, IO b)]     -- ^ still to run (ordered by
>>>> first element)
>>>>                        → [Async (Int,b)]   -- ^ currently running
>>>>                        → [(Int,b)]         -- ^ partial results (ordered
>>>> by first element)
>>>>                        → IO [b]
>>>> concurrentlyLimited' _ [] [] results = return . map snd $ sortBy
>>>> (comparing fst) results
>>>> concurrentlyLimited' 0 todo ongoing results = do
>>>>       (task, newResult) <- waitAny ongoing
>>>>       concurrentlyLimited' 1 todo (delete task ongoing) (newResult:results)
>>>>
>>>> concurrentlyLimited' _ [] ongoing results = concurrentlyLimited' 0 []
>>>> ongoing results
>>>> concurrentlyLimited' n ((i, task):otherTasks) ongoing results = do
>>>>       t <- async $ (i,) <$> task
>>>>       concurrentlyLimited' (n-1) otherTasks (t:ongoing) results
>>>>
>>>> euler :: Int → Int
>>>> euler n = length (filter (relprime n) [1..n-1])
>>>>             where
>>>>               relprime x y = gcd x y == 1
>>>>
>>>> sumEuler :: Int → Int
>>>> sumEuler = sum . (map euler) . mkList
>>>>              where
>>>>                mkList n = [1..n-1]
>>>>
>>>> p ∷ IO Int
>>>> p = return $ sumEuler 3000
>>>>
>>>> numThreads ∷ [String] → IO Int
>>>> numThreads [] = return numCapabilities
>>>> numThreads [cores] = return $ read cores
>>>>
>>>> main ∷ IO()
>>>> main = do
>>>>     threads ← getArgs >>= numThreads
>>>>     putStrLn $ "Running up to " ++ show threads ++ " threads in parallel
>>>> (on " ++ show numCapabilities ++ " cores)"
>>>>     startTime ← getCPUTime
>>>>     f ← concurrentlyLimited threads $ replicate 10 p
>>>>     endTime ← getCPUTime
>>>>     putStrLn $ foldr  ((++) . show ) ""  f
>>>>     putStrLn $ "Evaluation took " ++ show (fromIntegral (endTime -
>>>> startTime) / 1000000000000∷Double)
>>>> _______________________________________________
>>>> Beginners mailing list
>>>> Beginners at haskell.org
>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners
>>>
>>> Hello,
>>>
>>> I don't have an Haskell environnement at hand, but my advice is to search
>>> for when your Async/eulerSum calls are evaluated.
>>> For example try to remove this line   -- putStrLn $ foldr  ((++) . show ) ""
>>> f
>>> Does your program still compute something ? If no that's because your sum is
>>> evaluated due to the show and not due to your async.
>>>
>>> t <- async $ (i,) <$> task
>>>
>>> Your async will try to compute (i,eulerSum) but you never force the
>>> computation of the eulersum inside the async, so the async take no time and
>>> return quickly.
>>> Instead of this type [Async (Int, b)] you should aim for this one [(Int,
>>> Async b)]
>>>
>>> Let me know if that helps you.
>>>
>>> Regards,
>>> Romain
>>>
>>> _______________________________________________
>>> Beginners mailing list
>>> Beginners at haskell.org
>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners
>>
>> On Fri, Apr 24, 2015 at 1:44 PM,  <haskell at erebe.eu> wrote:
>>> On 24/04/2015 14:21, Maurizio Vitale wrote:
>>>> G'day,
>>>>     I have a test code that I compile with ghc -threaded -eventlog
>>>> -rtsopts --make parallel.hs and run with ./parallel 2 +RTS -ls -N4 on
>>>> a laptop with 4 physical cores. I would expect activities in two
>>>> threads, but threadscope shows only one active thread.
>>>> Can somebody explain me the program's behaviour?
>>>>
>>>> Thanks a lot
>>>>
>>>>      Maurizio
>>>>
>>>> {-# LANGUAGE UnicodeSyntax #-}
>>>> {-# LANGUAGE TupleSections #-}
>>>>
>>>> import Control.Applicative
>>>> import Control.Concurrent.Async (async, waitAny, Async)
>>>> import Data.List (delete, sortBy)
>>>> import Data.Ord (comparing)
>>>> import System.CPUTime
>>>> import System.Environment
>>>> import GHC.Conc (numCapabilities)
>>>>
>>>> concurrentlyLimited :: Int -> [IO a] -> IO [a]
>>>> concurrentlyLimited n tasks = concurrentlyLimited' n (zip [0..] tasks) []
>>>> []
>>>>
>>>> concurrentlyLimited' ∷ Int               -- ^ number of concurrent
>>>> evaluations
>>>>                        → [(Int, IO b)]     -- ^ still to run (ordered by
>>>> first element)
>>>>                        → [Async (Int,b)]   -- ^ currently running
>>>>                        → [(Int,b)]         -- ^ partial results (ordered
>>>> by first element)
>>>>                        → IO [b]
>>>> concurrentlyLimited' _ [] [] results = return . map snd $ sortBy
>>>> (comparing fst) results
>>>> concurrentlyLimited' 0 todo ongoing results = do
>>>>       (task, newResult) <- waitAny ongoing
>>>>       concurrentlyLimited' 1 todo (delete task ongoing) (newResult:results)
>>>>
>>>> concurrentlyLimited' _ [] ongoing results = concurrentlyLimited' 0 []
>>>> ongoing results
>>>> concurrentlyLimited' n ((i, task):otherTasks) ongoing results = do
>>>>       t <- async $ (i,) <$> task
>>>>       concurrentlyLimited' (n-1) otherTasks (t:ongoing) results
>>>>
>>>> euler :: Int → Int
>>>> euler n = length (filter (relprime n) [1..n-1])
>>>>             where
>>>>               relprime x y = gcd x y == 1
>>>>
>>>> sumEuler :: Int → Int
>>>> sumEuler = sum . (map euler) . mkList
>>>>              where
>>>>                mkList n = [1..n-1]
>>>>
>>>> p ∷ IO Int
>>>> p = return $ sumEuler 3000
>>>>
>>>> numThreads ∷ [String] → IO Int
>>>> numThreads [] = return numCapabilities
>>>> numThreads [cores] = return $ read cores
>>>>
>>>> main ∷ IO()
>>>> main = do
>>>>     threads ← getArgs >>= numThreads
>>>>     putStrLn $ "Running up to " ++ show threads ++ " threads in parallel
>>>> (on " ++ show numCapabilities ++ " cores)"
>>>>     startTime ← getCPUTime
>>>>     f ← concurrentlyLimited threads $ replicate 10 p
>>>>     endTime ← getCPUTime
>>>>     putStrLn $ foldr  ((++) . show ) ""  f
>>>>     putStrLn $ "Evaluation took " ++ show (fromIntegral (endTime -
>>>> startTime) / 1000000000000∷Double)
>>>> _______________________________________________
>>>> Beginners mailing list
>>>> Beginners at haskell.org
>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners
>>>
>>> Hello,
>>>
>>> I don't have an Haskell environnement at hand, but my advice is to search
>>> for when your Async/eulerSum calls are evaluated.
>>> For example try to remove this line   -- putStrLn $ foldr  ((++) . show ) ""
>>> f
>>> Does your program still compute something ? If no that's because your sum is
>>> evaluated due to the show and not due to your async.
>>>
>>> t <- async $ (i,) <$> task
>>>
>>> Your async will try to compute (i,eulerSum) but you never force the
>>> computation of the eulersum inside the async, so the async take no time and
>>> return quickly.
>>> Instead of this type [Async (Int, b)] you should aim for this one [(Int,
>>> Async b)]
>>>
>>> Let me know if that helps you.
>>>
>>> Regards,
>>> Romain
>>>
>>> _______________________________________________
>>> Beginners mailing list
>>> Beginners at haskell.org
>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners
> _______________________________________________
> Beginners mailing list
> Beginners at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/beginners/attachments/20150425/ef526631/attachment-0001.html>


More information about the Beginners mailing list