[Haskell-beginners] missing parallelism

Maurizio Vitale mrz.vtl at gmail.com
Sat Apr 25 14:45:44 UTC 2015


Thanks to all. A combination of $! and evaluating sumEuler for
different arguments in each async finally gave me the two threads.
I'm still a bit worried about analyzing this type of issues in a
larger program, but I love the fact that sumEuler 5000 was evaluated
only once.

On Sat, Apr 25, 2015 at 6:25 AM,  <haskell at erebe.eu> wrote:
> Again, does you program compute something if you remove the print ? no ? So
> you have a stricness/laziness problem.
>
> Let's see together what does mean p :: IO Int
> By its signature p assure you that when evaluate  it will return you an IO
> Int.
> By Its signature IO Int assure you that when evaluated it will return you an
> Int.
> By Its signature Int assure you that when evaluated it will return you an
> Int but this time, a strict one as there is nothing left to evaluate
>
> ok so now, what does mean async p ? Async will force the evaluation of the
> IO but that is all.
> so : async p = async $ return $ sumEuler 5000 :: IO Int
> so : a <- async p = a <- async $ return $ sumEuler 5000 :: IO Int -> Int
> so : a = sumEuler 5000 :: Int
>
> So a promise you an Int (but a lazy one) so where has it been evaluated ?
> Nowhere, the evaluation of a never happens anywhere. So you are missing a
> final step, the last evaluation of a (which does happen after, due to the
> print)
>
> How to solve that ? Force the evaluation of sumEuler 5000 inside the IO
> monad, so it will be evaluated at the same time that the async will evaluate
> the IO. (In order to evaluate something you have to force a data depedency)
>
> p :: IO int
> p = return $! sumEuler 5000
>
> or
> p :: IO int
> p = let x = sumEuler in x `seq` return x
>
> This should solve your problem of everything running on a single
> thread/core, because now you force the evaluation of sumEuler inside p and
> not inside the print.
>
> Now, be aware that as sumEuler is pure and that 5000 is a static value (also
> pure), sumEuler 5000 needs to be computed only once. The program is free to
> optimize away further calls to sumEuler 5000 by caching the value.
> But at least you shoud see this evaluation happening on multiples cores.
>
>
> Regards,
> Romain
>
>
> On 25/04/2015 03:40, Maurizio Vitale wrote:
>
> Not even with this simplified version, I can get two async running on
> separate threads.
> I must be missing something really basic:
>
> {-# LANGUAGE UnicodeSyntax #-}
> {-# LANGUAGE TupleSections #-}
>
> import Control.Concurrent.Async (async, waitBoth)
>
> sumEuler :: Int → Int
> sumEuler = sum . map euler . mkList
>            where
>              mkList n = [1..n-1]
>              euler n = length (filter (relprime n) [1..n-1])
>                where
>                  relprime x y = gcd x y == 1
>
> p ∷ IO Int
> p = return $ sumEuler 5000
>
> main ∷ IO()
> main = do
>   a ← async p
>   b ← async p
>   (av, bv) ← waitBoth a b
>   print (av,bv)
>
> On Fri, Apr 24, 2015 at 7:06 PM, Maurizio Vitale <mrz.vtl at gmail.com> wrote:
>
> You're right, without the show, no work is done.
> But I'm puzzled. I thought waitAny would have caused one task to be
> executed.
> If that doesn't wait for the async to compute a value (and not some
> thunk) I don't see how to use asyncs, so I'm obviously missing
> something.
> How can I force deeper evaluation?
> [in the end p would be a full parser, so whatever it is needed to
> cause the parsing needs to be done outside of it]
>
> On Fri, Apr 24, 2015 at 10:44 AM,  <haskell at erebe.eu> wrote:
>
> On 24/04/2015 14:21, Maurizio Vitale wrote:
>
> G'day,
>    I have a test code that I compile with ghc -threaded -eventlog
> -rtsopts --make parallel.hs and run with ./parallel 2 +RTS -ls -N4 on
> a laptop with 4 physical cores. I would expect activities in two
> threads, but threadscope shows only one active thread.
> Can somebody explain me the program's behaviour?
>
> Thanks a lot
>
>     Maurizio
>
> {-# LANGUAGE UnicodeSyntax #-}
> {-# LANGUAGE TupleSections #-}
>
> import Control.Applicative
> import Control.Concurrent.Async (async, waitAny, Async)
> import Data.List (delete, sortBy)
> import Data.Ord (comparing)
> import System.CPUTime
> import System.Environment
> import GHC.Conc (numCapabilities)
>
> concurrentlyLimited :: Int -> [IO a] -> IO [a]
> concurrentlyLimited n tasks = concurrentlyLimited' n (zip [0..] tasks) []
> []
>
> concurrentlyLimited' ∷ Int               -- ^ number of concurrent
> evaluations
>                       → [(Int, IO b)]     -- ^ still to run (ordered by
> first element)
>                       → [Async (Int,b)]   -- ^ currently running
>                       → [(Int,b)]         -- ^ partial results (ordered
> by first element)
>                       → IO [b]
> concurrentlyLimited' _ [] [] results = return . map snd $ sortBy
> (comparing fst) results
> concurrentlyLimited' 0 todo ongoing results = do
>      (task, newResult) <- waitAny ongoing
>      concurrentlyLimited' 1 todo (delete task ongoing) (newResult:results)
>
> concurrentlyLimited' _ [] ongoing results = concurrentlyLimited' 0 []
> ongoing results
> concurrentlyLimited' n ((i, task):otherTasks) ongoing results = do
>      t <- async $ (i,) <$> task
>      concurrentlyLimited' (n-1) otherTasks (t:ongoing) results
>
> euler :: Int → Int
> euler n = length (filter (relprime n) [1..n-1])
>            where
>              relprime x y = gcd x y == 1
>
> sumEuler :: Int → Int
> sumEuler = sum . (map euler) . mkList
>             where
>               mkList n = [1..n-1]
>
> p ∷ IO Int
> p = return $ sumEuler 3000
>
> numThreads ∷ [String] → IO Int
> numThreads [] = return numCapabilities
> numThreads [cores] = return $ read cores
>
> main ∷ IO()
> main = do
>    threads ← getArgs >>= numThreads
>    putStrLn $ "Running up to " ++ show threads ++ " threads in parallel
> (on " ++ show numCapabilities ++ " cores)"
>    startTime ← getCPUTime
>    f ← concurrentlyLimited threads $ replicate 10 p
>    endTime ← getCPUTime
>    putStrLn $ foldr  ((++) . show ) ""  f
>    putStrLn $ "Evaluation took " ++ show (fromIntegral (endTime -
> startTime) / 1000000000000∷Double)
> _______________________________________________
> Beginners mailing list
> Beginners at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners
>
> Hello,
>
> I don't have an Haskell environnement at hand, but my advice is to search
> for when your Async/eulerSum calls are evaluated.
> For example try to remove this line   -- putStrLn $ foldr  ((++) . show ) ""
> f
> Does your program still compute something ? If no that's because your sum is
> evaluated due to the show and not due to your async.
>
> t <- async $ (i,) <$> task
>
> Your async will try to compute (i,eulerSum) but you never force the
> computation of the eulersum inside the async, so the async take no time and
> return quickly.
> Instead of this type [Async (Int, b)] you should aim for this one [(Int,
> Async b)]
>
> Let me know if that helps you.
>
> Regards,
> Romain
>
> _______________________________________________
> Beginners mailing list
> Beginners at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners
>
> On Fri, Apr 24, 2015 at 1:44 PM,  <haskell at erebe.eu> wrote:
>
> On 24/04/2015 14:21, Maurizio Vitale wrote:
>
> G'day,
>    I have a test code that I compile with ghc -threaded -eventlog
> -rtsopts --make parallel.hs and run with ./parallel 2 +RTS -ls -N4 on
> a laptop with 4 physical cores. I would expect activities in two
> threads, but threadscope shows only one active thread.
> Can somebody explain me the program's behaviour?
>
> Thanks a lot
>
>     Maurizio
>
> {-# LANGUAGE UnicodeSyntax #-}
> {-# LANGUAGE TupleSections #-}
>
> import Control.Applicative
> import Control.Concurrent.Async (async, waitAny, Async)
> import Data.List (delete, sortBy)
> import Data.Ord (comparing)
> import System.CPUTime
> import System.Environment
> import GHC.Conc (numCapabilities)
>
> concurrentlyLimited :: Int -> [IO a] -> IO [a]
> concurrentlyLimited n tasks = concurrentlyLimited' n (zip [0..] tasks) []
> []
>
> concurrentlyLimited' ∷ Int               -- ^ number of concurrent
> evaluations
>                       → [(Int, IO b)]     -- ^ still to run (ordered by
> first element)
>                       → [Async (Int,b)]   -- ^ currently running
>                       → [(Int,b)]         -- ^ partial results (ordered
> by first element)
>                       → IO [b]
> concurrentlyLimited' _ [] [] results = return . map snd $ sortBy
> (comparing fst) results
> concurrentlyLimited' 0 todo ongoing results = do
>      (task, newResult) <- waitAny ongoing
>      concurrentlyLimited' 1 todo (delete task ongoing) (newResult:results)
>
> concurrentlyLimited' _ [] ongoing results = concurrentlyLimited' 0 []
> ongoing results
> concurrentlyLimited' n ((i, task):otherTasks) ongoing results = do
>      t <- async $ (i,) <$> task
>      concurrentlyLimited' (n-1) otherTasks (t:ongoing) results
>
> euler :: Int → Int
> euler n = length (filter (relprime n) [1..n-1])
>            where
>              relprime x y = gcd x y == 1
>
> sumEuler :: Int → Int
> sumEuler = sum . (map euler) . mkList
>             where
>               mkList n = [1..n-1]
>
> p ∷ IO Int
> p = return $ sumEuler 3000
>
> numThreads ∷ [String] → IO Int
> numThreads [] = return numCapabilities
> numThreads [cores] = return $ read cores
>
> main ∷ IO()
> main = do
>    threads ← getArgs >>= numThreads
>    putStrLn $ "Running up to " ++ show threads ++ " threads in parallel
> (on " ++ show numCapabilities ++ " cores)"
>    startTime ← getCPUTime
>    f ← concurrentlyLimited threads $ replicate 10 p
>    endTime ← getCPUTime
>    putStrLn $ foldr  ((++) . show ) ""  f
>    putStrLn $ "Evaluation took " ++ show (fromIntegral (endTime -
> startTime) / 1000000000000∷Double)
> _______________________________________________
> Beginners mailing list
> Beginners at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners
>
> Hello,
>
> I don't have an Haskell environnement at hand, but my advice is to search
> for when your Async/eulerSum calls are evaluated.
> For example try to remove this line   -- putStrLn $ foldr  ((++) . show ) ""
> f
> Does your program still compute something ? If no that's because your sum is
> evaluated due to the show and not due to your async.
>
> t <- async $ (i,) <$> task
>
> Your async will try to compute (i,eulerSum) but you never force the
> computation of the eulersum inside the async, so the async take no time and
> return quickly.
> Instead of this type [Async (Int, b)] you should aim for this one [(Int,
> Async b)]
>
> Let me know if that helps you.
>
> Regards,
> Romain
>
> _______________________________________________
> Beginners mailing list
> Beginners at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners
>
> _______________________________________________
> Beginners mailing list
> Beginners at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners
>
>
>
> _______________________________________________
> Beginners mailing list
> Beginners at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners
>


More information about the Beginners mailing list