[Haskell-cafe] testing par with simple program
Paolino
paolo.veronelli at gmail.com
Fri Aug 21 14:17:42 EDT 2009
A better test program:

import Control.Parallel

main = a `par` b `pseq` print (a + b)
  where
    a = ack 3 11
    b = fib 39

ack 0 n = n+1
ack m 0 = ack (m-1) 1
ack m n = ack (m-1) (ack m (n-1))

fib 0 = 0
fib 1 = 1
fib n = fib (n-1) + fib (n-2)

Running it gives these results:
paolino at paolino-casa:~$ ghc --make prova -threaded
[1 of 1] Compiling Main ( prova.hs, prova.o )
Linking prova ...
paolino at paolino-casa:~$ time ./prova +RTS -N1
63262367
real 1m17.485s
user 1m16.473s
sys 0m0.392s
paolino at paolino-casa:~$ time ./prova +RTS -N2
63262367
real 1m20.186s
user 1m31.554s
sys 0m0.600s
paolino at paolino-casa:~$ touch prova.hs
paolino at paolino-casa:~$ ghc --make prova -O2 -threaded
[1 of 1] Compiling Main ( prova.hs, prova.o )
Linking prova ...
paolino at paolino-casa:~$ time ./prova +RTS -N1
63262367
real 0m17.652s
user 0m15.277s
sys 0m0.108s
paolino at paolino-casa:~$ time ./prova +RTS -N2
63262367
real 0m13.650s
user 0m15.121s
sys 0m0.188s
From the resource graph and the timings it is clear that the program cannot
make full use of both cores: computing 'a' alone takes about 7 seconds and
'b' alone about 9, so an ideal split would finish in roughly 9 seconds, yet
the -N2 run takes 13.6.
What is keeping the CPUs from running at full power?
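
To check the per-computation figures, each call can be timed on its own. The
following is only a sketch, not the exact code I measured with: the `timed`
helper is made up for illustration, and the `Int` signatures are an
assumption (the program above has no signatures, so its numbers default to
Integer, which may change the timings).

```haskell
import Control.Exception (evaluate)
import System.CPUTime (getCPUTime)

-- Same definitions as in the test program, with assumed Int signatures.
ack :: Int -> Int -> Int
ack 0 n = n + 1
ack m 0 = ack (m - 1) 1
ack m n = ack (m - 1) (ack m (n - 1))

fib :: Int -> Int
fib 0 = 0
fib 1 = 1
fib n = fib (n - 1) + fib (n - 2)

-- Hypothetical helper: force a value to weak head normal form and
-- print the CPU time spent doing so (getCPUTime is in picoseconds).
timed :: String -> a -> IO a
timed label x = do
  start <- getCPUTime
  y <- evaluate x
  end <- getCPUTime
  let secs = fromIntegral (end - start) / 1e12 :: Double
  putStrLn (label ++ ": " ++ show secs ++ " s")
  return y

main :: IO ()
main = do
  a <- timed "ack 3 11" (ack 3 11)
  b <- timed "fib 39" (fib 39)
  print (a + b)
```

Built with ghc -O2 and run on one core, this prints one line per computation
followed by their sum, so the two costs can be compared with the -N1 and -N2
wall-clock times above.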
paolino
2009/8/21 Don Stewart <dons at galois.com>
> paolo.veronelli:
> > Hi, reading a previous thread I got interested.
> > I simplified the example pointed by dons in
> >
> > import Control.Parallel
> >
> > main = a `par` b `pseq` print (a + b )
> > where
> > a = ack 3 11
> > b = ack 3 11
> >
> > ack 0 n = n+1
> > ack m 0 = ack (m-1) 1
> > ack m n = ack (m-1) (ack m (n-1))
> >
> > compiled with
> > ghc --make prova -O2 -threaded
> >
> > timings
> > paolino at paolino-casa:~$ time ./prova +RTS -N1
> > 32762
> >
> > real 0m7.031s
> > user 0m6.304s
> > sys 0m0.004s
> > paolino at paolino-casa:~$ time ./prova +RTS -N2
> > 32762
> >
> > real 0m6.997s
> > user 0m6.728s
> > sys 0m0.020s
> > paolino at paolino-casa:~$
> >
> > without optimizations it gets worse
> >
> > paolino at paolino-casa:~$ time ./prova +RTS -N1
> > 32762
> >
> > real 1m20.706s
> > user 1m18.197s
> > sys 0m0.104s
> > paolino at paolino-casa:~$ time ./prova +RTS -N2
> > 32762
> >
> > real 1m38.927s
> > user 1m45.039s
> > sys 0m0.536s
> > paolino at paolino-casa:~$
> >
> > staring at the resource usage graph I can see it does use 2 cores when
> > told to do it, but with -N1 the used cpu goes 100% and with -N2 they
> > both run just over 50%
> >
> > thanks for comments
>
>
> Firstly, a and b are identical, so GHC commons them up. The compiler
> transforms it into:
>
> a `par` a `seq` print (a + a)
>
> So you essentially fork a spark to evaluate 'a', and then have the main
> thread also evaluate 'a' again. One of them wins, then you add the
> result to itself. The runtime may choose not to convert your first spark
> into a thread.
>
> Running with a 2009 GHC head snapshot, we can see with +RTS -sstderr
>
> SPARKS: 1 (0 converted, 0 pruned)
>
> So indeed, it doesn't convert your `par` into a real thread.
>
> While, for example, the helloworld on the wiki:
>
> http://haskell.org/haskellwiki/Haskell_in_5_steps
>
> Converts 2 sparks to 2 threads:
>
> SPARKS: 2 (2 converted, 0 pruned)
> ./B +RTS -threaded -N2 -sstderr 2.13s user 0.04s system 137% cpu 1.570 total
>
> -- Don
>
>