[Haskell-cafe] Performance with do notation, mwc-random and unboxed vector
Dmitry Dzhus
dima at dzhus.org
Mon Jun 11 13:25:43 CEST 2012
11.06.2012, 14:17, "Malcolm Wallace" <malcolm.wallace at me.com>:
> that there are no side-effects
There are — PRNG state is updated for RealWorld, that's why monadic replicateM is used.
You can add something like
print $ (VU.!) e 500000
after e is bound and still get 0.057 sec with do-less version.
This quite matches the performance claimed by mwc-random package
and seems reasonable since modern hardware shouldn't have any problem
with generating twenty million random variates in a second with one execution thread.
Your note on laziness would be correct in case like
------ 8< ------
import qualified Data.Vector.Unboxed as VU
import Data.Functor
import System.Random.MWC
import System.Random.MWC.Distributions (standard)
count = 100000000
main = do
g <- create
e <- return $ VU.replicate count (212.8506 :: Double)
return ()
------ >8 -------
Where unused `e` is truly left unevaluated (you could force it
by matching with `!e` for example).
Profiling indicates that random number sampling really occurs for
both of original versions with `replicateM`, expectedly taking most of time:
Mon Jun 11 14:24 2012 Time and Allocation Profiling Report (Final)
slow-mwc-vector +RTS -p -RTS
total time = 5.45 secs (5453 ticks @ 1000 us, 1 processor)
total alloc = 3,568,827,856 bytes (excludes profiling overheads)
COST CENTRE MODULE %time %alloc
uniform2 System.Random.MWC 45.0 53.7
uniformWord32 System.Random.MWC 31.3 31.5
standard.loop System.Random.MWC.Distributions 4.1 1.1
uniform1 System.Random.MWC 3.9 4.5
nextIndex System.Random.MWC 3.6 1.4
uniform System.Random.MWC 2.8 3.3
uniform System.Random.MWC 2.5 1.4
wordsToDouble System.Random.MWC 2.1 0.5
I could drop do notation and go with the simpler version if I wanted just
a vector of variates. But in reality I want a vector of tuples with random
components:
------ 8< ------
import qualified Data.Vector.Unboxed as VU
import Control.Monad
import System.Random.MWC
import System.Random.MWC.Distributions (standard)
count = 1000000
main = do
g <- create
e <- VU.replicateM count $ do
v1 <- standard g
v2 <- standard g
v3 <- standard g
return (v1, v2, v3)
return ()
------ >8 -------
which runs for the same 11.412 seconds.
Since three times more variates are generated and run time stays the same,
this implies that perhaps some optimizations of vector package interfere
with mwc-random — can this be the case?
This becomes quite a bottleneck in my application.
On the other hand, mwc-random has `normal` function implemented as follows:
------ 8< ------
normal m s gen = do
x <- standard gen
return $! m + s * x
------ >8 -------
which again uses explicit `do`. Both standard and normal are marked with INLINE.
Now if I try to write
------ 8< ------
e <- VU.replicateM count $ normal 0 1 g
------ >8 -------
in my test case, quite expectedly I get horrible performance of 11 seconds,
even though I'm not using do myself.
More information about the Haskell-Cafe
mailing list