[Haskell-beginners] PI calculation - Newbie question
Daniel Fischer
daniel.is.fischer at web.de
Fri Jan 29 07:13:45 EST 2010
Am Freitag 29 Januar 2010 11:59:30 schrieb Gabi:
> Hi Group,
>
> I am just trying to learn the lang and Implemented this PI calculator.
> It is really slow and very memory consuming (much much slower than its
> equivalent in Clojure for instance)
>
> I think the problem is in "rs <- sequence (replicate n isRandIn)" -
> But I don't know how to get around it (how do I get a lazy sequence of
> rs? Is it the problem anyway?)
It's one part of the problem.
sequence (replicate n isRandIn)
requires all n random number computations to be in memory at once. When you
want far more than 10000 random points, that'll take a lot of memory.
The more evil part is that the things aren't actually calculated when you
call sequence, so what you get is a long list of thunks "calculate when
needed" which take a really large amount of space (
./MCPi 100000 +RTS -sstderr
3.14048
252,873,148 bytes allocated in the heap
48,792,700 bytes copied during GC
8,861,192 bytes maximum residency (5 sample(s))
128,768 bytes maximum slop
23 MB total memory in use (0 MB lost due to fragmentation)
./MCPi 1000000 +RTS -sstderr
3.1401
2,523,687,224 bytes allocated in the heap
527,553,588 bytes copied during GC
107,345,604 bytes maximum residency (8 sample(s))
7,619,396 bytes maximum slop
219 MB total memory in use (2 MB lost due to fragmentation)
).
You can reduce the space requirements by forcing the calculations earlier,
e.g.
isRandIn = do
p <- randPoint
return $! inCirc p
The ($!) requires the evaluation of (inCirc p) [to the outermost
constructor, since it's an Int, that means complete evaluation], the list
that sequence produces is a list of evaluated Ints, needs far less space.
Unfortunately, that is much slower. I'm not sure why.
However, it's not good to do that much in IO (and getStdRandom isn't
exactly fast, it has to retrieve the random generator and store the
modified generator).
The bulk of the algorithm is a pure calculation, given the *lazy* list of
pseudo-random coordinates.
So
----------------------------------------------------------------------
-- curried version of inCirc, no need to construct pairs
inCirc :: Double -> Double -> Int
inCirc x y
| dx*dx + dy*dy < 0.25 = 1
| otherwise = 0
where
dx = x - 0.5
dy = y - 0.5
-- transform a list of coordinates into a list of indicators
-- whether the point is inside the circle (sorry for the
-- stupid name)
inCircles :: [Double] -> [Int]
inCircles (x:y:zs) = inCirc x y : inCircles zs
inCircles _ = []
-- given a count of experiments and an infinite list of coordinates,
-- calculate an approximation to pi
calcPi :: Int -> [Double] -> Double
calcPi n ds = fromIntegral ct / fromIntegral n * 4
where
ct = sum . take n $ inCircles ds
-- now the IO part is only
-- * getting the number of experiments and
-- * getting the StdGen
main :: IO ()
main = do
args <- getArgs
sg <- getStdGen
let n = case args of
(a:_) -> read a
_ -> 10000
print $ calcPi n (randomRs (0,1) sg)
----------------------------------------------------------------------
Now the list of coordinates is consumed as it is produced, things can
immediately be garbage-collected (if compiled with optimisations, without
optimisations, you would have to replace "sum" with "foldl' (+) 0").
It runs in constant space and faster. It's still slow, though, because
System.Random's StdGen is slow.
For faster generation of pseudo-random numbers, take a look at e.g.
mersenne-random (there are other fast PRNG packages on hackage, too, IIRC).
>
> -- p.hs simple PI calculator, using the Monte Carlo Method
>
> import System( getArgs )
> import System.Random
> inCirc :: (Double, Double) -> Int
> inCirc (x,y) = if ((dx * dx) + (dy * dy)) < 0.25
> then 1
> else 0
> where dx = x - 0.5
> dy = y - 0.5
>
>
> randPoint :: IO (Double, Double)
> randPoint = do
> x <-getStdRandom (randomR (0, 1 :: Double))
> y <-getStdRandom (randomR (0, 1 :: Double))
> return (x, y)
>
>
> isRandIn = do
> p <- randPoint
> return (inCirc p)
>
>
> main = do
> args <- getArgs
> let n = if null args
> then 10000
> else read $ (head args)::Int
>
> rs <- sequence (replicate n isRandIn)
> let pi = (fromIntegral(sum rs) / fromIntegral n) * 4
> print pi
More information about the Beginners
mailing list