[Haskell-cafe] Generate random UArray in constant memory space.

Daniel Fischer daniel.is.fischer at web.de
Tue Feb 9 10:30:45 EST 2010


Am Dienstag 09 Februar 2010 14:51:31 schrieb Vasyl Pasternak:
> Sorry, maybe I should ask more clearer.
>
> I've looked at dons article "Haskell as fast as C"[1], and tried to
> implement similar algorithm but for list of random numbers.
>
> Please look at code:
> > import Text.Printf
> > import Control.Applicative
> > import System.Environment
> > import Data.Array.Vector
> >
> > main = do
> >   [size] <- map read <$> getArgs
> >   let ints = enumFromToU 0 size :: UArr Int
> >   printf "%d\n"  (sumU ints)
>
> This code runs in constant space (on my pc ~25kb allocates on the
> heap) regardless of array size.

The random number thing runs in constant space, too.
The difference is that the enumFromToU produces a tight loop with variables 
which never leave the registers, while in the random number thing at least 
the StdGens are allocated in the heap (the produced Ints may stay in the 
registers, too, I don't know).
But enumFromToU and friends were written for that to happen, there are a 
ton of rewrite rules to help the compiler create tight loops. StdGen has no 
such thing, randomR has no {-# INLINE #-} pragma, so you have a call to 
randomR (well, to randomIValInteger, actually) for each element.
Maybe if you had the source for the PRNG in the same file, it could be 
inlined to give an allocation-free loop.

> So I tried to achieve similar with
> random list, just to replace `enumFromToU` with my own list generator.
>
> So the question - is it possible to implement random list similary to
> enumFromToU?
>
>
> [1]http://donsbot.wordpress.com/2008/06/04/haskell-as-fast-as-c-working-
>at-a-high-altitude-for-low-level-performance/
>
> Thank you,
> Vasyl Pasternak
>
> 2010/2/9 Daniel Fischer <daniel.is.fischer at web.de>:
> > Am Dienstag 09 Februar 2010 13:18:23 schrieb Vasyl Pasternak:
> >> Hello Cafe,
> >>
> >> I tried to generate memory-efficient list of random numbers, so I've
> >> used uvector library for this task. But it doesn't work,
> >> it still allocates more than 6Gb of memory for the random list of 10
> >>
> >> million elements. Here is the code:
> >
> > Hmm,
> >
> > $ ghc -O2 --make ranVec
> > [1 of 1] Compiling Main             ( ranVec.hs, ranVec.o )
> > Linking ranVec ...
> > $ ./ranVec 10000000 +RTS -sstderr
> > 5130
> >   4,919,912,080 bytes allocated in the heap
> >         883,256 bytes copied during GC
> >          26,896 bytes maximum residency (1 sample(s))
> >          25,620 bytes maximum slop
> >               1 MB total memory in use (0 MB lost due to
> > fragmentation)
> >
> > maximum residency is just eight bytes more than for 100,000 or
> > 1,000,000 numbers. I think that is constant space.
> >
> > The ~5 GB total allocation is sequential, ten million new StdGens are
> > produced and allocated, then immediately garbage collected. I see no
> > problem (except that StdGen is slow, e.g. the Mersenne twister is much
> > faster [and allocates less, but still linear in size]).
> >
> >> > import Text.Printf
> >> > import System.Random
> >> > import Control.Applicative
> >> > import System.Environment
> >> > import Data.Array.Vector
> >> >
> >> > randomListU :: (Int, Int) -> StdGen -> Int -> (UArr Int)
> >> > randomListU b g size = unfoldU size gen g
> >> >   where gen g = let (x, g') = randomR b g
> >> >                 in JustS (x :*: g')
> >> >
> >> > main = do
> >> >   [size] <- map read <$> getArgs
> >> >   let ints = randomListU (-10, 10) (mkStdGen 1) size
> >> >   printf "%d\n"  (sumU ints)
> >>
> >> Could someone give a hint, how to implement this function in constant
> >> memory space?
> >>
> >> Thank you in advance.
> >>
> >> Best regards,
> >> Vasyl Pasternak



More information about the Haskell-Cafe mailing list