Bug in the scaling of randoms ...
Ronny Wichers Schreur
ronny@cs.kun.nl
Tue, 07 May 2002 10:28:45 +0200
Dimitre Novatchev writes (to the Haskell mailing list):
> In his excellent book Simon Thompson defines scaling of the elements of
> a sequence of random numbers from the interval [0, 65535] to an
> interval [s,t] in the following way (top of page 368):
> > scaleSequence :: Int -> Int -> [Int] -> [Int]
> > scaleSequence s t
> > = map scale
> > where
> > scale n = n `div` denom + s
> > range = t - s + 1
> > denom = modulus `div` range
> > where modulus = 65536
> However, the following expression:
>
> e4000 = (scaleSequence 1 966 ([65231] ++ [1..965]))!!0
> evaluates to 974 -- clearly outside of the specified interval.
The function fails in this case because the range doesn't divide the
modulus. In the book (I checked the Miranda version), there's a remark
after the function that "[The] range of numbers 0 to modulus-1 is split
into range blocks, each of the same length."
When the range doesn't divide the modulus, it's not possible to get
a uniform distribution if you scale each value individually. I assume
this is acceptable, otherwise you'll have to discard or combine values
from the original sequence.
Oleg answers:
>We aim to find a function sc(n) defined over integers 0 <= n <= M so
>that
> sc(0) -> s (given integral number)
> sc(M) -> t (given integral number), t>=s
>and
> 0<=a < b ==> sc(a) <= sc(b)
>
>Such a function sc(n) is given by the following expression:
> sc(n) = (n*(t-s)) `div` M + s
(M = modulus - 1)
I think a more natural function would be
sc(n) = (n * (t-s+1)) `div` modulus + s
This satifies the conditions (assuming range = t-s+1 <= modulus), and
distributes the values more evenly, in particular when the range divides
the modulus.
Cheers,
Ronny Wichers Schreur