Bug in the scaling of randoms ...
oleg@pobox.com
oleg@pobox.com
Mon, 6 May 2002 11:12:32 -0700 (PDT)
Dimitre Novatchev wrote on May 4, 2002:
> In his excellent book Simon Thompson defines scaling of the elements of
> a sequence of random numbers from the interval [0, 65535] to an
> interval [s,t] in the following way (top of page 368):
> > scaleSequence :: Int -> Int -> [Int] -> [Int]
> > scaleSequence s t
> > = map scale
> > where
> > scale n = n `div` denom + s
> > range = t - s + 1
> > denom = modulus `div` range
> > where modulus = 65536
> However, the following expression:
>
> e4000 = (scaleSequence 1 966 ([65231] ++ [1..965]))!!0
> evaluates to 974 -- clearly outside of the specified interval.
Here's the algorithm for mapping of ranges of integers that is provably
correct.
We aim to find a function sc(n) defined over integers 0 <= n <= M so
that
sc(0) -> s (given integral number)
sc(M) -> t (given integral number), t>=s
and
0<=a < b ==> sc(a) <= sc(b)
Such a function sc(n) is given by the following expression:
sc(n) = (n*(t-s)) `div` M + s
Proof:
sc(0) = s
sc(M) = (M*(t-s)) `div` M + s = t
a<b ==> a*(t-s) < b*(t-s) ==> (a*(t-s)) `div` M <= (b*(t-s)) `div` M
The assertion that 0<=a<b ==> a `div` M <= b `div` M for M >0 can
easily be proven by contradiction. Indeed, given the facts
(a `div` M)* M = a - ra, 0<= ra <= M-1
(b `div` M)* M = b - rb, 0<= rb <= M-1
and an assumption
a `div` M > b `div` M
we see
a = M*(a `div` M) + ra
= M*(b `div` M + x) + ra { where x >= 1 }
= b - rb + ra + M*x
>= min{ b - rb + ra + M*x over ra in [0,M-1], rb in [0,M-1]}
>= b - (M-1) + M*x
>= b + 1 + M*(x-1) { x >= 1 }
> b {contradiction}
Thus the correct algorithm reads
> scaleSequence :: Int -> Int -> [Int] -> [Int]
> scaleSequence s t
> = map scale
> where
> scale n = (n*range) `div` maxn + s
> range = t - s
> maxn = modulus - 1
> modulus = 65536
Given your example,
Main> any (==966) (scaleSequence 1 966 ([65231] ++ [1..965] ++[65565]))
True
Main> any (>966) (scaleSequence 1 966 ([65231] ++ [1..965] ++[65565]))
False