Bug in the scaling of randoms ...

oleg@pobox.com oleg@pobox.com
Mon, 6 May 2002 11:12:32 -0700 (PDT)


Dimitre Novatchev wrote on May 4, 2002:
> In his excellent book Simon Thompson defines scaling of the elements of
> a sequence of random numbers from the interval [0, 65535] to an
> interval [s,t] in the following way (top of page 368):

> > scaleSequence :: Int -> Int -> [Int] -> [Int]
> > scaleSequence s t
> >   = map scale
> >     where
> >       scale n = n `div` denom + s
> >       range   = t - s + 1
> >       denom   = modulus `div` range
> > where modulus = 65536
> However, the following expression:
>
> e4000 = (scaleSequence 1 966 ([65231] ++ [1..965]))!!0
> evaluates to 974 -- clearly outside of the specified interval.


Here's the algorithm for mapping of ranges of integers that is provably
correct. 

We aim to find a function sc(n) defined over integers 0 <= n <= M so
that
    sc(0) -> s (given integral number)
    sc(M) -> t (given integral number), t>=s
and
    0<=a < b ==> sc(a) <= sc(b)

Such a function sc(n) is given by the following expression:
    sc(n) = (n*(t-s)) `div` M + s

Proof:
    sc(0) = s
    sc(M) = (M*(t-s)) `div` M + s = t
    a<b ==> a*(t-s) < b*(t-s) ==> (a*(t-s)) `div` M <= (b*(t-s)) `div` M

The assertion that 0<=a<b ==> a `div` M <= b `div` M for M >0 can
easily be proven by contradiction. Indeed, given the facts
	(a `div` M)* M = a - ra, 0<= ra <= M-1
	(b `div` M)* M = b - rb, 0<= rb <= M-1
and an assumption
	a `div` M > b `div` M
we see
	a = M*(a `div` M) + ra 
      = M*(b `div` M + x) + ra { where x >= 1 }
      = b - rb + ra + M*x
      >= min{ b - rb + ra + M*x over ra in [0,M-1], rb in [0,M-1]}
      >= b - (M-1) + M*x
      >= b + 1 + M*(x-1) { x >= 1 }
      >  b {contradiction}

Thus the correct algorithm reads

> scaleSequence :: Int -> Int -> [Int] -> [Int]
> scaleSequence s t
>   = map scale
>     where
>       scale n = (n*range) `div` maxn + s
>       range   = t - s
>       maxn    = modulus - 1
>       modulus = 65536

Given your example,
Main> any (==966) (scaleSequence 1 966 ([65231] ++ [1..965] ++[65565]))
True
Main> any (>966) (scaleSequence 1 966 ([65231] ++ [1..965] ++[65565]))
False