newbie:: getting random Ints

Thu, 04 Apr 2002 10:16:04 -0600

This may, in places, be somewhat pedantic --- I'm paid to be far more
concerned with formal proofs of correctness than any sane person
should be :) However, it should ultimately make sense if you'll read
the entire thing through.

Peter Rooney <pt@panix.com> wrote:

> in my case, it was not OK to have arbitrary data, i needed (pseudo-)
> random numbers for different runs of the program.

Normally, if you want program behavior to vary between runs of the
program, you want to use the IO monad.  This is a feature.

See, Haskell expressions usually satisfy the following property (call
it Axiom 1):

Assuming t is a point in the computation of the program (say, after t
machine-language instructions have been executed), let x(t) be the
value of x at point t.  Now, for all points t, t', x(t) = x(t').  In
other words, the value of a variable/program fragment is invariant
during computation.

This guarantees things like map f xn = map f xn, i.e., (=) (as used in
reasoning about Haskell programs) is an equivalence relation in the
mathematical sense.  (This is not true for languages like C, nor is it
necessarily true for Haskell in the presence of unsafePerformIO, a
point I'll return to later.)

This allows you to use normal equational reasoning to reason about
Haskell programs, which is of course both the most common and most
powerful form of reasoning about Haskell programming.

Axiom 1 has a consequence that's important here: Haskell program
phrases have a value that is invariant across runs of the program.
So, if you want a value of type [Int] whose value varies across
program invocations, that's not possible (well, as you know, you can
get it using unsafePerformIO, but that violates the implicit rules of
the language).

The IO monad allows you to use side-effects and have values that are
not invariant between runs of the program.  Basically, it works like
this: you should think of (IO a) as being equivalent to (State -> (a,
State)), where the argument is the state of the computer (mutable
variables, files, etc.) before the action executes and the result is
the state of the computer after the action executes, together with the
resultant value.  Because values inside the IO monad are within the
evaluation of a function with a particular value, you actually have
separate variables across runs, so their values can differ.

Now then, unsafePerformIO ``more or less'' has type (State -> (a,
State)) -> a.  In other words, it promises to deliver you the result
of the function /without knowing what that function's argument is/.
That's why it's ``unsafe'' --- you have to prove yourself that the
result satisfies Axiom 1.  If it doesn't, the program's results are
undefined, since any results depend on Axiom 1.  Basically, if you
know C, treat arbitrary unsafePerformIOs like you would treat:

x++ = x;

In other words, like the plague.

So how do you re-write your program without unsafePerformIO?

> i want to:

> -generate random Ints

As you know, this can be done in the IO monad.  Let your generator
have type IO [Int].

> -do some arbitrary computations on them to generate a [[Int]]

If you can do these computations outside the IO monad, do that.  Then,
use liftM (in the Monad module, I believe) to get a computation over
IO values.

In other words, let your computations be in a function f :: [Int] ->
[[Int]].  Then, liftM f :: IO [Int] -> IO [[Int]].  This function can
be applied to your random [Int] generator (remember, IO [Int] is a
function), generated above.

> -compare each [Int] in the list with a list of [Int] known at
> compile time

Again, do this outside the IO monad, then use liftM.

You can revise your code to do this like such:

> > import Random

> > rollDice :: IO Int
> > rollDice = getStdRandom (randomR (1,6))

This part is fine.

> > getRandomSeed :: IO Int
> > getRandomSeed = do
> >                 retval <- rollDice
> >                 return retval

Again, this is fine.  But, an orthogonal note about style:

>   do
>    retval <- rollDice
>    return retval

As you may or may not know, this is simply syntax sugar for:

> = rollDice >>= \retval -> return retval

Eta-contracting, we get:

> = rollDice >>= return

And, finally, applying the monad laws (in the Haskell Report, under
the Monad class specification), we get:

> = rollDice

So, all you've done is rename rollDice as getRandomSeed.

> > getRandomSeedInt :: IO Int -> Int
> > getRandomSeedInt x = unsafePerformIO x

Again, eta-contracting, we get:

> getRandomSeedInt = unsafePerformIO

So, another re-naming.

> > getARange :: Int -> Int -> [Int]
> > getARange x y  = randomRs (x,y) (mkStdGen (getRandomSeedInt getRandomSeed))

The simplest way to eliminate the (getRandomSeedInt = unsafePerformIO)
is to re-type it as:

> getARange :: Int -> Int -> IO [Int]

And re-write the definition as:

> getARange x y  = liftM randomRs (x,y) (liftM mkStdGen getRandomSeed)

Or, equivalently, as:

> getARange x y = do
>	    seed <- getRandomSeed
>	    return (randomRs (x, y) (mkStdGen seed))

> > getRandomInt :: Int -> Int
> > getRandomInt x = head (take 1 (getARange 0 x ))

Ignoring the typing issue for a moment, take satisfies the law:

forall i >= 1.head (take i xn) = head xn

So, we can re-write getRandomInt as:

> getRandomInt x = head (getARange 0 x)

Then, changing the type to IO Int gives:

> getRandomInt x = liftM head (getARange 0 x)

Or, equivalently, (and more verbosely)

> getRandomInt x = do
>	       range <- getARange 0 x
>	       return (head range)

To run this version, use:

liftM take 10 (getARange 0 10) >>= print

or

getARange 0 10 >>= print . take 10

((>>= print) prints out the result of the previous IO action).

If you want, you can define a helper function printTake:

> printTake n a = a >>= print . take n

This allows you to use:

> printTake 10 (getARange 0 10)

to test.

Jon Cast,
Equational Reasoning Weenie