[Haskell-cafe] why is Random in System?

Bryan O'Sullivan bos at serpentine.com
Wed Aug 17 23:55:59 CEST 2011

On Wed, Aug 17, 2011 at 12:27 PM, Ryan Newton <rrnewton at gmail.com> wrote:

> The more fundamental problem is that splitting is neither well understood
>> nor generally safe, and as such it should not be in the basic Random class.
> Would you mind elaborating?

Certainly. The purpose of splitting a PRNG is to create two new PRNGs, with
the following properties:

   - Long periods
   - The streams generated by each PRNG must be independent
   - Splitting must be fast, while preserving the above two properties

It's very easy to write a split function that gets at least one of these
considerations wrong.

> But I am under the impression that it is well understood by Burton Smith
> and others who have worked on the topic, and that they assure us that using
> AES, RNG's under any series of splits are as strong as those generated in a
> linear sequence.

The trouble is that very few people have worked on this topic - there's
almost no published literature on it. And Burton and his colleagues readily
admit that the technique they suggest is a matter of folk wisdom in the
crypto community.

A typical technical application of a PRNG is for Monte Carlo processes,
where you want to (a) have a very fast PRNG and (b) understand its
properties. Burton's off-the-cuff suggestion is all very well, but we don't
know important things like what the performance or period of that PRNG would
be. For instance, if we don't know a PRNG's period, we don't know when we're
going to start seeing autocorrelations in its output, or if it supports
splitting, how many splits are safe to perform before we start seeing *cross
*-stream correlation. This in turn means that we don't know whether, when,
or why the consumers of our pseudo-random numbers are going to themselves
start producing garbage results, and that's troubling to me.

Could you expound on this also?  The people I know in the parallelism
> community seem to care a lot about deterministic PRNG in parallel programs.

Yep, but don't conflate determinism with splitting. In the imperative world,
you normally know how many CPUs you have, so you initialize one PRNG per
CPU, and simply go from there; there's no need for splitting. In the
parallel community, people are going out of their way to *avoid* splitting.

The only good treatments of this subject that I know are published by the
SPRNG authors at FSU.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/haskell-cafe/attachments/20110817/5a5dac4e/attachment.htm>

More information about the Haskell-Cafe mailing list