[Haskell-cafe] why is Random in System?

Wed Aug 17 23:55:59 CEST 2011

On Wed, Aug 17, 2011 at 12:27 PM, Ryan Newton <rrnewton at gmail.com> wrote:

> The more fundamental problem is that splitting is neither well understood
>> nor generally safe, and as such it should not be in the basic Random class.
>>
>
> Would you mind elaborating?
>

Certainly. The purpose of splitting a PRNG is to create two new PRNGs, with
the following properties:

   - Long periods
   - The streams generated by each PRNG must be independent
   - Splitting must be fast, while preserving the above two properties

It's very easy to write a split function that gets at least one of these
considerations wrong.

> But I am under the impression that it is well understood by Burton Smith
> and others who have worked on the topic, and that they assure us that using
> AES, RNG's under any series of splits are as strong as those generated in a
> linear sequence.
>

The trouble is that very few people have worked on this topic - there's
almost no published literature on it. And Burton and his colleagues readily
admit that the technique they suggest is a matter of folk wisdom in the
crypto community.

A typical technical application of a PRNG is for Monte Carlo processes,
where you want to (a) have a very fast PRNG and (b) understand its
properties. Burton's off-the-cuff suggestion is all very well, but we don't
know important things like what the performance or period of that PRNG would
be. For instance, if we don't know a PRNG's period, we don't know when we're
going to start seeing autocorrelations in its output, or if it supports
splitting, how many splits are safe to perform before we start seeing *cross
*-stream correlation. This in turn means that we don't know whether, when,
or why the consumers of our pseudo-random numbers are going to themselves
start producing garbage results, and that's troubling to me.

Could you expound on this also?  The people I know in the parallelism
> community seem to care a lot about deterministic PRNG in parallel programs.
>

Yep, but don't conflate determinism with splitting. In the imperative world,
you normally know how many CPUs you have, so you initialize one PRNG per
CPU, and simply go from there; there's no need for splitting. In the
parallel community, people are going out of their way to *avoid* splitting.

The only good treatments of this subject that I know are published by the
SPRNG authors at FSU.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/haskell-cafe/attachments/20110817/5a5dac4e/attachment.htm>