[Haskell-cafe] Infelicity in StdGen?

Mon Jan 6 18:21:24 UTC 2014

On 3 Jan 2014, at 19:22, Krzysztof Skrzętnicki wrote:

> I think the confusion may come from the understanding of "distinct". The documentation is right that the generators are not equal, which is easily checked e.g. using their Show instance. They will produce different random numbers. The user of the library might OTOH assume that "distinct" means "producing uncorrelated output".

I agree that my example doesn't refute the precise meaning of the words.
May I suggest that the statement that the generators are likely to be
distinct on distinct inputs isn't really that useful to someone using StdGen.

> This is harder and may simply not hold, especially since the documentation doesn't mention sequentially increasing integers or any other kinds of sequences.
> 
> The property you seem to be looking for is "have vastly different output for similar numbers". Sounds a lot like a hash function to me. 
> 

I think most people would expect the function that maps the seed of a
pseudo-random number generator to the first (or second or third or ...)
value it generates to be a reasonably good hash function. As this turns
out not to be the case for the algorithm used by StdGen for certain
lengths of the range, the statement about distinct generators is
somewhat misleading.

I had a look at the StdGen source and don't know enough about
the algorithm it is using to comment on why it has this surprising
behaviour for some lengths of the range. It really is surprising
in my view: one way of describing the behaviour is that if you
use StdGen to generate a random boolean with seeds n and n+1,
the probability that the two values are different is less than 1/50,000.
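
The check described above could be sketched roughly as follows. This
is only an illustration, assuming the System.Random module from the
random package (mkStdGen and random); the helper names firstBool and
countDiffering and the seed range 0..99999 are made up for the example.
The count it prints depends on which version of the random package is
installed, since later releases changed the algorithm behind StdGen.

```haskell
import System.Random (mkStdGen, random)

-- First Bool produced by a StdGen seeded with n.
firstBool :: Int -> Bool
firstBool n = fst (random (mkStdGen n))

-- Number of seeds n in [lo, hi] for which the first Bool from seed n
-- differs from the first Bool from seed n + 1.
countDiffering :: Int -> Int -> Int
countDiffering lo hi =
  length [ n | n <- [lo .. hi], firstBool n /= firstBool (n + 1) ]

main :: IO ()
main = do
  let lo = 0      -- hypothetical range of seeds, chosen arbitrarily
      hi = 99999
  putStrLn ("adjacent seed pairs tested:     " ++ show (hi - lo + 1))
  putStrLn ("pairs whose first Bool differs: " ++ show (countDiffering lo hi))
```

If the seed-to-first-value map behaved like a good hash function, one
would expect roughly half of the tested pairs to differ.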

Thanks for suggesting the useful work-arounds.

Regards,

Rob.
