Replace Random

dominic at steinitz.org
Mon Feb 17 13:12:55 UTC 2020


Hello libraries,

Following a great blog [post] by @lehins, a group of us (@curiousleo,
@lehins and me) are trying to improve the situation with the `random'
library.

@curiousleo and I have created a [resource] that tests the quality of
Haskell random number generators via well known (and not so well known)
test suites: dieharder, TestU01, PractRand and gjrand. The current
`random' does not fare well, especially when the `split' function is
used (this is well known and is the [reason] why `QuickCheck' moved
from using it in [2.8] to using [tf-random] in [2.9] and latterly
[splitmix] in [2.13]): see [this] for example. On the other hand,
`splitmix' [passes] BigCrush[1].

The putative proposal is to replace the current algorithm in `random'
with that of `splitmix'[2] and to remove the performance bottleneck by
changing the interface (the current interface causes the performance
infelicity by making "all of the integral numbers go through the
arbitrary precision Integer in order to produce the value in a desired
range") - see @lehins's blog for more details.
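
To make the point about `Integer' concrete, here is a minimal sketch
of a bounded draw that stays in `Word64' arithmetic throughout. It is
not the proposed interface and not the code in the `splitmix' package:
the names `Gen', `mkGen', `nextWord64' and `boundedWord64' are
hypothetical, and the mixing step follows the published SplitMix64
reference routine as an assumption about what a replacement generator
could look like.

```haskell
import Data.Bits (shiftR, xor)
import Data.Word (Word64)

-- Hypothetical generator state: current seed and an odd increment.
data Gen = Gen !Word64 !Word64

-- The 64-bit golden-ratio increment used by SplitMix64.
goldenGamma :: Word64
goldenGamma = 0x9e3779b97f4a7c15

mkGen :: Word64 -> Gen
mkGen seed = Gen seed goldenGamma

-- Output mixing function of the SplitMix64 reference routine.
mix64 :: Word64 -> Word64
mix64 z0 =
  let z1 = (z0 `xor` (z0 `shiftR` 30)) * 0xbf58476d1ce4e5b9
      z2 = (z1 `xor` (z1 `shiftR` 27)) * 0x94d049bb133111eb
  in z2 `xor` (z2 `shiftR` 31)

-- Advance the seed by the increment and mix it to get the output.
nextWord64 :: Gen -> (Word64, Gen)
nextWord64 (Gen seed gamma) =
  let seed' = seed + gamma
  in (mix64 seed', Gen seed' gamma)

-- Uniform draw from [0, n) by rejection sampling; requires n > 0.
-- `negate n` wraps to 2^64 - n, so `threshold` is 2^64 mod n and the
-- accepted range [threshold, 2^64) has a size divisible by n.
boundedWord64 :: Word64 -> Gen -> (Word64, Gen)
boundedWord64 n = go
  where
    threshold = negate n `rem` n
    go g =
      let (x, g') = nextWord64 g
      in if x >= threshold then (x `rem` n, g') else go g'
```

For example, `boundedWord64 6 (mkGen 42)' yields a value in 0..5
together with the next generator state, without ever constructing an
`Integer'.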

Can anyone interested:

* Create a separate issue for each concern they have (e.g. range for
  floats (0, 1] vs [0, 1], etc.) [here].
* Submit PRs targeting the [interface-to-performance] branch (or
  master if yours is a vastly different approach) with your suggested
  alternatives.
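
As an illustration of the kind of concern meant by the float-range
example, the sketch below shows two ways a 64-bit word could be mapped
to a `Double'. It is only a sketch under assumptions: the function
names are hypothetical, and neither is claimed to be what `random' does
or will do.

```haskell
import Data.Bits (shiftR)
import Data.Word (Word64)

-- Keep the top 53 bits, the most a Double mantissa can hold exactly.
top53 :: Word64 -> Double
top53 w = fromIntegral (w `shiftR` 11)

-- Map to (0, 1]: shift the 2^53 grid points up by one step of 2^-53.
toUpperClosed :: Word64 -> Double
toUpperClosed w = (top53 w + 1) * encodeFloat 1 (-53)

-- Map to [0, 1]: divide by the largest 53-bit value, 2^53 - 1.
toClosed :: Word64 -> Double
toClosed w = top53 w / 9007199254740991
```

The first mapping can never return 0 and the second can return both
endpoints, which matters to callers that, say, take a logarithm of the
result.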

If you are going to raise a concern then it might be worth reading some
of the [discussions] that have already taken place.

We think that once we have the API fleshed out, switching to splitmix
will be a piece of cake: it will require adding just a few lines of
code and removing the current StdGen-related functionality. For
historical reasons, instead of removing StdGen we could move it into a
separate module with a disclaimer not to use it, but that isn't
terribly important.

The Random Team (@lehins, @curiousleo, @idontgetoutmuch)


[post] https://alexey.kuleshevi.ch/blog/2019/12/21/random-benchmarks/

[resource] https://github.com/tweag/random-quality

[reason]
http://publications.lib.chalmers.se/records/fulltext/183348/local_183348.pdf

[2.8] https://hackage.haskell.org/package/QuickCheck-2.8

[tf-random] https://hackage.haskell.org/package/tf-random

[2.9] https://hackage.haskell.org/package/QuickCheck-2.9

[splitmix] https://hackage.haskell.org/package/splitmix

[2.13] https://hackage.haskell.org/package/QuickCheck-2.13

[this]
https://github.com/tweag/random-quality/blob/master/results/random-word32-split-practrand-1gb

[passes]
https://github.com/tweag/random-quality/blob/master/results/splitmix-word32-testu01-bigcrush

[here] https://github.com/idontgetoutmuch/random/issues

[interface-to-performance]
https://github.com/idontgetoutmuch/random/tree/interface-to-performance

[discussions] https://github.com/idontgetoutmuch/random/pull/1



Footnotes
_________

[1] Just to clarify: both random and splitmix pass BigCrush. random
fails any statistical test immediately (e.g. [SmallCrush]
(https://github.com/tweag/random-quality/blob/master/results/random-word32-split-testu01-smallcrush#L337-L349)
and other even smaller ones) when a sequence based on split is
used. splitmix passes Crush when split is part of the sequence, but
I've seen it fail one test in BigCrush ("LinearComp"). So we should
just be careful here: splitmix itself passes BigCrush and split-based
sequences all pass Crush, but not all pass BigCrush.

[2] `split' is already available as an instance: `instance Random
SMGen where'.


