Improving Random

Wed May 27 08:23:51 UTC 2020

Hey Zemyla,

> And can you explain how to take an existing RNG and write it in this new format?

This question is a bit vague, so I'll try to elaborate on all aspects of new format. New interface supports both stateful and pure RNGs, so I assume you are asking how to use them with the new format, rather than how to create instances, because the latter is only important for maintainers of PRNG libraries and a user will never need to create any instances. If you are still curious about how to write them there is example for stateful mwc-random, which can be found in the generated haddock and in the beginning of this thread, as Dominic pointed out. For pure RNGs I included links to repositories that contain adjusted RandomGen instances below in this email.

As far as pure RNGs, such as the default splitmix and tf-random, etc. Any of them can be used with monadic interface by the means of any `MonadState`. There is a newtype wrapper `StateGen` that allows us to use new monadic interface without any runtime overhead and any other work on the user's part. Let's look at some examples of why this monadic interface is extremely powerful.

We can write our code in this monadic style once:

-- | Generate a random alpha string with random length up to a specified value.
randomStringN :: MonadRandom g s m => Int -> g s -> m (Int, String)
randomStringN upperBound gen = do
  n <- uniformRM (0, upperBound) gen
  xs <- replicateM n $ uniformRM ('a', 'z') gen
  pure (n, xs)

And then we can use such function with any of the available PRNG in Haskell. There are some helper functions that allow using pure RNGs that have RandomGen instance with this monadic style, like `runStateGen`

>>> runStateGen (mkStdGen 217) (randomStringN 10)
((6,"ihhtwd"),StdGen {unStdGen = SMGen 16329862649850020527 15251669095119325999})

which is equivalent to:

>>> runState (randomStringN 10 StateGenM) (mkStdGen 217)
((6,"ihhtwd"),StdGen {unStdGen = SMGen 16329862649850020527 15251669095119325999})

There is also IOGen, that is a wrapper around IORef, which can be used in places where `StateT` can't be used, such as in presence of `bracket` for example:

>>> gen <- thawGen (IOGen (mkStdGen 217))
>>> randomStringN 10 gen
(6,"ihhtwd")

It is just as easily can be used with mwc-random or pcg-random packages in IO:

>>> import System.Random.MWC as MWC
>>> gen <- MWC.createSystemRandom
>>> randomStringN 10 gen
(7,"gfcfodz")

and of course in ST:

>>> import Control.Monad.ST
>>> seed <- MWC.save =<< MWC.createSystemRandom
>>> runST $ thawGen seed >>= randomStringN 10
(9,"wuzudihnd")

It is important to note, that we didn't have to define any new monads or transformers, which saves us from a need to create thousands of different instances and force a user to use yet another monad. All of our machinery from Haskell ecosystem will work perfectly fine with this new interface.

With regards to Carter's remark:
> Yeah, absolutely agree that their pr has the wrong shape for monad random

It is possible to drop state token `s`, but we would have to loose `Frozen` type and ability to use functions like `thawGen`, `freezeGen` and consequently `runGenM` which  make `MonadRandom` even so more useful. That being said, if that is a deal breaker, I can submit a PR that removes this functionality.

Here is also an answer to David's question:

> Could you explain the reasoning behind the deprecations?

`next` and `genRange` are the actual  reason behind the slowness of `random` all these years. It is not feasible to convert an arbitrary `Int` sub range (which is what `next`+`genRange` give us) to another arbitrary subrange of another type efficiently. All prior versions of random go through `Integer` to generate ranges for almost all types, precisely because of this problem. The great part about this problem is that the poor quality `StdGen` in `random-1.1` is the only pure RNG in Haskell that relies in this property of having unusual min and max value in `genRange`. Most of RNGs out there produce pseudo random numbers in a n-bit range, where n is usually 32 or 64 bits. Once you have that property available it is possible to utilize it to get really fast generation of random numbers of any type in any range. Therefore this is the reason for addition of `genWord32`, `genWord64` ... and deprecation of `next` and `genRange`. Nice thing the approach we took is that it makes it backwards compatible, in other words any instance that only defined `next` and `genRange` will still work, albeit slower than it could if it were to implement either `genWord32` or `genWord64`.

In case you are curious, I've went through all packages that implement `RandomGen` instance and none of them will miss these two functions, here is a list of some of the repos with branches that contain fixed instances:
* https://github.com/lehins/quickcheck/tree/new-random
* https://github.com/lehins/tf-random/tree/new-random
* https://github.com/lehins/mersenne-random-pure64/tree/new-random
* https://github.com/lehins/pcg-random/tree/new-random
* https://github.com/lehins/pcgen/tree/new-random

Alexey.

PS. I didn't realize I my last email was not posted to libraries because my email was not registered, so I resent it again.

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Wednesday, May 27, 2020 5:13 AM, Carter Schonwald <carter.schonwald at gmail.com> wrote:

> Zemyla!
>
> Yeah, absolutely agree that their pr has the wrong shape for monad random, which is why I’m confused by dominics pronouncement :), though its a great contrib.
>
> It should be (ignoring m being applicative vs monad)
> (For context, their design is roughly  ... MonadRandom m s g | m,g -> s ?  I’m typing on a phone so i might be restating it wrong ... :) )
>
> The *better* choice is
> Class PRNG g =>  MonadRandom m g | m-> g where ...
>
> And then their example instance  should be something like
>
> Newtype MWCT  m a = ... newtype wrapper around StateT m a threading an MWC indexed by the state token of the underlying PrimMonad
> (Or newtype MWCT s m a = .... stateT thing)
>
> Instance (PrimMonad m) => MonadRandom  (MWCT m) (MCWGen (PrimState m)) where ...
>
> Or something along those lines.  It absolutely shouldn’t be in the style of being on any PrimMonad, but part of a stack that provides that instance.
>
> This is ignoring the freeze/thaw stuff which really is just “record and restore RNG state”, and i think is  an artifact of ther design choice in their PR. Which has a lot of great stuff I’m reviewing and poking at.
>
> Nothing is set in stone about deprecation and migration story, cause improving everything and making sure everyone has a zero pain/ or at least tolerable upgrade path is step zero.  (And making sure good stable interfaces persists is key! )
>
> I hope everyone is having a safe and healthy start to their summers,
>
> and I’m sorry for any confusions Dominics email about his collab Pr announcement has created.  But cest lie vie! Though sharing excitement about a great Pr is great!
>
> Stay well!
> -Carter
>
> On Tue, May 26, 2020 at 6:48 PM Zemyla <zemyla at gmail.com> wrote:
>
>> And can you explain how to take an existing RNG and write it in this new format?
>>
>> On Tue, May 26, 2020, 15:54 David Feuer <david.feuer at gmail.com> wrote:
>>
>>> Could you explain the reasoning behind the deprecations?
>>>
>>> On Tue, May 26, 2020, 6:00 AM <dominic at steinitz.org> wrote:
>>>
>>>> Hello Libraries,
>>>>
>>>> You may recall that following the blog [post](https://alexey.kuleshevi.ch/blog/2019/12/21/random-benchmarks/) by @lehins, a group of us (@curiousleo, @lehins and me) invited participation in [February](https://mail.haskell.org/pipermail/libraries/2020-February/030261.html) to take this work and apply it to improving the current random library.
>>>>
>>>> Our proximate goals were to fix [#25](https://github.com/haskell/random/issues/25) (filed in 2015) and [#51](https://github.com/haskell/random/issues/51) (filed in 2018). After a lot of discussion and experimentation, we have a proposal that addresses both these issues and also: [#26](https://github.com/haskell/random/issues/26), [#44](https://github.com/haskell/random/issues/44), [#53](https://github.com/haskell/random/issues/53), [#55](https://github.com/haskell/random/issues/55), [#58](https://github.com/haskell/random/issues/58) and [#59](https://github.com/haskell/random/issues/59).
>>>>
>>>> For backwards compatibility, the proposal retains the old style classes and enhances them. Thus in 1.1 we have
>>>>
>>>> class RandomGen g where
>>>>     next :: g -> (Int, g)
>>>>     genRange :: g -> (Int, Int)
>>>>     split :: g -> (g, g)
>>>>     {-# MINIMAL next, split #-}
>>>>
>>>> and in 1.2 we have
>>>>
>>>> class RandomGen g where
>>>>     next :: g -> (Int, g)
>>>>     genWord8 :: g -> (Word8, g)
>>>>     genWord16 :: g -> (Word16, g)
>>>>     genWord32 :: g -> (Word32, g)
>>>>     genWord64 :: g -> (Word64, g)
>>>>     genWord32R :: Word32 -> g -> (Word32, g)
>>>>     genWord64R :: Word64 -> g -> (Word64, g)
>>>>     genShortByteString :: Int
>>>>         -> g -> (Data.ByteString.Short.Internal.ShortByteString, g)
>>>>     genRange :: g -> (Int, Int)
>>>>     split :: g -> (g, g)
>>>>     {-# MINIMAL split, (genWord32 | genWord64 | next, genRange) #-}
>>>>
>>>> and next and genRange are deprecated. This interface is what allows the significantly faster performance as no longer is everything forced to go via Integer.
>>>>
>>>> Several new interfaces are introduced and it is recommended that new applications use these and, where feasible, existing applications migrate to using them.
>>>>
>>>> The major API addition in this PR is the definition of a new class MonadRandom:
>>>>
>>>> -- | 'MonadRandom' is an interface to monadic pseudo-random number generators.
>>>> class Monad m => MonadRandom g s m | g m -> s where
>>>> {-# MINIMAL freezeGen,thawGen,(uniformWord32|uniformWord64) #-}
>>>>
>>>>     type Frozen g = (f :: Type) | f -> g
>>>>     freezeGen :: g s -> m (Frozen g)
>>>>     thawGen :: Frozen g -> m (g s)
>>>>
>>>>     uniformWord32 :: g s -> m Word32 -- default implementation in terms of uniformWord64
>>>>     uniformWord64 :: g s -> m Word64 -- default implementation in terms of uniformWord32
>>>>     -- plus methods for other word sizes and for byte strings
>>>>     -- all have default implementations so the MINIMAL pragma holds
>>>>
>>>> Conceptually, in MonadRandom g s m, g s is the type of the generator, s is the state type, and m the underlying monad. Via the functional dependency g m -> s, the state type is determined by the generator and monad.
>>>>
>>>> Frozen is the type of the generator's state "at rest". It is defined as an injective type family via f -> g, so there is no ambiguity as to which g any Frozen g belongs to.
>>>>
>>>> This definition is generic enough to accommodate, for example, the Gen type from mwc-random, which itself abstracts over the underlying primitive monad and state token. This is the full instance declaration (provided here as an example - this instance is not part of random as random does not depend on mwc-random):
>>>>
>>>> instance (s ~ PrimState m, PrimMonad m) => MonadRandom MWC.Gen s m where
>>>>     type Frozen MWC.Gen = MWC.Seed
>>>>     freezeGen = MWC.save
>>>>     thawGen = MWC.restore
>>>>
>>>>     uniformWord8 = MWC.uniform
>>>>     uniformWord16 = MWC.uniform
>>>>     uniformWord32 = MWC.uniform
>>>>     uniformWord64 = MWC.uniform
>>>>     uniformShortByteString n g = unsafeSTToPrim (genShortByteStringST n (MWC.uniform g))
>>>>
>>>> Pure random number generators can also be made instances of this class providing a uniform interface to both pure and stateful random number generators. An instance for the standard number generator StdGen is provided.
>>>>
>>>> The Random typeclass has conceptually been split into Uniform and UniformRange. The Random typeclass is still included for backwards compatibility. Uniform is for types where it is possible to sample from the type's entire domain; UniformRange is for types where one can sample from a specified range:
>>>>
>>>> class Uniform a where
>>>>     uniformM :: MonadRandom g s m => g s -> m a
>>>>
>>>> class UniformRange a where
>>>>     uniformRM :: MonadRandom g s m => (a, a) -> g s -> m a
>>>>
>>>> The proposal is a breaking change but the changes are not very intrusive and we have PRs ready for the affected downstream libraries:
>>>>
>>>> - requires base >= 4.10 (GHC-8.2)
>>>> - StdGen is no longer an instance of Read
>>>>
>>>> - randomIO and randomRIO were extracted from the Random class into separate functions
>>>>
>>>> In addition, there may be import clashes with new functions, e.g. uniform and uniformR.
>>>>
>>>> Further explanatory details may be found [here](https://github.com/idontgetoutmuch/random/blob/v1.2-release-notes/RELEASE-NOTES-v1.2.md#api-changes) and the PR for the proposed new version is [here](https://github.com/haskell/random/pull/61).
>>>>
>>>> Here are some benchmarks run on a 3.1 GHz Intel Core i7. The full benchmarks can be run using e.g. stack bench. The benchmarks are measured in milliseconds per 100,000 generations. In some cases, the performance is over x1000(!) times better; the minimum performance increase for the types listed below is more than x35.
>>>>
>>>> | Name                    | Mean (1.1) | Mean (1.2) | Improvement|
>>>> | ----------------------- | ---------- | ---------- | ---------- |
>>>> | pure/random/Float       |         30 |       0.03 |        1038|
>>>> | pure/random/Double      |         52 |       0.03 |        1672|
>>>> | pure/random/Integer     |         43 |       0.33 |         131|
>>>> | pure/uniform/Word8      |         14 |       0.03 |         422|
>>>> | pure/uniform/Word16     |         13 |       0.03 |         375|
>>>> | pure/uniform/Word32     |         21 |       0.03 |         594|
>>>> | pure/uniform/Word64     |         42 |       0.03 |        1283|
>>>> | pure/uniform/Word       |         44 |       0.03 |        1491|
>>>> | pure/uniform/Int8       |         15 |       0.03 |         511|
>>>> | pure/uniform/Int16      |         15 |       0.03 |         507|
>>>> | pure/uniform/Int32      |         22 |       0.03 |         749|
>>>> | pure/uniform/Int64      |         44 |       0.03 |        1405|
>>>> | pure/uniform/Int        |         43 |       0.03 |        1512|
>>>> | pure/uniform/Char       |         17 |       0.49 |          35|
>>>> | pure/uniform/Bool       |         18 |       0.03 |         618|
>>>> | pure/uniform/CChar      |         14 |       0.03 |         485|
>>>> | pure/uniform/CSChar     |         14 |       0.03 |         455|
>>>> | pure/uniform/CUChar     |         13 |       0.03 |         448|
>>>> | pure/uniform/CShort     |         14 |       0.03 |         473|
>>>> | pure/uniform/CUShort    |         13 |       0.03 |         457|
>>>> | pure/uniform/CInt       |         21 |       0.03 |         737|
>>>> | pure/uniform/CUInt      |         21 |       0.03 |         742|
>>>> | pure/uniform/CLong      |         43 |       0.03 |        1544|
>>>> | pure/uniform/CULong     |         42 |       0.03 |        1460|
>>>> | pure/uniform/CPtrdiff   |         43 |       0.03 |        1494|
>>>> | pure/uniform/CSize      |         43 |       0.03 |        1475|
>>>> | pure/uniform/CWchar     |         22 |       0.03 |         785|
>>>> | pure/uniform/CSigAtomic |         21 |       0.03 |         749|
>>>> | pure/uniform/CLLong     |         43 |       0.03 |        1554|
>>>> | pure/uniform/CULLong    |         42 |       0.03 |        1505|
>>>> | pure/uniform/CIntPtr    |         43 |       0.03 |        1476|
>>>> | pure/uniform/CUIntPtr   |         42 |       0.03 |        1463|
>>>> | pure/uniform/CIntMax    |         43 |       0.03 |        1535|
>>>> | pure/uniform/CUIntMax   |         42 |       0.03 |        1493|
>>>>
>>>> Dominic Steinitz
>>>> dominic at steinitz.org
>>>> http://idontgetoutmuch.org
>>>> Twitter: @idontgetoutmuch
>>>>
>>>> _______________________________________________
>>>> Libraries mailing list
>>>> Libraries at haskell.org
>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>>
>>> _______________________________________________
>>> Libraries mailing list
>>> Libraries at haskell.org
>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>
>> _______________________________________________
>> Libraries mailing list
>> Libraries at haskell.org
>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/libraries/attachments/20200527/d653b9c1/attachment.html>