genBits: small addition to random API + largely new implementation that needs code review

Wed Jun 29 16:55:07 CEST 2011

On June 28, 2011 20:31:23 Scott Turner wrote:
> On 2011-06-28 14:58, Ryan Newton wrote:
> > I propose that we add a "genBits" function that reports how many bits of
> > randomness are created by a generator.
> > 
> >   class RandomGen g where
> >   
> >     ...
> >     genBits :: g -> Int
> 
> ...
> 
> > What about generators that have a genRange which is not a power of two?
> 
> My own feeling is against this change. That's not based on a lot. I once
> coded up a mechanism that could convert a stream of random numbers from
> a given sequence of ranges, to another stream of random numbers for
> ranges as demanded. No randomness was wasted. Nice, clean fun. I don't
> claim it was particularly efficient.
> 
> Having worked with that, using bits to organize randomness seems arbitrary.

I created a system that is like much like what you did.  It was based around 
uniform draws on variable intervals and splitting and combining them.

The key component was the following function

(source',source_bound', value',value_bound')
  = combine(source,source_bound, value,value_bound, range)

which is a uniform random integer combiner, where source, source', value, and 
value' are uniform random integers such that

   0 <= source  < source_bound
   0 <= source' < source_bound' <= source_bound
   0 <= value   < value_bound   <= range
   0 <= value'  < value_bound'  <= range

What it does is shift randomness out of source and into value.  This would 
increase the source_bound and decrease value_bound.  To generate a uniform 
number on an arbitrary interval, you just call it repeatedly until 
source_bound reached your desired bound, refreshing value with a new random 
number from your generator between calls anytime value_bound dropped to 1.

The interesting thing about this is the only constraint on the underlying 
generator is it had to generate uniform numbers.  The interval they were 
generated on didn't matter at all.  They could even change with each call.  It 
is also interesting in that, properly coded, it can waste no randomness by 
putting any left over randomness back into value for the next round.

You can also code it up pretty efficiently.  It is basically all just integer 
multiplication (combining intervals) and division and remainder operations 
(splitting intervals).  With a bit of care you can also isolate the source and 
value types (e.g., source could be 8 bit and value could be 32bit).

Still, for a very cheap source generating on a larger interval than you 
desired one, it likely wouldn't compete with the classic modulus, test, and 
throw away and repeat if not acceptable approach.  Way more flexible though.

Cheers!  -Tyson

PS:  combine is pretty easy to define in terms of

(source',source_bound', value',value_bound')
  = split(source,source_bound, value_bound)

which is a uniform random integer splitter, where source, source', and value' 
are uniform random integers are such that

   0 <= source  < source_bound
   0 <= source' < source_bound' <= source_bound
   0 <= value'  < value_bound'  <= value_bound

What it does is split off a uniform draw on an interval up to the one requested 
form value.  I can provide full pseudo code if anyone wants.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 490 bytes
Desc: This is a digitally signed message part.
URL: <http://www.haskell.org/pipermail/libraries/attachments/20110629/d5c07d02/attachment.pgp>