[Haskell-beginners] Re: randomize the order of a list

Wed Sep 1 07:58:11 EDT 2010

Gabriel Scherer wrote:
> Heinrich Apfelmus wrote:
>> For another approach, see also
>>
>>    http://apfelmus.nfshost.com/articles/random-permutations.html
> 
> Thanks for the link apfelmus, it's fairly interesting. The key to
> making it work is the weighting of lists during merging based on their
> lengths. I wonder if other sort algorithm can be adapted in such a
> way, while preserving uniformity. Quicksort for example : is it enough
> to choose the result position of the pivot randomly, and then placing
> elements on either side with a probability of 1/2 ?

Interesting question! Adapting quick sort is not that easy, though.

First, you can skip choosing the pivot position because it is already 
entailed by the choices of elements left and right to it.

Second, probability 1/2 won't work, it doesn't give a uniform 
distribution. In order to get that, the remaining input  xs  has to be 
partitioned into two lists  xs = (ls,rs)  such that

     probability that  length ys == k  is   1/(n `over` k)

where

     n `over` k = n! / (k! * (n-k)!)

is the binomial coefficient. After all, calling "quickrandom" 
recursively on each of the two lists will give two permutations with 
probability  1/k!  and  1/(n-k)!  and the probability for a compound 
permutation is

    1/(n `over` k) * 1/k! * 1/(n-k)! = 1/n!

as desired. In contrast, distributing elements with probability 1/2 
would give

    probability that  length ys == k  is  (n `over` k) * 2^(-n)

which would skew the distribution heavily towards permutations where the 
pivot element is in the middle.

However, it is possible to divide the list properly, though I haven't 
worked out the exact numbers. The method would be

     divide (x:xs) = do
          (ls,rs) <- divide xs
          random  <- uniform (0, 1) :: Random Double
          if random <= p (length xs) (length ls)
              then return (x:ls, rs)
              else return (ls, x:rs)

where the probability  p  of putting the element  x  into the left part 
has to be chosen such that

    1/(n `over` k) =
        1/(n-1 `over` k-1) * p (n-1) (k-1)
      + 1/(n-1 `over` k  ) * (1 - p (n-1) k)

Regards,
Heinrich Apfelmus

--
http://apfelmus.nfshost.com