[Haskell-beginners] Re: randomize the order of a list

Heinrich Apfelmus apfelmus at quantentunnel.de
Sat Sep 11 07:14:24 EDT 2010


John Dorsey wrote:
> Gabriel Scherer wrote:
> 
>>> Thanks for the link, apfelmus, it's fairly interesting. The key to
>>> making it work is the weighting of lists during merging based on their
>>> lengths. I wonder if other sort algorithms can be adapted in such a
>>> way, while preserving uniformity. Quicksort, for example: is it enough
>>> to choose the result position of the pivot randomly, and then place
>>> elements on either side with a probability of 1/2?
> 
> to which Heinrich Apfelmus answered:
> 
>> Interesting question! Adapting quick sort is not that easy, though.
>>
>> First, you can skip choosing the pivot position because it is already  
>> entailed by the choices of elements left and right to it.
> 
> I don't think this is true...
> 
>> Second, probability 1/2 won't work; it doesn't give a uniform
>> distribution.
> 
> .... because of this.
> 
> In fact, it appears to me that the proposed modification to quicksort is
> uniform and simple.  Why do you think otherwise?

Why should it be uniform just because it looks nice? Looks can be
deceiving; you need a mathematical proof to be certain.


Embarrassingly, the analysis in my previous message is wrong, though.
Here is an actually correct assessment of the algorithm. Or rather, of
the two algorithms; the results differ depending on whether you use a
pivot *element* or just a pivot *position*. It will turn out that the
former is not uniform, while, to my surprise, the latter is!

Let's begin with some code for the algorithms, so we know exactly what
we are analyzing here. First, the  partition  function, which divides a
list into two parts, with each element having a 1/2 chance of landing in
either part:

     partition :: [a] -> Random ([a],[a])
     partition = go ([],[])
         where
         go (ls,rs) []     = return (ls,rs)
         go (ls,rs) (x:xs) = do
             b <- uniform [True,False]   -- flip a coin
             if b
                 then go (x:ls,rs) xs    -- element goes left
                 else go (ls,x:rs) xs    -- element goes right

Now, algorithm A which puts the pivot element between the two parts:

     quickshuffleA :: [a] -> Random [a]
     quickshuffleA []     = return []
     quickshuffleA [x]    = return [x]
     quickshuffleA (x:xs) = do
         (ls, rs) <- partition xs
         sls <- quickshuffleA ls
         srs <- quickshuffleA rs
         return (sls ++ [x] ++ srs)

And then algorithm B which splits the list into two parts without
putting a pivot element in between.

     quickshuffleB :: [a] -> Random [a]
     quickshuffleB []     = return []
     quickshuffleB [x]    = return [x]
     quickshuffleB xs     = do
         (ls, rs) <- partition xs
         sls <- quickshuffleB ls
         srs <- quickshuffleB rs
         return (sls ++ srs)

Note that algorithm B does not necessarily terminate, since it repeats
itself if  ls  or  rs  becomes empty by chance! (It does terminate with
probability 1, though.)
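
The snippets above leave the  Random  monad abstract. To actually run
them, one possibility (a sketch only, assuming the MonadRandom package
from Hackage; the helper name  runRandom  is made up here) is:

     import Control.Monad.Random (Rand, StdGen, evalRandIO, getRandomR)

     -- one possible concrete choice for the abstract  Random  monad
     type Random = Rand StdGen

     -- pick one element of a non-empty list uniformly at random
     uniform :: [a] -> Random a
     uniform xs = do
         i <- getRandomR (0, length xs - 1)
         return (xs !! i)

     -- run a random computation in IO using the global generator,
     -- e.g.  runRandom (quickshuffleA [1..10])
     runRandom :: Random a -> IO a
     runRandom = evalRandIO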


Analysis of algorithm A:

Imagine the course of the algorithm from beginning to end. We want to
keep track of the set P of permutations that are still possible as
outcomes at each step. Before the algorithm starts, every permutation
is still possible. Then, for the first call to  partition , imagine
that the set of possible permutations is divided into the  2^(n-1)
disjoint classes of the form

   l x rrr...rr
   l x rrr...rr   -- different  l  than the previous one
   ...
   ll x rr...rr
   ...
   lll x r...rr
   ...
   lll...ll x r

where the  l  and  r  denote elements from the parts  ls  and  rs
respectively. The call to the  partition  function picks one of these
classes at random, with a uniform distribution.

However, the problem is that these classes contain different numbers of
permutations! Namely, the class

                       ll..ll  x  rr..rr

with  k  elements on the left and  n-1-k  elements on the right contains
k! * (n-1-k)!  permutations. So, to be uniform, the  partition  function
would have to return each class with a probability proportional to its
size. But this is not the case, so algorithm A cannot return a uniform
distribution of permutations.
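
For a concrete example, take  n = 3  and the input [1,2,3], so that the
pivot element is 1. The class with  ls = {2}  and  rs = {3}  contains
only the single permutation [2,1,3], whereas the class with  ls = {}
contains the  0! * 2! = 2  permutations [1,2,3] and [1,3,2]. Since
partition  picks each class with probability 1/4, the outcome [2,1,3]
occurs with probability 1/4, but [1,2,3] only with probability 1/8.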

Another way to convince yourself of the non-uniformity of algorithm A is
to actually calculate the distribution for small  n  by using one of the
probabilistic functional programming packages on Hackage:

    http://hackage.haskell.org/package/ProbabilityMonads
    http://hackage.haskell.org/package/probability
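
In the same spirit, here is a small, self-contained sketch (it does not
use either of the packages above; the names  Dist ,  uniformD ,
partitionD  and  quickshuffleAD  are made up for this sketch) that
computes the exact distribution of algorithm A for a small input:

     import qualified Data.Map as Map

     -- a minimal probability monad: a list of (outcome, probability) pairs
     newtype Dist a = Dist { runDist :: [(a, Rational)] }

     instance Functor Dist where
         fmap f (Dist xs) = Dist [ (f x, p) | (x, p) <- xs ]

     instance Applicative Dist where
         pure x = Dist [(x, 1)]
         Dist fs <*> Dist xs = Dist [ (f x, p*q) | (f,p) <- fs, (x,q) <- xs ]

     instance Monad Dist where
         Dist xs >>= k = Dist [ (y, p*q) | (x,p) <- xs, (y,q) <- runDist (k x) ]

     -- choose one element of a non-empty list, each with equal probability
     uniformD :: [a] -> Dist a
     uniformD xs = Dist [ (x, 1 / fromIntegral (length xs)) | x <- xs ]

     --  partition  and  quickshuffleA  from above, specialized to  Dist
     partitionD :: [a] -> Dist ([a],[a])
     partitionD = go ([],[])
         where
         go (ls,rs) []     = return (ls,rs)
         go (ls,rs) (x:xs) = do
             b <- uniformD [True,False]
             if b then go (x:ls,rs) xs else go (ls,x:rs) xs

     quickshuffleAD :: [a] -> Dist [a]
     quickshuffleAD []     = return []
     quickshuffleAD [x]    = return [x]
     quickshuffleAD (x:xs) = do
         (ls, rs) <- partitionD xs
         sls <- quickshuffleAD ls
         srs <- quickshuffleAD rs
         return (sls ++ [x] ++ srs)

     -- sum up the probabilities of equal outcomes and print them
     main :: IO ()
     main = mapM_ print
          . Map.toList
          . Map.fromListWith (+)
          . runDist
          $ quickshuffleAD [1,2,3 :: Int]

For the input [1,2,3] this prints probability 1/4 (shown as  1 % 4 ) for
[2,1,3] and [3,1,2], but only 1/8 for the other four permutations, in
accordance with the analysis above.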


Analysis of algorithm B:

As before, we can imagine that the first call to  partition  splits the
set of possible permutations into 2^n classes. But this time, the
classes are no longer disjoint, so the previous analysis does not apply!

Without the pivot element, the situation has become much more symmetric,
though, and that's the reason why algorithm B gives a uniform
distribution. In particular, imagine that we somehow manage to calculate
the probability that  quickshuffleB [1,2,3,4]  will return the trivial
permutation [1,2,3,4] (we will perform that calculation in a moment,
too). But since the algorithm is highly symmetric, the same calculation
also applies to, say, the permutation [3,4,1,2], and all the other
permutations as well! For instance, for the result [1,2,3,4], we had to
consider the case  ls = [1,2] ; but this case corresponds to the case
ls = [3,4]  which appears in the calculation for the result [3,4,1,2].
Hence, all outcomes have equal probability.
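
As a quick empirical cross-check (a sketch only, reusing  quickshuffleB
and the hypothetical  runRandom  helper from the earlier MonadRandom
snippet), we can also sample algorithm B many times and count the
outcomes; each of the 24 permutations of [1,2,3,4] should show up with a
relative frequency close to 1/24:

     import Control.Monad (replicateM)
     import qualified Data.Map as Map

     -- sample quickshuffleB repeatedly and tally the resulting permutations
     -- (the sampling terminates with probability 1, see the remark above)
     checkB :: IO ()
     checkB = do
         samples <- runRandom (replicateM 100000 (quickshuffleB [1,2,3,4 :: Int]))
         let counts = Map.fromListWith (+) [ (s, 1 :: Int) | s <- samples ]
         mapM_ print (Map.toList counts)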

Let's calculate this from first principles as well, i.e. let  p  be the
probability that the result is the permutation [1,2,3,4]. This is only
possible if the first call to  partition  gives one of the following
five results:

   ls = [], ls = [1], ls = [1,2], ls = [1,2,3], ls = [1,2,3,4]

By mathematical induction, we can assume that smaller inputs like
quickshuffleB [1,2,3]  give a uniform distribution. Then, the
probability of the outcome  [1,2,3,4]  in each of the five cases is

    ls = []        => probability  2^(-4) * 1/0! *  p
    ls = [1]       => probability  2^(-4) * 1/1! * 1/3!
    ls = [1,2]     => probability  2^(-4) * 1/2! * 1/2!
    ls = [1,2,3]   => probability  2^(-4) * 1/3! * 1/1!
    ls = [1,2,3,4] => probability  2^(-4) *  p   * 1/0!

and their sum is

    p = 2^(-4) * (1/0!*p + 1/1!*1/3! + 1/2!*1/2! + 1/3!*1/1! + p*1/0!)
      = 2^(-n) * ( sum [1/k!*1/(n-k)! | k<-[0..n]] + 2*(p - 1/n!) )

where the second line rewrites the first for general  n  (here  n = 4 );
the two terms involving  p  take the place of the  k=0  and  k=n
summands  1/n! , which is where the correction  2*(p - 1/n!)  comes from.

Expressing the factorials in terms of binomial coefficients and applying
the binomial theorem, we can see that the sum is equal to  2^n / n!  and
we obtain

    p = 1/n! + 2^(-n)*2*(p - 1/n!)

Since  2^(-n)*2 = 2^(1-n)  is different from 1 for  n >= 2, this implies
that  p - 1/n! = 0, i.e.  p = 1/n!  as desired. Furthermore, the
calculation does not depend on the fact that we were considering the
trivial permutation [1,2,3,4], so every permutation is obtained with the
same probability  1/n!.
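
As a quick sanity check of the binomial identity used above, here is a
small snippet (the names  factorial  and  checkIdentity  are made up for
this check):

     -- check that  sum [1/(k!*(n-k)!) | k<-[0..n]]  equals  2^n / n!
     factorial :: Integer -> Rational
     factorial n = fromIntegral (product [1..n])

     checkIdentity :: Integer -> Bool
     checkIdentity n =
         sum [ 1 / (factorial k * factorial (n-k)) | k <- [0..n] ]
             == 2^n / factorial n

     -- checkIdentity 4  ==>  True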


Regards,
Heinrich Apfelmus

--
http://apfelmus.nfshost.com




