[Haskell-beginners] randomize the order of a list

Mon Aug 30 05:14:36 EDT 2010

Gaius Hammond wrote:
> I am trying to randomly reorder a list (e.g. shuffle a deck of cards) .
> My initial approach is to treat it as an array, generate a list of
> unique random numbers between 0 and n - 1, then use those numbers as new
> indexes.

Felipe Lessa wrote:
> Note: you could use random-shuffle package [1].
>
> [1] http://hackage.haskell.org/package/random-shuffle

Heinrich Apfelmus wrote:
> For another approach, see also
>
>    http://apfelmus.nfshost.com/articles/random-permutations.html

Thanks for the link apfelmus, it's fairly interesting. The key to
making it work is the weighting of lists during merging based on their
lengths. I wonder if other sort algorithm can be adapted in such a
way, while preserving uniformity. Quicksort for example : is it enough
to choose the result position of the pivot randomly, and then placing
elements on either side with a probability of 1/2 ?

There is another sort-based approach that is probably less elegant,
but simpler in the sense that you don't have to rewrite your sorting
routines : pick random "weights" for each of your elements, and sort
the list based on the weights. This approach has been studied and
criticized by [Oleg], whose analysis is the base of the random-shuffle
package. Oleg rightly points out that this method does not achieve
uniformity in general. What he failed to notice is that there is a
simple workaround to regain uniformity with minimal changes.

 [Oleg] http://okmij.org/ftp/Haskell/perfect-shuffle.txt

The uniformity breaker is the case were two elements get equal
weights. In his example, weights for a 2-size lists are choosed in {0,
1}. In the two corner case were both elements get picked the same
weight, the sorting routine make an arbitrary choice : for example, if
it is stable, it will put first the element occuring sooner in the
list. This arbitrary choice breaks uniformity.

The workaround is simple : pick weights that are all differents. If
the weights are picked among [1, n] were n is the size of the list, it
is difficult and inneficient to generate a list of unique random
numbers. But you can choose weights in arbitrary intervals. The bigger
it is, the smaller the chance you have to pick weights again becomes
(though due to the birthday paradox, it does not decrease that fast).

To sum it up, here is the algorithm to shuffle a list :

 - for each element in the list, pick a random list
 - pick another weight list while two equal weights were picked
 - once you've picked *unique* weights, shuffle the elements of the
list by comparing their weights

If the weight are taken in a big enough interval (say [1, 2^64]), the
average number of repicks necessary is small enough for this algorithm
to be just a sorting using your preferred algorithm. I haven't went
through the trouble of computing the necessary interval bound to get a
small enough chance of conflict, but I believe the forth power of the
size of the list should be enough.

An even better technique is to use lazy lists of {0, 1} (representing
real numbers) as weights, with each list element lazily randomly
computed. You are sure than no two such lists are equal (they're
different with probability 1). This amounts to a dynamical random
refining of weights during the sorting : "two of my real number
weights are equal at granularity 2^-N ? Let's make a random choice so
that their {N-1}th digit is different !"