[Haskell-cafe] How to select n random words from a file ...
jerzy.karczmarczuk at unicaen.fr
Mon Jun 11 11:29:02 CEST 2012
> An interesting related problem is if you are only allowed one pass
> through the data how would you randomly choose one word.
Let's choose n items.
You must know the length of the sequence, of course, otherwise the
'probability' loses its sense. So, for lists it might not be "just one
Suppose the length of the sequence be m.
Suppose you have a random generator called rg, just a simple function
which transforms: seed -> seed' , between 0 and 1
Make n and m real to make the typechecker happy.
Then the most straightforward solution for lists is:
nran l n = nr l m n seed where
m = fromIntegral(length(l))
nr  _ _ _ = 
nr (x:q) m n seed =
let seed'=rg seed
in if seed' < n/m then x:nr q (m-1) (n-1) seed'
else nr q (m-1) n seed'
Now, you may make it tail-recursive, use a different random generation
protocol, or whatever. I believe that this solution is known for years...
More information about the Haskell-Cafe