[Haskell-cafe] How to select n random words from a file ...
Jerzy Karczmarczuk
jerzy.karczmarczuk at unicaen.fr
Mon Jun 11 11:29:02 CEST 2012
KC:
> An interesting related problem is if you are only allowed one pass
> through the data how would you randomly choose one word.
Let's choose n items.
You must know the length of the sequence, of course, otherwise the
'probability' loses its sense. So, for lists it might not be "just one
pass"...
Suppose the length of the sequence be m.
Suppose you have a random generator called rg, just a simple function
which transforms: seed -> seed' , between 0 and 1
Make n and m real to make the typechecker happy.
Then the most straightforward solution for lists is:
nran l n = nr l m n seed where
m = fromIntegral(length(l))
nr [] _ _ _ = []
nr (x:q) m n seed =
let seed'=rg seed
in if seed' < n/m then x:nr q (m-1) (n-1) seed'
else nr q (m-1) n seed'
-- =========
Now, you may make it tail-recursive, use a different random generation
protocol, or whatever. I believe that this solution is known for years...
Jerzy Karczmarczuk
More information about the Haskell-Cafe
mailing list