[Haskell-beginners] sorting almost sorted list

Tue Sep 27 13:46:52 CEST 2011

I wrote:
> After running some tests, it seems empirically that
> you can use m `div` 2 instead of m for lists of even
> length, which gives you a nice speedup for larger
> values of m. For lists of prime length, the best
> you can do seems to be m-1. It's more complicated
> for lists of non-prime odd length.
> I can't immediately see why any of that is true.

Well, obviously, that weird behavior has to do with
the regularity of my test data. If I randomize more,
the results become more believable. It seems that
m-1 or m-2 work empirically, but that could just
be because the probability of a collision is extremely
low.

The result for the more regularly shaped test data,
where the results have an interesting dependence
on prime factorization, is still fascinating.
I hope Daniel will comment. :)

Here is the more randomized test data
(sorry, I really should be using QuickCheck
for this instead of doing it by hand):

test n k = fmap (all isSorted . map (sortAlmostBy n compare)) $ rands2 k

isSorted xs = and . zipWith (<=) xs $ drop 1 xs

rands1 d = do
    cs <- replicateM (2000 `div` d) $ randomRS (0, d-1)
    fmap concat $ zipWithM mkChunk (scanl (+) 0 cs) cs
  where
    mkChunk x y = replicateM y $ randomRS (x, x+y-1)

rands2 d = fmap (evalState . replicateM 100 $ rands1 d) newStdGen