[Haskell-beginners] sorting almost sorted list

Tue Sep 27 18:34:01 CEST 2011

Hi Daniel,

Thanks for clearing up the mystery
about the dependence on prime factorization!
Very interesting.

> at first glance, merging c
> chunks of length m each is O(c*m); so, since sorting each chunk is
> O(m*log m), it'd overall be
>
> O(c*m*log m) = O(n*log m) = O(n*log (n/c))

No. In the given use case m is a small constant and
c is proportional to n, so the overall complexity is O(n).

> It has larger overhead than sortBy,

That's true. But it's asymptotically better.

Given Dennis' use case, it sounds like
m will probably always be < 10.
Input size will typically be a few hundred,
but could be a few thousand. sortBy
would indeed be just fine in those cases.

> so if n is small or m is large, sortBy
> is likely to be better, but if n is large and m relatively small,
> sortAlmostBy can be significantly (but not dramatically,
> excluding extreme cases where you can do much better) faster.

At most m will be on the order of 100 if we
need to process all the parts of a large
symphony orchestra combined together
(a very unlikely use case admittedly),
and in that case the input size could be as
much as 10^6, or possibly even more for
something like the Mahler.

(Come to think of it, it's pretty amazing that
Mahler wrote so many notes and actually got
people to play them.)

There I believe sortAlmostBy could be quite
an improvement. I'd be interested to hear
ideas to make it even better though.

Thanks,
Yitz