Weird profiling behaviour
Colin Runciman
colin@cs.york.ac.uk
Wed, 26 Jun 2002 12:19:15 +0100
Ketil Z. Malde wrote:
>I have what I think is a really strange problem. I have a fair sized
>problem, which involves sorting a data set, first on labels (which are
>Strings) and then on scores (which are Ints).
>
>The strange thing is that string sorting is *vastly* faster than int
>scoring! Now, I've tried modifying the sorting routinges, moving them
>into the same module, and so on, to no avail. Is there any likely
>explanation? The pipeline is basically
>
> sortBy int_cmp . compound_equal . sortBy string_cmp
>
>I hesitate to post the code, since it's part of a rather large
>program, and in the hope that somebody will pop up and say that oh,
>yes, it's a well know feature of 'sortBy'. Or that it's an artifact
>of profiling. Or something like that.
>
>Any, and I mean any, suggestions really, really welcome.
>
>-kzm
>
Could it be that the string-comparison sort simply has less sorting to do
than the int-comparison sort? The default definition of sortBy uses
insertion sort, so if the string-sort input happens to be already sorted
it takes linear time and if the int-sort input happens to be in
reverse order it takes quadratic time.
Colin R