[Haskell-cafe] Re: (flawed?) benchmark : sort

Tue Mar 11 04:01:08 EDT 2008

Neil Mitchell wrote:
> Hi
> 
>>  (sort [a,b]) in the case we have: (compare a b = EQ)
>>
>>  Which of the following 4 possible results are correct/incorrect?
>>  1- [a,a]
>>  2- [a,b]
>>  3- [b,a]
>>  4- [b,b]
> 
> Fortunately the Haskell sort is meant to be stable,

I would have said it is meant to be *correct* first and *efficient*
second. You're ruling out a whole bunch of possibly more efficient
and correct sorts on the grounds that they may give observably different
results for a tiny minority of (IMO broken) Eq/Ord instances.

Wrt to *sortBy* (vs. *sort*), I would be inclined to agree with you.
I sure hope someone has proven that the (apparently) different sortBy
implementations provided by ghc,nhc and h98 library report are precisely
equivalent for all (type legal) function arguments.

This is also good reason for not hardwiring the definition of sort as..

sort = sortBy compare

This is just one of many possible correct sorts (including trie based
sorts).

> and sorting is
> meant to be a permutation, so we happily have the situation where this
> has a correct answer: 2.

> Anything else is incorrect.

Isn't 3 also a permutation? Why is it incorrect?

> Anyone submitting
> a revised sort through the Haskell libraries process will have to
> ensure the answer is 2. I hope someone does take the time to do this,
> as a faster base library will benefit everyone.

Agreed, for sortBy, but not sort. More generally different sortBys
should give identical results even for "crazy" comparisons (like
sortBy (\_ _ = EQ))

If that seems too severe, then maybe sortBy should be removed
altogether (if there's no obvious single correct and best
implementation). If this was done then sort would have to be
an (abstractly specified) Ord class method.

> Adrian: I think its fairly clear we disagree about these things.
> However, we both understand the others point of view, so I guess its
> just a question of opinion - and I doubt either of us will change. As
> such I think any further discussion may just lead to sleep deprivation
> [1]. I think I'm coming from a more discrete maths/theoretical
> background while you are coming from a more practical/pragmatist
> background.

If the "discrete maths/theoretical" POV necessatates to the kind of
biasing madness of Data.Map/Set (for example) then it *sucks* bigtime
IMO :-)

I've never understood precisely what "left biasing" means, and it seems
clear that in the past 2 fairly experienced and competent Haskellers
have between them failed to deliver an implementation that respects it's
own "biasing" promises. I would say it's not certain that even now
Data.Map is correct in this respect.

Having tried this approach myself too (with the clone) I can confirm
that *this way lies madness*, so in future I will not be making
any effort to define or respect "sane", unambiguous and stable behaviour
for "insane" Eq/Ord instances for any lib I produce or hack on. Instead
I will be aiming for correctness and optimal efficiency on the
assumption that Eq and Ord instances are sensible.

Regards
--
Adrian Hey