[Haskell-cafe] Re: (flawed?) benchmark : sort

Wed Mar 12 16:29:36 EDT 2008

On 2008-03-12, Adrian Hey <ahey at iee.org> wrote:
> Aaron Denney wrote:
>> On 2008-03-11, Adrian Hey <ahey at iee.org> wrote:
>>> Having tried this approach myself too (with the clone) I can confirm
>>> that *this way lies madness*, so in future I will not be making
>>> any effort to define or respect "sane", unambiguous and stable behaviour
>>> for "insane" Eq/Ord instances for any lib I produce or hack on. Instead
>>> I will be aiming for correctness and optimal efficiency on the
>>> assumption that Eq and Ord instances are sensible.
>> 
>> Good.  But sensible only means that the Eq and Ord instances agree, not that
>> x == y => f x == f y.
>
> So can I assume that max x y = max y x?

No.  You can, however, assume that max x y == max y x.  (Okay, this
fails on Doubles, because of NaN.  I'll agree that the Eq and Ord
instances for Double are not sane.)

> If not, how can I tell if I've made the correct choice of argument
> order.

When calling, or when defining max?

It depends on what types you're using, and which equivalence and
ordering relations are being used.

When calling, and when it might matter which representative of an
equivalence class comes back out (such as in sorts) you have no choice
but to look at the documentation or implementation of max.

The Haskell report guarantees that x == y => max x y = y (and hence max
y x = x), and the opposite choice for min.  This is to ensure that (min
x y, max x y) = (x,y) or (y,x).  IOW, the report notices that choice of
representatives for equivalence classes matters in some circumstances,
and makes it easy to do the right thing.  This supports the reading that
Eq a is not an absolute equality relation, but an equivalence relation.

> If I can't tell then I guess I have no alternative but document
> my arbitrary choice in the Haddock, and probably for the (sake of
> completeness) provide 2 or more alternative definitions of the "same"
> function which use a different argument order.

When defining max, yes, you must take care to make sure it useable for
cases when Eq is an equivalence relation, rather than equality.

If you're writing library code, then it won't generally know whether
Eq means true equality rather than equivalence.  If this would let
you optimize things, you need some other way to communicate this.

The common typeclasses are for generic, parameterizable polymorphism.
Equivalence is a more generally useful notion than equality, so that's
what I want captured by the Eq typeclass.

And no, an overloaded sort doesn't belong in Ord, either.  If the
implementation is truly dependent on the types in non-trivial,
non-susbstitutable ways (i.e. beyond a substition of what <= means),
then they should be /different/ algorithms.

It would be possible to right an "Equal a" typeclass, which does
guarantee actual observable equality (but has no methods).  Then you can
write one equalSort (or whatever) of type
equalSort :: (Eq a, Ord a, Equal a) => [a] -> [a]
that will work on any type willing to guarantee this, but rightly fail
on types that only define an equivalence relation.

A stable sort is more generally useful than an unstable one.  It's
composable for radix sorting on compound structures, etc.
Hence we want to keep this guarantee.

-- 
Aaron Denney
-><-