[Haskell-cafe] Re: (flawed?) benchmark : sort

Fri Mar 14 17:39:15 EDT 2008

On 2008-03-14, Conor McBride <conor at strictlypositive.org> wrote:
> Hi
>
> On 13 Mar 2008, at 23:33, Aaron Denney wrote:
>
>> On 2008-03-13, Conor McBride <conor at strictlypositive.org> wrote:
>>> For a suitable notion of = on quotients, and with a
>>> suitable abstraction barrier at least morally in place,
>>> is that really too much to ask?
>>
>> I really think it is.  I don't think the case of "equivalent for this
>> purpose, but not that purpose" can be ignored.
>
> Sure. But use the right tools for the job.

So what are the right tools then?  Why is a typeclass not the right
tool?

>> Now, it may be the case that fooBy functions are then the right
>> thing, but it's not clear to me at all that this is true.
>>
>> And if the fooBy option works, then why would the foo option fail for
>> equivalence classes?
>
> It seems reasonable to construct quotients from
> arbitrary equivalences: if fooBy works for the
> carrier, foo should work for the quotient. Of
> course, if you want to expose the representation
> for some other legitimate purpose, then it wasn't
> equality you were interested in, so you should
> call it something else.

I'm perfectly happy calling it Equivalence.

>> -- what should a sort algorithm do in such a
>> situation?
>
> Not care. Produce a resulting list where for any
>
>    [..., x, ..., y, ...]
>
> in the result, y <= x implies x <= y. Vacuously
> satisfied in the case of incomparable elements.
> In the case of a total order, that gives you
> y <= x implies x = y (and everything in between),
> but for a preorder, you put less in, you get less
> out.

That's a workable definition, but I don't know if I'd call it a
sort, precisely.  The standard unix tool "tsort" (for "topological
sort", a bit of a misnomer) does this.

> Will that do?

Unfortunately, one can't just reuse the standard algorithms.  One
might think that one could reuse any standard algorithm by munging the
comparison so that incomparable gets mapped to equivalent, but the
following two chains shows that's not possible:

a -> b -> c -> d
a -> e -> d

Instead, it seems that one has to use actual graph algorithms, which
are both more complicated to reason about, and have worse performance.

If a sort can't support the standard "sort on this key" technique, and
don't munge everything for two keys that compare equal, something is
wrong.  And I don't think sort is that special a case.

Instances, rather than explicit functions, are nice because they let us
use the type system to ensure that we never have incompatible functions
used when combining two data structures, or pass in a function that's
incompatible with the invariants already in a data structure built with
another function.

So we surely do need an equivalence relation typeclass.  And there are
Eq instances that aren't quite equality, but are equivalences, and work
with almost all code that takes Eq instances.

The only time treating equalities as equivalences won't work is when we
need to coalesce equivalent elements into one representative, and the
choice of representative matters.  (If it doesn't matter, we can just
pick arbitrarily).  If it does matter, a simple biasing scheme usually
isn't going to be sufficient -- we really do need a coalescing function.

So, do we mark equivalencies as special, or observational equality as
special?  Which is the tagging class, and which has the methods?  I
think it's pretty clear that the way to go is have (==) and (/=) live
in Equiv, and have Equal be a tagging class.  An equivalence is not a
special type of equality, but equality is a special type of equivalence.

Given all that, I think current Eq as Equivalence makes sense, and we
need to add an Equal class for things where we truly can't tell
equivalent elements apart.

-- 
Aaron Denney
-><-