Some clarity please! (was Re: Re: (flawed?) benchmark : sort)

Thu Mar 13 06:45:51 EDT 2008

On 2008-03-13, Adrian Hey <ahey at iee.org> wrote:
> Hello All,
>
> I'm top posting because I'm getting bored and frustrated with this
> thread and I don't want to respond detail to everything Aaron has said
> below.
>
> Aaron: Where are you getting this equivalence stuff from?

Not from the prose in the report, but from what the code in the report
seems designed to support.  There are several places where the code
seems to take a (small -- much usually isn't needed) bit of care to
support equivalencies.

> So I don't know if all this is really is the correct reading of the
> report, but if so would like to appeal to movers and shakers in the
> language definition to please decide exactly what the proper
> interpretation of standard Eq and Ord "class laws" are and in the
> next version of the report give an explanation of these in plain
> English using terms that people without a Mathematics degree are
> likely to understand.

I agree that the prose of the report should be clarified.

Luke Palmer's message in haskell-cafe captures why I think that "Eq
means equivalence, not strict observational equality" is a more
generally useful notion.  It's harder to guarantee observational
equality, thus harder to use code that requires it of your types.
Almost all the time (in my experience) equivalencies are all that's
generically needed.

My comments on this particular message are below.

> Because maximumBy documentation is ambiguous in this respect, so is the
> overloaded maximum documentation. At least you can't figure it out from
> the Haddock.

True.

> Despite this ambiguity, the statement that maximum is a special case of
> maximumBy is true *provided* max in the Ord instance is defined the way
> Aaron says is should be: (x==y = True) implies max x y = y.

Well, the way the report specifies that max's default definition
is.  I'd actually favor making that not an instance function at
all, and instead have max and min be external functions.

> AFAICT, the original report authors either did not perceive an
> ambiguity in maximum, or just failed to notice and resolve it.
> If there is no ambiguity this could be for 2 reasons.
>
> 1 - It doesn't matter which maximum is returned because:
>   (x==y) = True implies x=y
>
> 2 - It does matter, and the result is guaranteed to be the
> last maximum in all cases because:
>   (x==y) = True implies max x y = y

The second holds, so long as max isn't redefined.  I'd be rather
surprised at any redefinitino of max, as it's not part of any minimum
definition for Ord, and I can't think of an actual optimization case
for it.

> Regarding the alleged "max law" this too is not mentioned in the
> Haddock for the Ord class, nor is it a "law" AFAICT from reading the
> report. The report (page 83) just says that the default methods are
> "reasonable", but presumably not manditory in any semantic sense.
> This interpretation also seems to be the intent of this from the
> second para. of Chapter 8:

Agreed.  Elevating this to a "law" in my previous message was a mistake
on my part.  I still think this default in combination with the comment
is very suggestive that (min x y, max x y) should preserve distinctness of
elements.

If you're unwilling to count on this holding for arbitrary Ord
instances, why are you willing to count on (/=) and (==) returning
opposite answers for arbitrary Eq instances?

> I wouldn't dispute that the default definition is reasonable, but it's
> certainly not clear to me from the report that it's something that I
> can safely assume for all Ord instances. In fact AFAICS the report
> quite clearly telling me *not* to assume this. But I have to assume
> *something* for maximum to make sense, so I guess that must be:
>   (x==y) = True implies x=y
> IOW "I have no idea if it's the first or last maximum that is returned,
> but who cares?"

Well, you have to assume something for maximum to do what it says it
does, which isn't quite the same thing as "making sense"...

> I'm not saying some people are not right to want classes with more
> mathematically inspired "laws", but I see nothing in the report to
> indicate to me that Eq/Ord are those classes and consequently that
> the "naive" programmers interpretation of (==) is incorrect. Rather
> the contrary in fact.

It's not a question of more or less mathematically inspired, it's a
question of more or less useful.  Yes, it's slightly harder to write
code that can handle any equivalency correctly.  But code that only
handles observational equality correctly is less reuseable.  The
compilers cannot and do not check if the various laws are obeyed.  They
are purely "social" interfaces, in that the community of code writers
determines the real meaning, and what can be depended on.  The community
absolutely should come to a consensus of what these meanings are and
document them better than they are currently.

Currently, if you write code assuming a stricter meaning of Eq a, the
consequences are:
(a) it's easier for you to write code
(b) it's harder for others to interoperate with your code and use it.

Generally, you're the one that gets to make this trade off, because
you're writing the code.  Whether someone else uses your code, or
others', or writes their own is then their own trade off.  Because,
indeed, many many types inhabiting Eq do obey observational equality,
the consequences of (b) may be minor.

With regards to Haskell 98, my best guess is that some of the committee
members thought hard about the code so that Eq a would usually work for
any equivalence class, and others took it to mean observational equality
and wrote prose with this understanding.

-- 
Aaron Denney
-><-