Left-bias and non-structural equality.

Okasaki, C. DR EECS Christopher.Okasaki at usma.edu
Thu Jan 5 09:41:25 EST 2006


I've been having this argument with people for close to 10 years now.

There are two similar but separate notions: equality and equivalence,
where equality means "equivalent in all contexts" and equivalence means
"equal in some contexts but not necessarily in others".  (I'll leave it
to others to give the base case of this recursive definition. :-)  The
problem is that Haskell has only a single Eq class, and it seems safe to
assume that that is not going to change anytime soon.  So which should
it be, equality or equivalence?  Many people want it to be equality, for
reasons of mathematical purity, while many others want it to be
equivalence, because it is so bloody useful.

As I see it, if you are going to restrict yourself to only one of the
two, equivalence is the better choice.  Why?  Because code designed to
work with equivalence will still work if you happen to give it an
equality relation, but code designed to work with equality can break if
you happen to give it an equivalence relation.

Another issue is that many *many* data structures can inherently only
provide equivalence.  The essential difference between equality and
equivalence has to do with being able to substitute "equals for equals".
For most data structures that have internal invariants, you can freely
substitute equals for equals *outside* the abstraction boundary, but not
*inside* the abstraction boundary.  For example, you can't replace the
left subtree of a balance binary search tree with another subtree with
the same elements but a different shape, and expect the whole tree to
still be balanced.  The only way around this problem is to have an extra
"wrapper" type and explicitly wrap/unwrap your data every time it
crosses the abstraction boundary.

The way I dealt with this in Edison was to provide two versions of many
operations, such as insert and insertWith, where insert is designed with
equality in mind and insertWith is designed with equivalence in mind.  I
have some sympathy for the argument that this pollutes the API, but in a
way that is easy to understand if you understand the
equality/equivalence issue.  You might think that assuming equality will
simplify the API, but your users would still need to understand the
equality/equivalence issue or they are going to accidentally give you
equivalence relations.

-- Chris


More information about the Libraries mailing list