[Haskell-cafe] instance Enum Double considered not entirely great?

Wed Sep 21 03:49:02 CEST 2011

On Tue, Sep 20, 2011 at 6:05 PM, Chris Smith <cdsmith at gmail.com> wrote:
> There's nothing *wrong* with pragmatism, but in any case, we seem to
> agree on this.  As I said earlier, we ought to impose a (rather
> arbitrary) total order on Float and Double, and then offer comparison
> with IEEE semantics as a separate set of functions when they are needed.
> (I wonder if Ocaml-style (<.) and (>.) and such are used anywhere.)

I think the only point of disagreement here is that I'm advocating the
introduction of a partial ordering class (for which floating point
values could be given a proper instance according to IEEE semantics)
rather than treating floats as a special case. I would prefer going a
step further and having two distinct total order classes to
distinguish meaningful total orders from nonsense ones like those for
Float and Double, but perhaps that seems excessive to other people.
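
To make the partial-ordering idea concrete, here is a minimal sketch.
The class name PartialOrd and its method partialCompare are
hypothetical (nothing like this is in base); the Double instance just
follows the IEEE comparison operators, so anything involving NaN comes
out incomparable.

```haskell
-- Hypothetical class, not part of base: a comparison that may fail.
class PartialOrd a where
  -- | Nothing means the two values are incomparable.
  partialCompare :: a -> a -> Maybe Ordering

-- IEEE semantics: (<), (==) and (>) are all False when a NaN is involved.
instance PartialOrd Double where
  partialCompare x y
    | x < y     = Just LT
    | x == y    = Just EQ
    | x > y     = Just GT
    | otherwise = Nothing  -- at least one argument is NaN

nan :: Double
nan = 0 / 0
```

With this, "partialCompare nan nan" is Nothing rather than some
arbitrary Ordering, and ordinary values compare as usual.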

> It's clear to me that Enum for Float means something coherent.  If
> you're looking for a meaning independent of the instance, I'd argue you
> ought to be surprised if you find one, not the other way around.  Why
> not look for a meaning for Monoid that's independent of the instance?
> There isn't one; instead, there are some rules that the instance is
> expected to satisfy, but there are plenty of types that have many
> possible Monoid instances, and we pick one and leave you to use newtypes
> if you wanted a different one.

I have to disagree here. Monoid has a very clear, narrow,
type-independent meaning: the eponymous algebraic structure. The
minimal definition of the class is a value and a binary operation;
this is a very small interface, and the laws expected of an instance
nearly exhaust the properties of these definitions, either by
specifying behavior (e.g., associativity) or by deliberately not
specifying (is the binary operation commutative? not in general, but
it could be). Simply by satisfying the type signature, any instance is
going to at least vaguely resemble a valid one, and checking the laws
is straightforward.

On the other hand, Enum has conversions to and from Int and a host of
interdefined operations with at best loose guidelines for how they
should behave. Does "toEnum . fromEnum = id" hold? Not in general.
Does "succ . fromEnum = fromEnum . succ" hold? Probably not. I think.
What do enumFrom, enumFromThen, &c. mean? What the instance author
thought made sense, I suppose, since they're only defined as "what
list range syntax desugars to". In the case of types that also have a
Bounded instance there are further requirements, mostly relating to
where runtime errors should be produced (gosh, that helps).
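
As a small illustration of the first question, here is a sketch that
takes a value through Int and back. The helper name roundTrip is made
up for this example; the behaviour noted in the comments is that of
the standard Char, Bool and Double instances.

```haskell
-- Take a value to Int and back again; well-typed for every Enum instance.
roundTrip :: Enum a => a -> a
roundTrip = toEnum . fromEnum

-- roundTrip 'x'             gives 'x'
-- roundTrip True            gives True
-- roundTrip (2.5 :: Double) gives 2.0, since fromEnum truncates
```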

Consider this: How many Enum instances do you think override the
default definitions, not for efficiency, but in ways that give
different results? How many Monoid instances do you think override
mconcat in a way that gives a different answer than "foldr mappend
mempty"?
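
One way to poke at this question, as a rough sketch: compare each
class's overridable method against the definition it defaults to. The
names defaultEnumFromTo and mconcatAgrees are invented for this
example; the body of defaultEnumFromTo is the class default for
enumFromTo given in the Haskell Report.

```haskell
-- The Report's default for enumFromTo, written as a standalone function.
defaultEnumFromTo :: Enum a => a -> a -> [a]
defaultEnumFromTo x y = map toEnum [fromEnum x .. fromEnum y]

-- Does an instance's mconcat agree with the default definition?
mconcatAgrees :: (Eq a, Monoid a) => [a] -> Bool
mconcatAgrees xs = mconcat xs == foldr mappend mempty xs

-- defaultEnumFromTo 'a' 'c'               gives "abc", same as ['a' .. 'c']
-- defaultEnumFromTo (1.0 :: Double) 2.5   gives [1.0,2.0]
-- enumFromTo        (1.0 :: Double) 2.5   gives [1.0,2.0,3.0]
-- mconcatAgrees ["ab", "c", ""]           gives True
```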

Here's a thought experiment. Imagine that, instead of Monoid, we had a
type class called "Summarize" used mostly to desugar some sort of
built in summation syntax. The main function used is "summarize ::
(Summarize a) => [a] -> a", the class is described as a generalized
"sum", and the motivating examples are all independent of the order of
elements in the list (because addition is commutative, right). But
nowhere is it specified what the behavior of the class should be,
other than that it desugars the syntax in some way that presumably
makes sense. It's not required that "summarize []" produce an identity
value, it's not required that summarizing sublists and then
summarizing the results match summarizing the whole list (i.e.
associativity), it's not required that reordering the list give the same
summary, and so on. Most instances do have all these properties of
course, but then someone makes a library with an extremely
non-commutative instance for Summarize and we get a -cafe thread
complaining about it and then I write a very long and tedious message
all about how Summarize is underspecified and has no clear meaning and
probably should be explicitly defined as some sort of monoid, either
commutative or more general.
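
For the sake of illustration, the thought experiment might look like
the sketch below. Everything in it is hypothetical: the Summarize
class is the imaginary one described above, and orderIndependent is a
helper invented here to test one of the properties the class never
promises.

```haskell
-- The imaginary class: one method and no stated laws.
class Summarize a where
  summarize :: [a] -> a

-- The motivating kind of instance: plain addition.
instance Summarize Int where
  summarize = sum

-- Type-correct, but sensitive to the order of the elements.
instance Summarize [b] where
  summarize = concat

-- One property a user might assume but which is nowhere required.
orderIndependent :: (Eq a, Summarize a) => [a] -> Bool
orderIndependent xs = summarize (reverse xs) == summarize xs

-- orderIndependent [1, 2, 3 :: Int]  gives True
-- orderIndependent ["ab", "cd"]      gives False
```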

But I digress.

The ambiguity of Monoid is purely that many types have multiple ways
to fulfill the very precise requirements of the class. The ambiguity
of Enum is that it isn't clear what, if anything, the requirements
even are, and nothing rules out a wide variety of equally valid
instances other than a vague notion of which one "makes sense", a
point on which reasonable people may disagree!

Possibly a better example would be MonadPlus, for which (if memory
serves me) there's some similar ambiguity about the laws an instance
should follow, with inconsistency even in the standard library as to
which interpretation is chosen, and resulting in actual confusion
about what code should do.
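
A sketch of the kind of disagreement I have in mind, assuming the two
candidate laws usually called "left distribution" and "left catch".
The function names leftDistributes, leftCatch and keepTwo are made up
for this example; only mplus and mzero come from Control.Monad.

```haskell
import Control.Monad (MonadPlus, mplus, mzero)

-- Left distribution: mplus a b >>= k  ==  mplus (a >>= k) (b >>= k)
leftDistributes :: (MonadPlus m, Eq (m b)) => m a -> m a -> (a -> m b) -> Bool
leftDistributes a b k = (mplus a b >>= k) == mplus (a >>= k) (b >>= k)

-- Left catch: mplus (return a) b  ==  return a
leftCatch :: (MonadPlus m, Eq (m a)) => a -> m a -> Bool
leftCatch a b = mplus (return a) b == return a

-- Keeps the value 2 and fails on everything else.
keepTwo :: MonadPlus m => Int -> m Int
keepTwo x = if x == 2 then return x else mzero

-- leftDistributes [1, 2] [2, 3] keepTwo       gives True
-- leftDistributes (Just 1) (Just 2) keepTwo   gives False
-- leftCatch (1 :: Int) (Just 2)               gives True
-- leftCatch (1 :: Int) [2]                    gives False
```

So on these inputs the list instance satisfies the first law and not
the second, while Maybe does the opposite.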

> I'm not saying that Enum must be left exactly as is... but I *am* saying
> that the ability to use floating point types in list ranges is important
> enough to save.  For all its faults, at least the current language can
> do that.  When the solution to the corner cases is to remove a pervasive
> and extremely useful feature, I start to get worried!

I have no desire to remove useful features. What I don't like is when
features behave inconsistently in unclear ways between two cases that
I would expect to be equivalent; the more useful the feature is, the
more troubling this becomes. At best this results in generic functions
defined on the class being nearly useless because you have no idea
what they even mean out of context; at worst it creates serious bugs
due to invalid assumptions, as I think is demonstrated by the
(blatantly incorrect) Ord instance for floats causing the illusion of
data loss in standard data structures.
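
Here is a minimal sketch of that failure mode. The binary search tree
below is a toy stand-in written for this example, not the code of any
real library; it only assumes, as ordered containers generally do,
that "compare" is a total order.

```haskell
-- Toy unbalanced binary search tree keyed by an Ord instance.
data Tree k v = Leaf | Node (Tree k v) k v (Tree k v)

insert :: Ord k => k -> v -> Tree k v -> Tree k v
insert k v Leaf = Node Leaf k v Leaf
insert k v (Node l k' v' r) = case compare k k' of
  LT -> Node (insert k v l) k' v' r
  EQ -> Node l k v r
  GT -> Node l k' v' (insert k v r)

lookupTree :: Ord k => k -> Tree k v -> Maybe v
lookupTree _ Leaf = Nothing
lookupTree k (Node l k' v r) = case compare k k' of
  LT -> lookupTree k l
  EQ -> Just v
  GT -> lookupTree k r

size :: Tree k v -> Int
size Leaf = 0
size (Node l _ _ r) = size l + 1 + size r

nan :: Double
nan = 0 / 0

-- In GHC, compare nan x is GT for every x, including nan itself, so a
-- NaN key is stored but can never be found again.
example :: Tree Double String
example = insert nan "a" (insert 1 "b" Leaf)
```

Here "size example" is 2, yet "lookupTree nan example" is Nothing:
the entry is in the tree but unreachable through the Ord instance.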

Given that a major purpose of Enum is to translate numeric ranges, the
fact that it can have dramatically different behavior for different
numeric types strikes me as deeply problematic, and an endless source
of bugs in potentia. In fact, I would (and will, should the
opportunity arise) actively advise people new to the language to avoid
the list range syntax when floating point types are involved because
of the pitfalls, or to at least only use it in the [x, y..] form.
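
A short sketch of the pitfall and of the workaround I mean. The three
names below are just labels for this example; the results in the
comments follow from the Report's rule that a floating point range
stops only once the elements exceed the limit plus half the step.

```haskell
-- The same range expression at two numeric types.
intRange :: [Int]
intRange = [1, 3 .. 6]        -- [1,3,5]

doubleRange :: [Double]
doubleRange = [1, 3 .. 6]     -- [1.0,3.0,5.0,7.0], overshooting the limit

-- The unbounded form plus an explicit cutoff says exactly what is meant.
boundedRange :: [Double]
boundedRange = takeWhile (<= 6) [1, 3 ..]   -- [1.0,3.0,5.0]
```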

> Yes, I could see (somehow in small steps that preserve backward
> compatibility for reasonable periods) building some kind of clearer
> relationship between Ord, Enum, and Ix, possibly separating Enum from a
> new Range class that represents the desugaring of list ranges, or
> whatever... but this idea of "I don't think this expresses a deep
> underlying relationship independent of type, so let's just delete it
> without regard to how useful it is" is very short-sighted.

Having a deep underlying meaning for type classes isn't just for the
sake of elegance; having a well-defined, consistent meaning removes a
great deal of cognitive load in working with code because it narrows
dramatically the context required to know what an expression means.
Writ large this is the principle behind equational reasoning and
parametricity, which are the most powerful concepts available for
reasoning about Haskell code. Type classes with unclear semantics
undermine this, and while an Enum constraint may not be as nefarious
as, say, Typeable would be, it's arguably closer to that than to
something simple and coherent like Monoid or Functor.

Alas, these properties are as fragile as they are useful. Take the
humble and harmless "show" function, for instance. One might
occasionally think that it would be handy to have an ambient
implementation, allowing a value of any type to be converted to a
string, even if only as a dummy value like "<<function>>". But
allowing this without a Show constraint suffices to destroy the
guarantees of parametricity, as surely as does any function with
"unsafe" in its name! A terrible price for such a trifling
convenience.

"Civilization advances by extending the number of operations which we
can perform without thinking about them." Deep underlying meaning has
a deep utility of its own, but only to the extent to which it is kept
absolute.

...and that is, at egregious length, why I find Enum dissatisfying.

- C.