[Haskell-cafe] instance Enum Double considered not entirely great?
Casey McCann
cam at uptoisomorphism.net
Wed Sep 21 20:31:00 CEST 2011
On Tue, Sep 20, 2011 at 11:33 PM, <roconnor at theorem.ca> wrote:
> For what it's worth, at some point in time I was sketching a proposal to
> split the Enum class into two classes because I felt that two distinct ideas
> were being conflated. Unfortunately this was years ago and I have forgotten
> the details of what I was thinking. Perhaps someone can reconstruct a proposal
> along these lines.
Considering the desugaring of list range syntax in general, rather
than the Enum class as such, I would argue for *three* ideas, which
are all supported with varying degrees of success by the current
implementation:
1) Exhaustive enumeration of a finite range, where the desired meaning
of [a..z] is almost exactly that of "Data.Ix.range (a, z)".
2) Iterative generation of a sequence, where the desired meaning of
[a, b..z] is iterating a function implicitly defined by the offset
between a and b, plus an optional takeWhile using some predicate
determined by z. The nature of the offset, predicate, &c. would be
defined on a per-type basis, possibly including a default offset for
when b isn't specified, but personally I'd rather just disallow that
in this case.
3) Evenly-spaced divisions of an infinite range, where the desired
meaning of [a,b..z] assumes that the distance from a to b evenly
divides the distance from a to z, and the result is a list containing
(1 + (z-a)/(b-a)) elements such that all differences between
successive elements are equal, with a and z present as the first and
last elements.
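To make the distinction concrete, the three readings could be sketched as three separate functions. The names `enumRange`, `iterateTo`, and `evenlySpaced` below are hypothetical, not anything actually proposed; this is just one way the desugarings might look:

```haskell
import Data.Ix (Ix, range)

-- (1) Exhaustive enumeration of a finite range, via the Ix instance.
enumRange :: Ix a => a -> a -> [a]
enumRange a z = range (a, z)

-- (2) Iterate-and-cut: the step is the offset from a to b, and z only
-- determines the takeWhile predicate.
iterateTo :: (Ord a, Num a) => a -> a -> a -> [a]
iterateTo a b z = takeWhile keep (iterate (+ (b - a)) a)
  where keep | b >= a    = (<= z)
             | otherwise = (>= z)

-- (3) Evenly-spaced division: 1 + (z-a)/(b-a) elements with a and z as
-- the first and last, assuming (b-a) evenly divides (z-a).
evenlySpaced :: RealFrac a => a -> a -> a -> [a]
evenlySpaced a b z = [ a + d * fromIntegral k | k <- [0 .. n] ]
  where
    d = b - a
    n = round ((z - a) / d) :: Integer
```

For example, `enumRange 'a' 'e'` is `"abcde"`, `iterateTo 1 3 9` is `[1,3,5,7,9]`, and `evenlySpaced 0 0.25 1` is `[0.0,0.25,0.5,0.75,1.0]` (the step 0.25 is exact in binary, so no rounding intrudes).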
For most types other than fractional numbers and floats, the third
interpretation isn't well-defined and the first coincides both with an
Ix instance (if one exists) and with the second interpretation using
the smallest nonzero offset. Note that the first interpretation does
not require a total ordering, and in fact the Ord constraint on Ix is
somewhat misleading and nonsensical. As such, the first interpretation
naturally extends to more general ranges than what the second can
describe.
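An Ix range over pairs illustrates this: it enumerates a whole rectangle of indices, something no single fixed offset applied repeatedly can produce. This uses only the standard Data.Ix machinery:

```haskell
import Data.Ix (range)

-- A two-dimensional range: every index inside the rectangle from (0,0)
-- to (2,1), in row-major order (second component varies fastest).
grid :: [(Int, Int)]
grid = range ((0, 0), (2, 1))
-- [(0,0),(0,1),(1,0),(1,1),(2,0),(2,1)]
```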
For rationals, floats, approximations of the reals, or any other type
with a conceptually infinite number of values in a range, the first
interpretation isn't well-defined, and the second and third
interpretations should coincide when all three parameters are equal,
ignoring rounding errors and inexact representations.
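With exact Rational arithmetic, for instance, the two readings do agree under GHC's current desugaring: iterating by the step and cutting at the bound yields exactly the evenly-spaced list, endpoints included.

```haskell
import Data.Ratio ((%))

-- Step 1/4 from 0 to 1: five evenly spaced elements, with both
-- endpoints present. Iterate-and-cut and evenly-spaced coincide here
-- because the arithmetic is exact.
quarters :: [Rational]
quarters = [0, 1 % 4 .. 1]
```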
The current Enum class attempts to be something like an ill-defined
mixture of all three, and where the interpretations don't coincide,
the disagreement between them is a likely source of bugs. Worse still,
the instance for floating point values mixes the naively expected
results of both the second and third in a very counterintuitive way:
the "enum to" value at the end behaves neither as an upper bound (the
sequence may exceed it in an effort to avoid rounding errors) nor as a
final element (it may not be in the sequence at all, even if it has an
exact floating point representation). This seems needlessly confusing
to me and is arguably broken no matter which way you slice it.
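Concretely, in GHC today (with step and bounds chosen so every value involved is exactly representable, ruling out rounding as the culprit):

```haskell
-- The end value can be exceeded: the sequence runs to z plus half the
-- step, so 3.0 appears even though the range nominally ends at 2.5.
overshoot :: [Double]
overshoot = [0.0, 1.0 .. 2.5]   -- [0.0,1.0,2.0,3.0]

-- ...and the end value can be absent even when exactly representable:
-- 2.25 is an exact Double, yet the list stops at 2.0.
missingEnd :: [Double]
missingEnd = [0.0, 1.0 .. 2.25] -- [0.0,1.0,2.0]
```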
My thoughts are that the first interpretation is most naturally suited
to list range syntax, that the second would be better served by a
slightly different syntax to make the predicate more explicit, and
that the third bugs the crap out of me because it's really very useful
but I can't think of a concise and unambiguous syntax for it.
- C.