Repair to floating point enumerations?

Wed Oct 15 10:44:03 EDT 2008

On Wed, Oct 15, 2008 at 10:41:25AM +0100, Malcolm Wallace wrote:
> Dear Haskell-Primers (and libraries).
>
> Recently, Phil Wadler has pointed out a weird anomaly in the Haskell'98
> Prelude, regarding numeric enumerations for Floats/Doubles:
>
>     Prelude> [0, 0.3 .. 1.1]
>     [0.0,0.3,0.6,0.899999,1.2]
>
> What is odd is that the last value of the list is much larger than the
> specified termination value.  But this is exactly as specified by the
> Haskell'98 Report.
>
>     http://haskell.org/onlinereport/basic.html#sect6.3.4
>
>     "For Float and Double, the semantics of the enumFrom family is given
>     by the rules for Int above, except that the list terminates when the
>     elements become greater than e3+i/2 for positive increment i, or
>     when they become less than e3+i/2 for negative i.
>
> We have discussed this question (and related ones, such as whether Float
> and Double belong in the Enum class at all) several times before, and I
> do not wish to rehash all of those points again e.g.:
>
>     http://www.cse.unsw.edu.au/~dons/haskell-1990-2000/msg07289.html
>     http://www.haskell.org/pipermail/haskell/2001-October/008218.html
>     http://www.haskell.org/pipermail/haskell/2002-October/010607.html
>
> Phil proposes that, although retaining the instances of Enum for Float
> and Double, we simplify the definitions of the numericEnumFrom family:
>
>   numericEnumFromThenTo   :: (Fractional a, Ord a) => a -> a -> a -> [a]
>   numericEnumFrom         =  iterate (+1)
>   numericEnumFromThen n m =  iterate (+(m-n)) n
>   numericEnumFromTo n m   =  takeWhile (<= m) (numericEnumFrom n)
>   numericEnumFromThenTo n m p = takeWhile (<= p) (numericEnumFromThen n m)
>
> The particular feature of note is that the odd fudge factor of (1/2 *
> the increment) is removed.  The inexact nature of floating point numbers
> would therefore cause a specification like
>
>     [ 0.0, 0.1 .. 0.3 ]
>
> to yield the sequence
>
>     [ 0.0, 0.1, 0.2 ]
>
> that is, to omit the upper bound, because (3 * 0.1) is actually
> represented as 0.30000000000004, strictly greater than 0.3.
>
> Phil argues that this behaviour is more desirable: "the simple fix is
> that the user must add a suitable epsilon to the upper bound.  The key
> word here is *suitable*.  The old definitions included completely
> bizarre and often highly unsuitable choices of epsilon."
>
> This proposal seems to me to improve the consistency of the enumeration
> syntax across both the integral and floating types.  Some users may
> still be surprised, but the surprise will be easier to explain.
>
> I am bringing this question to the attention of all who are interested
> in Haskell Prime, because it seems like a sensible and well-reasoned
> change.  Discussion on whether to adopt this proposal for H' is welcome.
>
> But as maintainer and bug-fixer of the Haskell'98 Report, I have also
> been asked whether we should make this change retrospectively to the
> Haskell'98 language (as a "typo").  Since it involves not merely an
> ordinary library function, but a Prelude function, and moreover a
> function that is used in the desugaring of syntax, it is less clear to
> me whether to alter Haskell'98.
>
> Thoughts?

It sounds like a bad fix to me.  It seems important that the
[0.0,0.1.. x] notation should work correctly in the common cases.  And
the common case really is that the final value is intended as an exact
multiple of the increment.

Why not look for a heuristic that gets the common cases right, rather
than going with an elegant wrong solution? After all, these
enumerations are most often used by people who neither care nor know
how they're implemented, but who most likely would prefer if haskell
worked as well as matlab, python, etc.

One reasonable option would be to actually take into account the
expected roundoff error (which isn't hard to compute for simple sums
like this).

It would also be a good idea to reduce that roundoff error by using
multiplication rather than addition to create the enumeration (which
also allows us to ensure that finite enumerations terminate).
e.g. it's a shame that length [1 .. 1e8 :: Float] fails to terminate.
Admittedly, this is a stupid thing to compute, but it's also stupid to
add many small numbers together in sequence, since it always serves to
increase the roundoff error.  It's true that most people's C code
would be just as naive, but we're writing Haskell, and you're talking
about the Prelude, which should be written intelligently, not using
the simplest code, in such a way that it won't bite programmers.

David