Repair to floating point enumerations?
Christopher Lane Hinson
lane at downstairspeople.org
Wed Oct 15 17:23:43 EDT 2008
I agree with David, we should be using multiplication, not addition.
However, I think that under the law of least surprise, we should
require that for all a,b,z:
all (\x -> x >= a && x < z || x <= a && x > z) [a,b..z].
For example, anything in the neighborhood of this is just unfair, even
if it's within David's fudge factor:
Prelude> map (\x -> 1 / (x-0.6)) [0,0.1..0.55]
[-1.6666666666666667,-2.0,-2.5,-3.333333333333334,-5.000000000000001,-10.000000000000002,Infinity]
If I want to include the terminating value, then what I really want is
probably some f such that:
f (6 :: Int) 0 0.55 = [0,0.11,0.22,0.33,0.44,0.55]
--Lane
On Wed, 15 Oct 2008, David Roundy wrote:
> Here's a counter-proposal:
>
> numericEnumFromThenTo :: RealFloat a => a -> a -> a -> [a]
> numericEnumFrom 0 = map fromInteger [1..]
> numericEnumFrom n = map ((n+).fromInteger) [1..]
> numericEnumFromThen n m = map (\x -> n+fromInteger x*(m-n)) [1..]
> numericEnumFromTo n m = takeWhile (<= m*(1 + epsilon)) (numericEnumFrom n)
> numericEnumFromThenTo n m p = takeWhile (<= p*(1 + 2*epsilon)) (numericEnumFromThen n m)
>
> epsilon :: RealFloat a => a
> epsilon = 1/2^(floatDigits (undefined :: a))
>
> This uses quite a reasonable approximation for the roundoff error, and
> has the advantage of not inappropriately returning _|_. It does
> sometimes create duplicate entries in the list, but I think that is
> better that returning an infinite list of duplicate entries as the
> code proposed below does.
>
> And yes, the fuzzy comparison is a bit ugly, but at least it means
> that every user is not forced to implement fuzzy comparison in their
> quick and dirty code (which is the only thing this syntax is good
> for).
>
> David
>
> On Wed, Oct 15, 2008 at 03:55:09PM +0100, Lennart Augustsson wrote:
>> I'm sorry, but people who write [0.0,0.1 .. x] where x is a multiple
>> of 0.1 get exactly what they deserve if x is not included. Floating
>> point numbers are not the real numbers, and the sooner they learn that
>> the better. We can fudge this all we like, but 0.1 is never going to
>> be exactly representable as a binary floating point number no matter
>> what we do.
>>
>> On Wed, Oct 15, 2008 at 3:44 PM, David Roundy <droundy at darcs.net> wrote:
>>> On Wed, Oct 15, 2008 at 10:41:25AM +0100, Malcolm Wallace wrote:
>>>> Dear Haskell-Primers (and libraries).
>>>>
>>>> Recently, Phil Wadler has pointed out a weird anomaly in the Haskell'98
>>>> Prelude, regarding numeric enumerations for Floats/Doubles:
>>>>
>>>> Prelude> [0, 0.3 .. 1.1]
>>>> [0.0,0.3,0.6,0.899999,1.2]
>>>>
>>>> What is odd is that the last value of the list is much larger than the
>>>> specified termination value. But this is exactly as specified by the
>>>> Haskell'98 Report.
>>>>
>>>> http://haskell.org/onlinereport/basic.html#sect6.3.4
>>>>
>>>> "For Float and Double, the semantics of the enumFrom family is given
>>>> by the rules for Int above, except that the list terminates when the
>>>> elements become greater than e3+i/2 for positive increment i, or
>>>> when they become less than e3+i/2 for negative i.
>>>>
>>>> We have discussed this question (and related ones, such as whether Float
>>>> and Double belong in the Enum class at all) several times before, and I
>>>> do not wish to rehash all of those points again e.g.:
>>>>
>>>> http://www.cse.unsw.edu.au/~dons/haskell-1990-2000/msg07289.html
>>>> http://www.haskell.org/pipermail/haskell/2001-October/008218.html
>>>> http://www.haskell.org/pipermail/haskell/2002-October/010607.html
>>>>
>>>> Phil proposes that, although retaining the instances of Enum for Float
>>>> and Double, we simplify the definitions of the numericEnumFrom family:
>>>>
>>>> numericEnumFromThenTo :: (Fractional a, Ord a) => a -> a -> a -> [a]
>>>> numericEnumFrom = iterate (+1)
>>>> numericEnumFromThen n m = iterate (+(m-n)) n
>>>> numericEnumFromTo n m = takeWhile (<= m) (numericEnumFrom n)
>>>> numericEnumFromThenTo n m p = takeWhile (<= p) (numericEnumFromThen n m)
>>>>
>>>> The particular feature of note is that the odd fudge factor of (1/2 *
>>>> the increment) is removed. The inexact nature of floating point numbers
>>>> would therefore cause a specification like
>>>>
>>>> [ 0.0, 0.1 .. 0.3 ]
>>>>
>>>> to yield the sequence
>>>>
>>>> [ 0.0, 0.1, 0.2 ]
>>>>
>>>> that is, to omit the upper bound, because (3 * 0.1) is actually
>>>> represented as 0.30000000000004, strictly greater than 0.3.
>>>>
>>>> Phil argues that this behaviour is more desirable: "the simple fix is
>>>> that the user must add a suitable epsilon to the upper bound. The key
>>>> word here is *suitable*. The old definitions included completely
>>>> bizarre and often highly unsuitable choices of epsilon."
>>>>
>>>> This proposal seems to me to improve the consistency of the enumeration
>>>> syntax across both the integral and floating types. Some users may
>>>> still be surprised, but the surprise will be easier to explain.
>>>>
>>>> I am bringing this question to the attention of all who are interested
>>>> in Haskell Prime, because it seems like a sensible and well-reasoned
>>>> change. Discussion on whether to adopt this proposal for H' is welcome.
>>>>
>>>> But as maintainer and bug-fixer of the Haskell'98 Report, I have also
>>>> been asked whether we should make this change retrospectively to the
>>>> Haskell'98 language (as a "typo"). Since it involves not merely an
>>>> ordinary library function, but a Prelude function, and moreover a
>>>> function that is used in the desugaring of syntax, it is less clear to
>>>> me whether to alter Haskell'98.
>>>>
>>>> Thoughts?
>>>
>>> It sounds like a bad fix to me. It seems important that the
>>> [0.0,0.1.. x] notation should work correctly in the common cases. And
>>> the common case really is that the final value is intended as an exact
>>> multiple of the increment.
>>>
>>> Why not look for a heuristic that gets the common cases right, rather
>>> than going with an elegant wrong solution? After all, these
>>> enumerations are most often used by people who neither care nor know
>>> how they're implemented, but who most likely would prefer if haskell
>>> worked as well as matlab, python, etc.
>>>
>>> One reasonable option would be to actually take into account the
>>> expected roundoff error (which isn't hard to compute for simple sums
>>> like this).
>>>
>>> It would also be a good idea to reduce that roundoff error by using
>>> multiplication rather than addition to create the enumeration (which
>>> also allows us to ensure that finite enumerations terminate).
>>> e.g. it's a shame that length [1 .. 1e8 :: Float] fails to terminate.
>>> Admittedly, this is a stupid thing to compute, but it's also stupid to
>>> add many small numbers together in sequence, since it always serves to
>>> increase the roundoff error. It's true that most people's C code
>>> would be just as naive, but we're writing Haskell, and you're talking
>>> about the Prelude, which should be written intelligently, not using
>>> the simplest code, in such a way that it won't bite programmers.
>>>
>>> David
>>> _______________________________________________
>>> Libraries mailing list
>>> Libraries at haskell.org
>>> http://www.haskell.org/mailman/listinfo/libraries
>>>
>> _______________________________________________
>> Libraries mailing list
>> Libraries at haskell.org
>> http://www.haskell.org/mailman/listinfo/libraries
> _______________________________________________
> Libraries mailing list
> Libraries at haskell.org
> http://www.haskell.org/mailman/listinfo/libraries
>
More information about the Haskell-prime
mailing list