Proposal: fix Enum Double instance
Twan van Laarhoven
twanvl at gmail.com
Fri Jul 5 15:50:31 CEST 2013
Hello All,
I would like to make a counterproposal in the Enum Double debate: Instead of
deprecating or removing the instances, how about just fixing them?
A perfect instance for Enum Double is not possible, because arithmetic is
inexact. But you can actually get awfully close. I.e. instead of allowing the
final value to be at most step/2 past the end, we can allow it to be at most
about 2e-16*step past the end. In many practical applications this is close
enough to not be a problem.
Now for some more detail on the design:
* First of all, Haskell's enumFromThenTo is stupid, because calculating
step=then-from destroys numerical accuracy. So I will focus on implementing a
function enumFromStepTo. You can still implement enumFromThenTo on top of it.
But it might also make sense to expose enumFromStepTo.
* It is possible to write an exact instance for Rational. It is IMO unacceptable
that Haskell currently does not use this correct instance. I will use this
Rational instance as a baseline for comparison.
instance EnumFromStepTo Rational where
enumFromStepTo f s t
| s >= 0 && f <= t = f : enumFromStepTo (f + s) s t
| s < 0 && f >= t = f : enumFromStepTo (f + s) s t
| otherwise = []
* Writing a loop naively, by recursing with from' = from+step will result in
accumulating the error of the addition. On the other hand, Doubles can represent
integer numbers exactly up to 2^53, so it is better to use values from+i*step.
* Assume for simplicity that step>0. The stopping condition will then be of the
form from+i*step-to > 0. When this quantity is 0 then the final value should be
included, otherwise it shouldn't be.
* Now for an error analysis. Assume that we start with
f,s,t :: Rational
and call
let f' = fromRational f
let s' = fromRational s
let t' = fromRational t
enumFromThenTo f' s' t'
The fromRational function rounds a rational number to the nearest
representable Double. The relative error in f' is therefore bounded by abs
(f-f') ≤ ε * abs f, where ε is the half the maximum distance between two
adjacent doubles, which is about 1.11e-16. similarly for s' and t'. the result
of an addition or multiplication operation will also be rounded to the nearest
representable Double.
So, the total error in our stopping condition is bounded by
err(f+i*s-t)
≤ ε * abs f -- representing f
+ ε * i * abs s -- representing s, amplified by i
+ ε * i * abs s -- error in calculating (*), i * s
+ ε * abs (f + i*s) -- error in calculating (+), f + (i*s)
+ ε * abs t -- representing t
+ ε * abs (f + i*s - t) -- error in calculating (-)
Note that the last term of the bound is the thing we are bounding. So we can
handle that by picking a slightly larger epsilon, eps = ε/(1-ε).
This leads to the following implementation of enumFromStepTo:
enumFromStepToEps eps !f !s !t = go 0
where
go i
| s >= 0 && x <= t = x : go (i+1)
| s < 0 && x >= t = x : go (i+1)
| abs (x - t) < eps * bound = [x]
| otherwise = []
where
x = f + i * s
bound = abs f + 2 * abs (i * s) + abs t + abs x
with eps = 1.12e-16 :: Double
or eps = 5.97e-8 :: Float
I have done extensive experiments with this implementation and many variants,
and I believe it will work in all cases.
Now for enumFromTo, i.e. step=1 we could make a special case, since we know that
step is exactly equal to 1, so there is no representation error. We could get
rid of the multiplication, and replace i*s by 0 in the error calculation.
Another issue that remains is enumFromThenTo. If we take
step = then - from
then the relative error in step is bounded by
|step' - step| ≤ ε * (|then| + |from| + |then - from|).
This error gets multiplied by i, so it can unfortunately become quite large.
I think it would be best if we use the more general
enumFromStepToEps' !eps !f !s !sErr !t = go 0
where
go i
| s >= 0 && x <= t = x : go (i+1)
| s < 0 && x >= t = x : go (i+1)
| abs (x - t) < eps * bound = [x]
| otherwise = []
where
x = f + i * s
bound = abs f + abs (i * s) + abs (i * sErr) + abs t + abs x
enumFromStepTo f s t = enumFromStepToEps' eps f s s t
enumFromTo f t = enumFromStepToEps' eps f 1 0 t
enumFromThenTo f h t = enumFromStepToEps' eps f (h - f)
(abs h + abs f + abs (h - f)) t
I have also looked at what other language implementations do. So far I have
found that:
* Octave uses a similar method, with the bound
3 * double_epsilon * max (abs x) (abs t),
which is less tight than my bound.
see
http://hg.savannah.gnu.org/hgweb/octave/file/787de2f144d9/liboctave/array/Range.cc#l525
* numpy just uses an array with length ceil((to-from)/step)
see
https://github.com/numpy/numpy/blob/master/numpy/core/src/multiarray/ctors.c#L2742
It therefore suffers from numerical inaccuracies:
$ python
>>> from numpy import arange
>>> notone = 9/7. - 1/7. - 1/7.
>>> notone
1.0000000000000002
>>> len(arange(1,1))
0
>>> len(arange(1,notone))
1
In summary:
* change Enum Rational to the always correct implementation
* change Enum Double and Enum Float to the almost-always correct
implementation propose above.
* (optional) expose the enumFromStepTo function.
Twan
More information about the Libraries
mailing list