ANN: HIntegerByInt
Isaac Dupree
isaacdupree at charter.net
Fri Aug 10 11:56:25 EDT 2007
LICENSE: BSD3 (or also similar, like GHC, etc.)
INITIAL AUTHOR: Isaac Dupree <id at isaac.cedarswampstudios.org>
What is this? It is a reimplementation, in Haskell, of the
Integer type that Haskell provides in its Prelude. It is designed
in mind of being actually usable as the implementation of that type,
for compilers. It is also a module that exports a working
Integer type. It is in terms of only basic Prelude functions, [],
and Int. It is NOT a purely inductive definition, because Int is
much faster than a purely inductive definition would allow, and
nevertheless often easier to come by (more portable, license-wise,
size-wise, nuisance-wise...) than GMP or other C bignum libraries.
/=========
darcs repo: http://isaac.cedarswampstudios.org/2007/HIntegerByInt
I'm rather proud of it in a silly way: I have run into bugs in *all*
major Haskell implementations I tried it with, while testing it (see
COMPILERS section). Reminds me somewhat of GMP triggering compiler bugs
actually:)
All the implementation and documentation is currently in one file
"IntegerInTermsOfInt.hs", except for three auxiliary modules for
testing, where CheckIntegerInTermsOfInt is a Main module that uses
QuickCheck. I have copied some of that documentation here.
From here, I offer it to anyone who is interested it using it and will
try to help as needed, but I'm not likely to search out compiler
internals myself just to try it in them. I didn't give it a version
number or Cabal file because it is completely useless on its own, as a
library, in view of Prelude already providing a very good Integer.
\==========
SPEED:
It is not too slow on small numbers (smallish constant - much
larger than for Int of course), not too slow on medium-size
numbers (which I've been testing it with), and not too slow on
large numbers (asymptotically; karatsuba multiplication,
O(n^1.585) is used to split up large numbers, and division by
large numbers uses multiplication and Newton's method).
Also see BUGS for the speed of 'show'.
CORRECTNESS:
It seems to be correct, after a fairly thorough million-iteration
QuickCheck in GHC plus a lot of quickcheck testing using
debugging-"Int"s that tell you when they overflow and have
(minBound,maxBound)=(-31,31). Each of these caught an incredible
number of bugs, which is why I am inclined to trust them.
Unfortunately, most Haskell implementations are somewhat incorrect
- see COMPILERS. Also see CAVEATS(c) for *very* large numbers.
CAVEATS:
a- It is obviously much slower than GMP. (although I don't know if
the penalties for calling primitive/foreign functions
counterbalance that for small numbers.)
b- It assumes that Int operations are fairly fast, although it
doesn't tend to waste them that much (e.g. it uses `quotRem`
when it needs both, which is almost always).
c- It is expected to break when handling values with magnitude
greater than around ( (maxBound::Int)^(maxBound::Int) ).
(just like GMP. Probably the assumption is that you'll run
out of memory at the same time, or that the operations will
take SO LONG, so no-one cares.
Prelude.length will (relatedly) also break at this point.)
USAGE AS NATIVE INTEGER:
Completely untried so far (search file for INTERESTING to
customize).
The algorithms are quite separable from the newtype Integer and
its instances, and Bits, Ix... parts can be separated out too
(in which case a proper export list would have to be made for the
algorithm functions, which might hinder optimization a little,
if assuming separate compilation...).
See <http://www.haskell.org/ghc/docs/
latest/html/libraries/base/Prelude.html#t%3AInteger>
for all things GHC's Prelude.Integer be an instance of, including
Typeable? Data? NFData? PrintfArg? Typeable? Random?
This is also a purpose CPP could be useful for, to define (#define?)
quotRemDInt and such type-specific things, conditionally on how
this Integer-implementation is being used.
The internal format is (currently) a list([]) of Ints
in base "intLargeBase", least-significant "digit" first
(negative x is represented by negating all elements of
the list that represents positive x)
(No most-significant zeroes are allowed
(so zero is represented by the empty list, for example))
("intLargeBase" is customizable, although there is an optimal
value for any particular size of Int, and a limit based on the
Int's size)
TODO/BUGS:
COMPILERS.
a- There is one WORKAROUND so Hugs can compile it and two for
nhc/yhc. Even so, I refused to keep a third workaround for
N/Yhc, but that is for a nyhc bug that Neil thinks should be
reasonably fixable.
b- though extensively QuickChecked in GHC, Hugs occasionally fails
QuickCheck, but when the particular example is run in Hugs,
it gives the correct answer! I think Hugs is buggy.
(Hugs Version September 2006)
c- ghc -O -fvia-C may miscompile "quotRem x (some power of two)".
(ghc trac#1603)
If QuickCheck fails badly on you, try adding -fasm.
d- Beware testing in interpreters - some (at least GHCi 6.6.1)
will default to Prelude.Integer even if the module has
"default ()".
Make show faster than O(n^2), e.g. see
http://darcs.haskell.org/packages/base/GHC/Num.lhs ?
...but converting that, it seemed altogether slower than the
current
[and there is more in IntegerInTermsOfInt.hs]
Isaac
More information about the Libraries
mailing list