Replacement for GMP: Update
Peter Tanski
p.tanski at gmail.com
Thu Aug 10 20:00:40 EDT 2006
Thorkil,
> I would like to mention a few things that I have not seen
> discussed: Clearly,
> using an existing library unmodified would be preferable: New
> developments,
> error corrections, documentation, wide exposure, all of these
> things would be
> available.
Agreed. Unfortunately for us it seems that, apart from GMP, none of
the fully developed and fast libraries available have all the
functionality Haskell requires: ARPREC (and its predecessor, MPFUN)
lack all binary operators (AND, OR, XOR, left-shift, right-shift, bit-
flip, word-flip); OpenSSL's BN library lacks AND, OR, XOR and the
least common multiple operation (lcm), as do the libraries in Botan
and Crypto++. I did not mention this before, but they also lack
conversion functions to and from floating point representations. So
it would not be possible to use these in GHC and still maintain the
same functionality without writing those transformation functions
either separately (slow) or in the libraries themselves (modification).
There are libraries that have the functionality we require: MIRACL
(at http://indigo.ie/~mscott/) is free for educational use but
requires a premium for commercial use; LibTomMath is essentially beta
(version 0.38), slow (less than half as fast as OpenSSL, which is
equivalent in speed to GMP) and from my experience when putting it
together, a little hacked (after all, it is a beta version); MPI
(from http://www.cs.dartmouth.edu/~sting/mpi/) is a good library and
has all the math functions but lacks the binary operations and has
not been updated since 2003; maybe it has not required updating (it
was originally written in 1998). I have read that MPI is not as fast
as LibTomMath (I will do a real comparison if anyone is interested).
I do know from timing tests I did run in the MPI distribution that
MPI tends to operate better on small integers (128-256 bits).
LibTomMath and MPI are both essentially public domain. I have heard
that LibTomMath is used as the Tcl bignum library; I do not know that
library supports Perl's bignum (Math::BigInt) but I could look into it.
Bignum libraries are actually fairly widely available but few are
portable across a wide variety of domains: Microsoft's .NET bignum
library and Apple's vecLib BigNum library, for example. (Apple's
vecLib vBigNum is also limited since it offers big integers only at
256, 512 and 1024 bits and it does not provide all the mathematical
operations we require.)
You may have noticed from reading the discussions so far that an
internal library *is* a possibility.
> I have looked briefly at the OpenSSL Bignum library and in the
> areas of memory
> management, but also error handling, it seems clearly intertwined
> to some
> extent with OpenSSL in ways which would appear to rule out the
> direct use of
> this library for GHC Integers.
OpenSSL's BN library is primarily tuned to support cryptography,
particularly the generation of very large primes for public key
cryptosystems. It is possible to separate the BN library out (I am
having some success there already). It is also possible to use the
library separately from Haskell using ForeignPtr; essentially doing
everything through Haskell's FFI. I have honestly not benchmarked a
FFI-ForeignPtr interface against the current internal-GMP
implementation, partly because the overhead required to use
ForeignPtr and the availability of garbage-collected memory for GMP
indicate that an internal GHC Bignum library would clearly be
faster. Many people have given good arguments for using an FFI-
interface to such a library (for one, GMP supports dll's and posix
dynamic libraries, which would eliminate the licensing issue), so I
think before I go on I will set up a benchmark to try the two out.
The one thing I cannot do without taking GMP out of the GHC compiler
is a side-to-side comparison of GMP-FFI and GMP-internal because GMP
can use only one memory allocator at a time: either GHC's runtime
system Garbage Collector or malloc.
> However, considering the advantages of using
> an existing library unchanged, we might consider another
> possibility: Working
> with the OpenSSL people to modify their library to allow the sort of
> interfacing that is needed for its direct and efficient use in GHC.
> While, of
> course, retaining its value as part of OpenSSL.
I haven't looked at that for reasons of time--perhaps this is my
personal target and perhaps it is for the benefit of the next GHC
release on 26 August: it may take a long time to work with OpenSSL to
modify their library. It might be worth looking into, just to cover
all bases. The problem for me is that I seem to be doing all the
work, albeit rather slowly; I simply do not have the time to follow
every possibility.
> (And way further back: Have we tried to discuss the LGPL licence of
> GMP with
> the authors? I am not really into all these matters, sorry if this
> doesn't
> make sense.)
That is actually a good idea; it has been suggested before for an
individual developer. I doubt very much the Free Software Foundation
would want to give an exception for one of their flagship products
but I admit, that kind of thinking is, well, cowardly. As for
whether I am the person to ask them, I could float the question but I
could not represent the GHC Team.
> Failing that, I would suggest considering the development of the
> modified
> library to a form that would allow independent use, apart from its
> use in
> GHC. This would add valuable possibilities to your options when
> choosing the
> precise mixture of Haskell and, perhaps, raw C code that best
> balances your
> performance desires and needs for convenience.
Before I actually bind any library with GHC, I will thoroughly test
it as a stand-alone library and I will develop some good benchmarks
for it as an FFI-based library (using ForeignPtr). The Creators of
the GHC, however, have given good arguments why it is Very Good to
use GHC's runtime system Garbage Collector to handle the Integer-
memory for bignum libraries: it is almost certainly faster, it allows
the runtime system's Scheduler to operate on Integers without
resorting to the creation or management of Operating System (OS)
threads and in general the evaluation of Integers is more a "native"
than a "foreign" operation in Haskell programs.
I hope I have answered a few of your points; some of them have given
me more work but that is all right, I guess. You did say that you
frequently use Integers in Haskell. If you have the time, would you
be able to find any programs you might have that displayed poor
Integer performance or possibly bottlenecks in the current system?
It might be helpful for benchmarking the replacement--this is really
supposed to be an "upgrade," after all.
Best regards,
Peter
More information about the Glasgow-haskell-users
mailing list