FFI Bindings to Libraries using GMP
Benedikt Huber
benjovi at gmx.net
Fri Sep 14 09:14:58 EDT 2007
> | I've been struggling using FFI bindings to libraries which rely on the
> | GNU Mp Bignum library (gmp).
> It's an issue that bites very few users, but it bites them hard.
> It's also tricky, but not impossible, to fix. The combination
> keeps meaning that at GHC HQ we work on things that affect more
> people. I doubt we can spare effort to design and implement a fix
> in the near future -- we keep hoping someone else step up and
> tackle it!
>
> Peter Tanski did exactly that (he's the author of the
> ReplacingGMPNotes above), but he's been very quiet recently. I
> don't know where he is up to. Perhaps someone else would like to
> join in?
>
Thank you for the information - I'm also willing to help, though I'm
not too familiar with the GHC internals (yet).
I do like the idea of optionally linking with a pure-haskell library,
but I'm interested in a solution with comparable performance.
Comments on the solutions proposed for ticket #311:
(1) Creating a custom variant of the gmp lib by renaming symbols and
possibly removing unnecessary functionality, as suggested by Simon
Marlow in ticket #311, would be relatively straightforward; I've
already tried this approach the other way round (i.e. recompiling
libraries to be used with the FFI). But it means that you'd have to
maintain and ship another library, so I guess it is not an option for
the GHC team.
(2) Using the standard allocation functions for the gmp memory
management (maybe as a compile flag), as suggested in
http://www.haskell.org/pipermail/glasgow-haskell-users/2006-July/010660.html,
would also resolve ticket #311.
In this case at least the dynamic part of gmp integers has to be
resized using external allocation functions, and a finalizer
(mpz_clear) has to be called when an Integer is garbage collected.
It seems that the performance loss from using malloc is significant
[1], as lots of allocations and reallocations of very small chunks
occur in a functional setting; some kind of (non-garbage-collected!)
memory pool allocation would certainly help. I'm not sure what
overhead is associated with calling a finalizer.
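To make the finalizer idea in (2) concrete, here is a minimal sketch that uses only the base library. It does not touch gmp at all: the Limbs type and the newLimbs, writeLimb and readLimb helpers are hypothetical stand-ins, and finalizerFree (i.e. free(3)) plays the role that mpz_clear would play in a real implementation. The buffer is allocated with malloc outside the GHC heap and released by a finalizer once the garbage collector finds it unreachable.

```haskell
module Main (main) where

import Data.Word (Word)
import Foreign.ForeignPtr (ForeignPtr, newForeignPtr, withForeignPtr)
import Foreign.Marshal.Alloc (finalizerFree, mallocBytes)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peekElemOff, pokeElemOff, sizeOf)

-- Hypothetical stand-in for the dynamically sized part of a big integer.
newtype Limbs = Limbs (ForeignPtr Word)

-- Allocate n zeroed limbs with malloc (n should be at least 1) and
-- register free(3) as the finalizer. With gmp the finalizer would be
-- mpz_clear instead; that part is an assumption of this sketch.
newLimbs :: Int -> IO Limbs
newLimbs n = do
  p <- mallocBytes (n * sizeOf (0 :: Word)) :: IO (Ptr Word)
  mapM_ (\i -> pokeElemOff p i 0) [0 .. n - 1]
  fp <- newForeignPtr finalizerFree p
  return (Limbs fp)

-- Write limb i; no bounds checking in this sketch.
writeLimb :: Limbs -> Int -> Word -> IO ()
writeLimb (Limbs fp) i w = withForeignPtr fp (\p -> pokeElemOff p i w)

-- Read limb i; no bounds checking in this sketch.
readLimb :: Limbs -> Int -> IO Word
readLimb (Limbs fp) i = withForeignPtr fp (\p -> peekElemOff p i)

main :: IO ()
main = do
  l <- newLimbs 4
  writeLimb l 0 42
  readLimb l 0 >>= print
```

Every value created this way costs one malloc call plus one finalizer registration, which is exactly the per-Integer overhead that the question above is about.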
(3) When replacing GMP with the BN library of OpenSSL (see
http://hackage.haskell.org/trac/ghc/wiki/ReplacingGMPNotes/PerformanceMeasurements),
it would probably be necessary to refactor the library so that custom
allocation can be used as well. This does not seem too difficult at
first glance, though.
So I'd like to investigate the second or third option, as far as my
knowledge and time permit.
Of course it would be wise to check first whether Peter Tanski is
already/still working on a GMP replacement.
Benedikt
[1]
Simple Performance Test on (ghc-darwin-i386-6.6.1):
The Haskell function (k was taken as 10M)
> test k = (iterateT k (fromIntegral (maxBound ::Int))) :: Integer
where
> iterateT 0 v = v; iterateT k v = v `seq` iterateT (k-1) (v+10000)
triggers around k allocations and k reallocations by the gmp library.
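For anyone who wants to repeat the measurement, a self-contained version of the function could look like the sketch below. The Main module wrapper, the type signatures and the timing via getCPUTime are additions for illustration and were not part of the original test setup, so absolute numbers will differ between GHC versions and machines.

```haskell
module Main (main) where

import Control.Exception (evaluate)
import System.CPUTime (getCPUTime)

-- Add 10000 to the accumulator k times, forcing it at every step so
-- that no chain of unevaluated additions builds up.
iterateT :: Int -> Integer -> Integer
iterateT 0 v = v
iterateT k v = v `seq` iterateT (k - 1) (v + 10000)

-- Start at maxBound :: Int, so the sum leaves the machine-word range
-- on the very first addition.
test :: Int -> Integer
test k = iterateT k (fromIntegral (maxBound :: Int))

main :: IO ()
main = do
  start <- getCPUTime                  -- CPU time in picoseconds
  r <- evaluate (test 10000000)        -- k = 10M, as in the text
  end <- getCPUTime
  putStrLn ("result: " ++ show r)
  putStrLn ("CPU time (ms): " ++ show ((end - start) `div` 1000000000))
```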
The rough C equivalent, calling sequences of
> malloc(3), mpz_init_set(gmp), mpz_add_ui(gmp), mpz_clear(gmp) and free(3),
takes more than 2 times as long, with 25% of the time spent in
allocating and freeing pointers to gmp integers (mpz_ptr) and 50% of
the time spent in gmp allocator functions (i.e. resizing gmp integers
= (re)allocating limbs).
I also performed the test with the datatype suggested by John
Meacham (using a gmp library with renamed symbols),
> data FInteger = FInteger Int# !(ForeignPtr Mpz)
but it was around 8x slower, maybe due to the ForeignPtr and FFI
overhead, or due to missing optimizations in the code.