FFI Bindings to Libraries using GMP

Benedikt Huber benjovi at gmx.net
Fri Sep 14 09:14:58 EDT 2007


> | I've been struggling using FFI bindings to libraries which rely on the
> | GNU Mp Bignum library (gmp).
> It's an issue that bites very few users, but it bites them hard.   
> It's also tricky, but not impossible, to fix.  The combination  
> keeps meaning that at GHC HQ we work on things that affect more  
> people. I doubt we can spare effort to design and implement a fix  
> in the near future -- we keep hoping someone else step up and  
> tackle it!
>
> Peter Tanski did exactly that (he's the author of the  
> ReplacingGMPNotes above), but he's been very quiet recently.   I  
> don't know where he is up to.  Perhaps someone else would like to  
> join in?
>
Thank you for the information. I'm also willing to help, though I'm
not too familiar with the GHC internals (yet).
I do like the idea of optionally linking with a pure-Haskell library,
but I'm interested in a solution with comparable performance.
Comments on the proposed solutions to ticket #311:

(1) Creating a custom variant of the gmp lib by renaming symbols and
possibly removing unnecessary functionality, as suggested by Simon
Marlow in ticket #311, would be relatively straightforward; I've
already tried this approach the other way round (i.e. recompiling the
libraries that are to be used via the FFI). But it means that you'd
have to maintain and ship another library, so I guess it is not an
option for the GHC team.

(2) Using the standard allocation functions for the gmp memory
management (maybe as a compile flag), as suggested in
http://www.haskell.org/pipermail/glasgow-haskell-users/2006-July/010660.html
would also resolve ticket #311.
In this case at least the dynamic part of gmp integers has to be
resized using external allocation functions, and a finalizer
(mpz_clear) has to be called when an Integer is garbage collected.
It seems that the performance loss from using malloc is significant
[1], as lots of allocations and reallocations of very small chunks
occur in a functional setting; some kind of (non-garbage-collected!)
memory pool allocation would certainly help. I'm not sure what
overhead is associated with calling a finalizer.
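
To make the finalizer part of option (2) concrete, here is a minimal
sketch of a value that lives outside the Haskell heap and is released by
a finalizer once it becomes unreachable. It does not use gmp at all: the
type ExtInt and the functions newExtInt, readExtInt and addExtInt are
hypothetical names, a single 64-bit word stands in for the limb array,
and free stands in for mpz_clear. Only the ForeignPtr mechanics are
meant to carry over.

```haskell
import Data.Int (Int64)
import Foreign.ForeignPtr (ForeignPtr, newForeignPtr, withForeignPtr)
import Foreign.Marshal.Alloc (finalizerFree, malloc)
import Foreign.Storable (peek, poke)

-- | Hypothetical stand-in for an externally allocated bignum: one 64-bit
-- word in malloc'd memory instead of a gmp limb array.
newtype ExtInt = ExtInt (ForeignPtr Int64)

-- | Allocate outside the Haskell heap and register a finalizer that the
-- runtime may run once the value is unreachable. A gmp-backed version
-- would register a pointer to mpz_clear instead of free (assumption).
newExtInt :: Int64 -> IO ExtInt
newExtInt n = do
  p <- malloc                       -- sized via the Storable Int64 instance
  poke p n
  ExtInt <$> newForeignPtr finalizerFree p

-- | Read the stored word while keeping the ForeignPtr alive.
readExtInt :: ExtInt -> IO Int64
readExtInt (ExtInt fp) = withForeignPtr fp peek

-- | Every arithmetic result needs a fresh external allocation plus a
-- finalizer registration; this is the per-operation cost in question.
addExtInt :: ExtInt -> Int64 -> IO ExtInt
addExtInt x d = do
  v <- readExtInt x
  newExtInt (v + d)

main :: IO ()
main = do
  a <- newExtInt 41
  b <- addExtInt a 1
  readExtInt b >>= print
```

Each result pays for one malloc, one finalizer registration and,
eventually, one finalizer call, so a loop over addExtInt would give a
rough feeling for the overhead asked about above.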

(3) So when replacing GMP with the BN library of OpenSSL (see
http://hackage.haskell.org/trac/ghc/wiki/ReplacingGMPNotes/PerformanceMeasurements),
it would probably be necessary to refactor the library so that custom
allocation can be used as well. This does not seem too difficult at
first glance, though.

So I'd like to investigate the second or third option, as far as my
knowledge and time permit.
Of course it would be wise to check first whether Peter Tanski is
already/still working on a GMP replacement.

Benedikt


[1]
Simple performance test (ghc-darwin-i386-6.6.1):

The Haskell function (k was taken as 10M)
 > test k = (iterateT k (fromIntegral (maxBound :: Int))) :: Integer
 >   where
 >     iterateT 0 v = v
 >     iterateT k v = v `seq` iterateT (k-1) (v+10000)
triggers around k allocations and k reallocations by the gmp library.
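
For anyone who wants to repeat the measurement, a self-contained version
of the loop could look like the sketch below. The main function and the
timing via getCPUTime are my additions for illustration and are not the
harness used for the numbers in this mail; absolute timings will differ
between GHC versions and machines.

```haskell
import Control.Exception (evaluate)
import System.CPUTime (getCPUTime)

-- | Start at maxBound :: Int so that the results of the additions no
-- longer fit into a machine word.
test :: Int -> Integer
test k = iterateT k (fromIntegral (maxBound :: Int))
  where
    iterateT :: Int -> Integer -> Integer
    iterateT 0 v = v
    iterateT n v = v `seq` iterateT (n - 1) (v + 10000)

main :: IO ()
main = do
  let k = 10000000                  -- 10M iterations, as in the test
  start <- getCPUTime
  r <- evaluate (test k)            -- force the result before stopping the clock
  end <- getCPUTime
  putStrLn ("result: " ++ show r)
  -- getCPUTime reports picoseconds; 10^9 ps = 1 ms
  putStrLn ("cpu ms: " ++ show ((end - start) `div` 1000000000))
```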

The rough C equivalent, calling sequences of
 > malloc(3), mpz_init_set(gmp), mpz_add_ui(gmp), mpz_clear(gmp) and free(3),
takes more than twice as long, with 25% of the time spent
allocating and freeing pointers to gmp integers (mpz_ptr) and 50% of
the time spent in gmp allocator functions (i.e. resizing gmp integers
= (re)allocating limbs).

I also performed the test with the datatype suggested by John
Meacham (using a gmp library with renamed symbols),
 > data FInteger = FInteger Int# !(ForeignPtr Mpz)
but it was around 8x slower, maybe due to the ForeignPtr and FFI
overhead, or due to missing optimizations in the code.

