Replacement for GMP
Peter Tanski
p.tanski at gmail.com
Tue Aug 1 17:35:06 EDT 2006
Esa,
> What I have written here might not be the most useful guide to
> start with, but maybe it is of help for other interested souls.
Many thanks for the notes; it would probably be better if more than
one programmer worked on it.
> * The memory handling:
> The idea on most bignum libs is that they have c-structure kinda
> like this:
> struct Big {
> word size, used;
> digits* payload;
> bool sign;
> } ;
> Now, the size and used tell how much memory is allocated for
> payload
> and how much of it is used. sign is the sign (plus/minus).
> payload is a pointer to memory that contains Integer decoded.
>
> ... Before we ...
> call math-lib, we put together a temporary structure with correct
> pointers. As for target variable, we have hooked the mathlibs
> memory allocation functions to allocate correctly. Upon returning
> Integer, we just take payload, write sign on correct place and
> return the payload-pointer (possibly adjusted).
>
> In pseudo C
> digits* add(digits* din) {
> Big in, out;
> in.size=getLength(din);
> in.used=getLength(din);
> in.payload=din;
> in.sign=getSign(din);
> math_lib_init(out);
> math_lib_add(out, in);
> writeSign(out.payload, out.sign);
> return out.payload;
> }
Sorry to take more of your time, but what do you mean by allocate
"correctly?"
(This may sound naive): the fields of in { size, used, payload, sign }
are all parts of the info-table for the payload, and the RTS
re-initialises the mathlib on each invocation, right?
In the thread "returning to cost of Integer", John Meacham wrote:
> we could use the standard GMP that comes with the system since
> ForeignPtr will take care of GCing Integers itself.
From the current discussion: at present the allocation is done on the GC
heap, when it could be done entirely by the mathlib. The benefit to
letting the mathlib handle memory would be that you could use the
mathlib with another (non-Haskell) part of your program at the same
time (see, e.g., (bug) Ticket #311).
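
To make the conflict concrete, here is a minimal sketch of a library whose allocator is a process-wide hook. GMP exposes a hook of this kind as mp_set_memory_functions; the code below does not call GMP, and every name in it (lib_set_memory_functions, arena_alloc, lib_new_digits and so on) is hypothetical. The fixed arena only stands in for a runtime-managed heap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical hook table: the library routes every allocation through
   these two pointers. None of these names come from GMP or GHC. */
typedef void *(*alloc_fn)(size_t bytes);
typedef void  (*free_fn)(void *ptr);

static alloc_fn lib_alloc = malloc;
static free_fn  lib_free  = free;

/* Installing hooks is process-wide: every caller of the library sees them. */
void lib_set_memory_functions(alloc_fn a, free_fn f) {
    lib_alloc = a ? a : malloc;
    lib_free  = f ? f : free;
}

/* Stand-in for a runtime-managed heap: a bump allocator over a fixed arena. */
static unsigned long arena[512];
static size_t arena_used = 0;   /* words handed out so far */

void *arena_alloc(size_t bytes) {
    size_t words = (bytes + sizeof(unsigned long) - 1) / sizeof(unsigned long);
    if (words > 512 - arena_used) return NULL;
    void *p = &arena[arena_used];
    arena_used += words;
    return p;
}

/* Nothing to do here: a collector, not the library, would reclaim the space. */
void arena_free(void *ptr) { (void)ptr; }

/* Reports whether a pointer lies inside the arena. */
int in_arena(const void *p) {
    uintptr_t a = (uintptr_t)p, lo = (uintptr_t)arena;
    return a >= lo && a < lo + sizeof arena;
}

/* What a library routine does internally when it needs digit storage. */
void *lib_new_digits(size_t count) {
    assert(count > 0);
    return lib_alloc(count * sizeof(unsigned long));
}

void lib_release_digits(void *p) { lib_free(p); }
```

Once the hooks are installed, any other code in the same process that calls lib_new_digits also receives arena memory, whether or not it knows about the runtime. That is the sharing problem in a nutshell.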
(I am making an educated guess, here.) You probably chose to allocate
GMP's memory on the GC heap because:
(1) call-outs to another program are inherently impure since the type-
system and execution order are not defined by the Haskell Runtime; and,
(2) it was a stable way to ensure that the allocated memory would
remain available to the thunk for lazy evaluation, i.e., so that the
evaluation of the returned Bignum could be postponed indefinitely,
correct? Or could the evaluation itself be postponed until the value
was called for--making operations on Integers and other Bignums lazy?
In other words, it does not seem possible to simply hold a ForeignPtr
to the returned value unless there were a way to release the memory
when it was no longer needed. If you wanted the mathlib to retain
the value on behalf of GHC, you would have to modify the library
itself. In the end you have a specialized version of the library and
a change to the procedure from:
    math_lib_init ; ...
    return out.payload ;
to:
    math_lib_init ;
    math_lib_evaluate ;
    math_lib_free ;
An easier though less-efficient alternative would be to have GHC copy
the value returned by the mathlib. That would be stable and allow
other systems to use the same mathlib concurrently (assuming the lib
is thread-safe).
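
A rough sketch of the copying alternative, under stated assumptions: the Big layout follows the structure quoted earlier, the math_lib_* functions are toy stand-ins rather than any real library's API, and the destination buffer passed to host_add stands in for storage the runtime manages itself. The addition handles non-negative operands only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef unsigned long digit;

/* Library-owned bignum, following the layout quoted earlier. */
typedef struct {
    size_t size, used;   /* digits allocated / digits in use */
    digit *payload;      /* least-significant digit first */
    bool   sign;         /* true when negative */
} Big;

/* Hypothetical library call: allocate zeroed storage for a result. */
bool math_lib_init(Big *b, size_t size) {
    if (size == 0) size = 1;
    b->payload = calloc(size, sizeof(digit));
    b->size = b->payload ? size : 0;
    b->used = 0;
    b->sign = false;
    return b->payload != NULL;
}

/* Hypothetical library call: release the library-owned storage. */
void math_lib_free(Big *b) {
    free(b->payload);
    b->payload = NULL;
    b->size = b->used = 0;
}

/* out = x + y for non-negative x and y, with carry propagation. */
void math_lib_add(Big *out, const Big *x, const Big *y) {
    size_t n = x->used > y->used ? x->used : y->used;
    assert(out->size >= n + 1);
    digit carry = 0;
    for (size_t i = 0; i < n; i++) {
        digit a = i < x->used ? x->payload[i] : 0;
        digit b = i < y->used ? y->payload[i] : 0;
        digit s = a + b;          /* may wrap around */
        digit t = s + carry;      /* may wrap around */
        carry = (digit)(s < a) | (digit)(t < s);
        out->payload[i] = t;
    }
    out->payload[n] = carry;
    out->used = carry ? n + 1 : n;
    out->sign = false;
}

/* Copy the result into caller-managed storage.
   Returns the number of digits copied, or (size_t)-1 if dst is too small. */
size_t copy_out(digit *dst, size_t dst_len, const Big *src) {
    if (src->used > dst_len) return (size_t)-1;
    memcpy(dst, src->payload, src->used * sizeof(digit));
    return src->used;
}

/* init, evaluate, copy, free: nothing library-owned survives the call. */
size_t host_add(digit *dst, size_t dst_len,
                digit *x, size_t xn, digit *y, size_t yn) {
    Big a = { xn, xn, x, false };
    Big b = { yn, yn, y, false };
    Big out;
    size_t need = (xn > yn ? xn : yn) + 1;
    if (!math_lib_init(&out, need)) return (size_t)-1;
    math_lib_add(&out, &a, &b);
    size_t copied = copy_out(dst, dst_len, &out);
    math_lib_free(&out);
    return copied;
}
```

The cost is one extra memcpy per operation. In exchange the library keeps its own allocator, so other code in the process can use it without caring where the host stores its copies.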
The third alternative I suggested previously was to embed the Bignum
processing in GHC itself. I think it would be very difficult to
maintain a solution that was both optimised and portable, at least in
C--. (I may be way off here; I am simply going by a rudimentary
knowledge of BLAS implementations.)
If I am correct about (2) above, the best conclusion I could draw
from this is that the easiest solution would be to copy the memory on
return from the mathlib.
> There are tricky parts for 64bit-stuff in 32bit systems and some
> floating point decoding uses bad configure-stuff that depends on
> math lib stuff, but mostly it's very boring hundreds of lines of
> C-- (but you can make the job much easier by using preprocessor).
I was reading through the Makefiles and headers in ghc's main include
directory about this. One of the big ToDo's seems to be to correct
the method of configuring this stuff using machdep.h or the
equivalent on a local system, such as the sysctl-headers on Darwin.
For C-- this seems like it would be a bit more difficult than simply
confirming whether (or how) the C implementation conforms to the
current standard through the usual header system.
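
For comparison, this is roughly what the C side of that check looks like when it relies only on the standard headers. It is a sketch, not GHC's actual configuration: HOST_WORD_BITS, Split64, split64, join64 and host_word_bits are hypothetical names, and the assumption that pointer width equals the machine word size holds on common platforms but not everywhere.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Derive the word size from <stdint.h> instead of a configure-time probe.
   HOST_WORD_BITS is a hypothetical macro name. */
#if UINTPTR_MAX == 0xFFFFFFFFu
#  define HOST_WORD_BITS 32
#elif UINTPTR_MAX == 0xFFFFFFFFFFFFFFFFu
#  define HOST_WORD_BITS 64
#else
#  error "unsupported pointer width"
#endif

/* A 64-bit value carried as two 32-bit halves, as a 32-bit target
   without native 64-bit registers would need to do. */
typedef struct { uint32_t hi, lo; } Split64;

Split64 split64(uint64_t v) {
    Split64 s = { (uint32_t)(v >> 32), (uint32_t)(v & 0xFFFFFFFFu) };
    return s;
}

uint64_t join64(Split64 s) {
    return ((uint64_t)s.hi << 32) | s.lo;
}

/* Run-time cross-check of the compile-time answer. */
int host_word_bits(void) {
    assert(CHAR_BIT == 8);
    return (int)(sizeof(void *) * CHAR_BIT);
}
```

C-- has no equivalent of these headers to lean on, which is why the same check there would have to come from the build configuration.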
> This is why APPREC might be hard - you need to know the internal
> representation.
> GHC C-- unfortunately is not really near the C-- spec, it doesn't
> first of all implement it all - but that doesn't matter for this
> task - and then it has some extensions for casting and structure
> reading, I think.
These are really great suggestions. GHC's code (including the .cmm
files) seems very well commented.
> Fast machine is useful, but tweaking build options
> and having a good book/movies/other machine gets it done. It'd be
> very very useful to have 64-bit platform (in practice 64bit
> linux ia32_64) for testing. I'd say OS X is not a good platform
> to do this devel on, but I might be wrong on that. Linux being
> best tested OS, it is the most safe bet.
Yes, I have a slow machine to work with but that just means I have
more incentive to think carefully before I try something out. The
64bit part is a problem but that should be well-handled if the
configuration mess could be cleaned up. I have 32bit Linux and
Windows systems I can use (both very slow, as well) but if the fix
holds to the standards it should work...
I am not sure yet, but I might make it easier on myself by cleaning
up the configuration; the move was to Cabalize GHC, but I think that
goal was for the Haskell code since the rest would have to remain in
scripts, anyway. Would it be simpler to move to a more configurable
system like Bakefile or SCons for putting a heterogeneous
build together?
Best regards,
Peter