GMP unboxed number representation

Marcin 'Qrczak' Kowalczyk qrczak@knm.org.pl
Mon, 7 May 2001 20:54:42 +0200


On Mon, May 07, 2001 at 07:45:59PM +0200, Hans Aberg wrote:

>   typedef struct  {
>     _mp_int _mp_s;     /* Small number representation. */
>     mp_limb_t *_mp_d;  /* Pointer to the limbs and all other data.  */
>   } __mpz_struct;
> where simply _mp_d is set to NULL if one is using the small number
> representation.

Does _mp_s have a meaning when _mp_d is not NULL? Or do you keep the
used size and allocated size under _mp_d? The latter is good that
it's not necessary to copy these sizes into gmp structures separately
but they live inside the ByteArray#.

There are no null ByteArray#s. The GC currently aborts if a null
pointer is found when a Haskell object is expected (C pointers are
tagged like Int# etc., i.e. as nonpointers, and ByteArray#s are tagged
like Haskell objects, even though they are not exactly normal Haskell
objects).

This could be probably changed (unless the benefit of having this
ASSERT for debugging the GC is large; I doubt it). Depending on this
I see two choices for Integer representation in ghc, with the second
looking really well:

1.
    data Integer
        = S# Int#
        | J# ByteArray# -- Size no longer needed here.

Primops have a hard case for returning the appropriate data.
This is ugly:
    case primSomething# args of
        (# 0, s, _ #) -> S# s
        (# _, _, d #) -> J# d
and I'm not sure that using a dummy ByteArray# and checking the
pointer equality would work well.

2. Let the GC ignore null pointers. Introduce nullByteArray# primop.
Mirror the gmp representation exactly and let gmp perform arithmetic
on all numbers:

    data Integer = J# Int# ByteArray#

It's still best if _mp_int has the size of a pointer.


> I am not sure what you mean here; in what context should these
> integer literals be entered?

When the programmer writes for example:

f :: Integer -> Integer
f x = 54321453245328452134 - x

the compiler must generate code which makes a mpz constant of the
right value.

Currently it generates something like this:

lvl = S# 2147483647#
lit = ((((S# 11# `timesInteger` lvl) `plusInteger` S# 1673077741#)
      `timesInteger` lvl) `plusInteger` S# 914624008#)
f x = lit `minusInteger` x

It used to generate something like this:

lit = case addr2Integer# "54321453245328452134"# of (# s, d #) -> J# s d
f = lit `minusInteger` x

-- 
 __("<  Marcin Kowalczyk * qrczak@knm.org.pl http://qrczak.ids.net.pl/
 \__/
  ^^                      SYGNATURA ZASTĘPCZA
QRCZAK