Pointer-or-Int 63-bit representations for Integer
ghc-devs at chrisdone.com
Mon Mar 8 17:13:45 UTC 2021
In OCaml's implementation, they use a well known 63-bit representation of ints to distinguish whether a given machine word is either a pointer or to be interpreted as an integer.
I was wondering whether anyone had considered the performance benefits of doing this for the venerable Integer type in Haskell? I.e. if the Int fits in 63-bits, just shift it and do regular arithmetic. If the result ever exceeds 63-bits, allocate a GMP/integer-simple integer and return a pointer to it. This way, for most applications--in my experience--integers don't really ever exceed 64-bit, so you would (possibly) pay a smaller cost than the pointer chasing involved in bignum arithmetic. Assumption: it's cheaper to do more CPU instructions than to allocate or wait for mainline memory.
This would need assistance from the GC to be able to recognize said bit flag.
As I understand the current implementation of integer-gimp, they also try to use an Int64 where possible using a constructor (https://hackage.haskell.org/package/integer-gmp-126.96.36.199/docs/src/GHC.Integer.Type.html#Integer), but I believe that the compiled code will still pointer chase through the constructor. Simple addition or subtraction, for example, is 24 times slower in Integer than in Int for 1000000 iterations:
An unboxed sum might be an improvement? e.g. (# Int# | ByteArray# #) -- would this "kind of" approximate the approach described? I don't have a good intuition of what the memory layout would be like.
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the ghc-devs