[Haskell-cafe] Re: Garbage collecting pointers
Simon Marlow
marlowsd at gmail.com
Mon Mar 29 12:00:03 EDT 2010
On 26/03/2010 20:28, Mads Lindstrøm wrote:
> Hi
>
> For some time I have been thinking about an idea, which could limit
> Haskell's memory footprint. I don't know if the idea is crazy or clever,
> but I would love to hear peoples thoughts about it. The short story is,
> I propose that the garbage collector should not just reclaim unused
> memory, it should also diminish the need for pointers, by replacing
> nested data structures with larger chunks of consecutive memory. In
> other words, I would diminish the need for pointers for arbitrary
> recursive data types, just as replacing linked lists with arrays
> eliminate the need for pointers.
>
> I will explain my idea by an example of a data type we all know and
> love:
>
> data List a = Cons a (List a) | Nil
>
> each Cons cell uses two pointers - one for the element and one for the
> rest of the list. If we could somehow merge the element and the rest of
> the list into consecutive memory, we would be able to eliminate the
> pointer to the rest of list. On 64 bit architectures merging would save
> us 8 bytes of "wasted" memory. If we could merge n elements together we
> could save n*8 bytes of memory.
The trouble with techniques like this is that they break the uniformity
of the representation, and complexity leaks all over the place. Various
similar ideas have been tried in the past, though not with Haskell as
far as I'm aware: CDR-coding and BiBOP spring to mind.
> 64 bit pointers are wasteful in another way, as nobody has anywhere near
> 2^64 bytes of memory. And nobody is going to have that much memory for
> decades to come. Thus, we could use the 8 most significant bits for
> something besides pointing to memory, without loosing any significant
> ability to address our memories. This would be similar to how GHC uses
> some of the least significant bits for pointer tagging.
Unfortunatley you don't know whereabouts in your address space your
memory is located: the OS could do randomised allocation and give you a
pages all over the place, so in fact you might need all 64 bits. Yes
you can start doing OS-specific things, but that always leads to pain
later (I know, I've been there).
In the Java community there has been experimentation with using
different representations to avoid the 64-bit pointer overhead: e.g.
shifting a 32-bit pointer to the left by 2 bits in order to access 16GB
of memory. Personally I'm not keen on doing this kind of trick, mainly
because in GHC it would be a compile-time switch; Java has it easy here
because they can make the choice at runtime. Also we already use the
low 2 bits of pointers in GHC for pointer tagging.
Cheers,
Simon
More information about the Haskell-Cafe
mailing list