[Haskell-cafe] Garbage collecting pointers
mads_lindstroem at yahoo.dk
Fri Mar 26 18:52:24 EDT 2010
On Fri, 2010-03-26 at 21:33 +0000, Sebastian Sylvan wrote:
> Reorganizing data on the fly sounds like it may be a pretty sensible
> idea now that cache misses are so bad (in comparison). The fact that
> Haskell data is generally immutable helps too.
> However, I think your scheme sounds a bit complicated for my tastes,
> and I don't really understand the idea of "lengths", what would "n
> consecutive elements" mean for a tree? Surely this wouldn't work just
> for list-like data structures?
First of all, I had assumed that a data type's memory footprint was
independent of its value. Similarly to how a C-union always occupies the
same amount of memory. After reading your post, I reevaluated that
assumption, and it may well be wrong.
However, if my assumption is right we could lay out the memory in a
preorder fashion. The following tree:
      a
     / \
    b   c
   / \ / \
  d   e f   g
would have the following memory layout: a, b, d, e, c, f, g. A pointer
to element "a" would be of length 6 (a + 6 additional elements). The
preorder layout would also allow us to point to part of the structure.
For example, we could make a pointer to element "b" of length 2 (b + 2
additional elements). In this way, we do not need to break a larger
chunk into smaller ones to enable sharing of data. Nor would we need
to scan through the whole structure when only pointing to a part of it.
> Something that seems a bit easier to work with would be to use just
> two data layouts, one which uses pointers everywhere like now, and one
> in which every single field is expanded one level (each of them could
> in turn be expanded one more level, but you don't know that until you
> look at it). If one of the fields can't be expanded (e.g. because it
> would become cyclic) then you just have to keep the "fully
> pointerfied" version.
> It's still a bit tricky because the offset to a specific member would
> be depend on the specific sizes of the previous fields in the layout
> (which can vary depending on the actual value). It would always
> possible to figure it out at runtime, though, by scanning through the
> fields each time finding the next one by incrementing the size, and
> you could just disable this feature for large data types. You could
> potentially store a heavily quantized lookup table (possibly at the
> expense of padding some members to align all members on a large common
> multiple from the base).
Would scanning not run the risk of changing the asymptotic complexity of
some operations?
What does "heavily quantized lookup table" mean? It is the "heavily
quantized" part I do not get.
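To make the complexity worry concrete, the scanning scheme Sebastian
describes could be sketched like this (the size list and field indices
are invented for illustration; in a real runtime the sizes would be
read from each field's header):

```haskell
-- Hypothetical layout where each field's size (in words) depends on
-- its value, so offsets cannot be precomputed statically.
fieldSizes :: [Int]
fieldSizes = [1, 3, 2, 5]   -- sizes of fields 0..3

-- Offset of field k: sum of the sizes of all fields before it.
-- This is O(k) per access, which is exactly the concern above:
-- what used to be an O(1) pointer dereference becomes a scan.
offsetOf :: Int -> [Int] -> Int
offsetOf k = sum . take k
```

For example, `offsetOf 2 fieldSizes` is 4: field 2 starts after the
1-word field 0 and the 3-word field 1. A lookup table would cache these
running sums to restore O(1) access, at the cost of extra space.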
> The general idea seems interesting to me, but then I'm hardly an
> expert. I like the notion of using the fact that the language is free
> to change the data representation at runtime to improve performance by
> "collapsing" an immutable (recursive) data structure once it's been
> created. Partly because it's something that would be harder to do in
> other languages!
> Sebastian Sylvan