FiniteMap performance WITHOUT any copying?

George Russell ger@tzi.de
Sat, 08 Feb 2003 18:18:27 +0100


Magnus wrote (snipped)
> Now, Haskell has a garbage collector, so Haskell must know how many
> pointers there are to all objects in the heap (or?). Then, if a finite map
> for example only has one single pointer to it, then it ought to be all
> right to modify the finite map (or whatever datastructure we are
> considering), I mean really modify the map without making any copies, just
> like in imperative languages. Perhaps there might be pointers to nodes
> inside the tree and I guess that could complicate the matter somewhat. But
> for Haskell arrays it ought to be possible to really modify the array if
> it is used by only one pointer ?
> 
> Are such optimizations possible, and if they are, are they already
> implemented in for example GHC ? Or am I wrong somewhere ?

I think it unlikely that such an optimisation is implemented for GHC, which
is as a rule extremely unwilling to modify "functional" data.  The trouble with
this kind of thing is that although it seems neat, it is hard to implement, and
also opens up cans of worms all over the place; for example by making generational
garbage collection harder to implement.

In fact I don't think GHC does know how many pointers there are to all objects in the
heap.  That would only be useful if it did GC by reference-counting, which it doesn't.
Reference counting is in fact rather expensive; as well as requiring an extra field on
every item to hold the number of references, you need to add lots of operations to
increment these fields every time you create or delete a new reference.  Also of course
reference counting will not be able to free circular structures, at least not in general
with data being modified.