[Haskell-cafe] memory usage: general questions

Thu Aug 4 02:45:49 UTC 2016

On 4/08/16 1:51 PM, Christopher Howard wrote:
> Please forgive the use of the term "overwritten". Of course, the data
> structures are immutable. However, I was of the understanding that,
> through the use of persistent data structures, it was not necessary for
> the entire data structure to copied, if only part of the structure changes.
>
> Suppose I have a list data structure possessing millions of nodes and
> occupying 100MB of memory. And then suppose that the update function
> "drops" (speaking metaphorically!) two nodes off the front of the
> original list, and then prepends a new node to the front of that. Most
> of the original listed can be reused, since the new list can simply
> point to it after the first node. But what about those two nodes that
> were "dropped" from the list, and never used by anything else in the
> program. My question was: will they immediately be deallocated in memory
> right after the application of the update function?
You can't tell.  It depends on the implementation and the language.
There is a Haskell-like language called Clean in which

     f :: (*x *x -> *x) *[!x!] -> *[!x!]
     f g [! x : [! y : zs !]!] = [g x y : zs]

([!x!] is both value and spine strict; [ _ : _ ] is cons, *t means unique)
would allow the compiler to cannibalise one of the "dropped" nodes
for the new node.  Which if any versions of the Clean compiler actually
do this I can't recall; I believe the 1.x compilers didn't but maybe 2.x 
ones do.

This relies on the "uniqueness types" of Clean.  Haskell is more 
interested in
larger-scale improvements like deforestation.
> Or are they cleaned up later by the GC?
Modulo hand waving, yes.
> And can we be sure that it will be on the next run
> of the GC,
What does that even mean?  When you have "generational" collectors
possibly using different strategies in different generations?  When you
have incremental collectors?  When you have concurrent collectors?
before the collector
>   or could they be hanging around it memory for a while,
> depending on other factors?

Well, when "the" garbage collector "runs" depends on other factors,
so those are not exclusive alternatives.

If you think about traditional reference counting (which is not a silly
thing to do in a Haskell context, because there are some truly
impressive hybrid reference counting GCs these days), lazy decrementing
meant that if a chunk of data became garbage, *part* of it could be
reclaimed promptly, but the contents would only incremently be freed as
a side effect of allocation.

Memory management is a complex business, and depending on exactly
what a language's run-time does is risky.

>