Alternatives to finalization

Alastair Reid
Mon, 10 Mar 2003 11:30:01 +0000

Nick Name contemplates two ways of finalizing
external resources:

> [lots of context deleted]
> f :: IO [a]
> f = do 
>       allocateResource 
>       l <- makeTheStream 
>       addFinalizer l (disallocateResource) 
>       return l
> [snip]
> Another alternative is to make f return an esplicit "close stream"
> action:
> f :: IO ([a],IO ())
> Is anyone willing to explain me other alternatives if there are, or
> to tell me that there aren't?

The second form (explicit release of resources) is better when:

1) It is important to release resources promptly (because they are
   scarce or expensive).

2) It is easy to identify the last use of the resource and to call
   the finalizer explicitly.

   This usually requires that your code be strict and that you are in
   the IO monad.

3) If there is any chance that you could release the resource too early,
   you have a good way of detecting the problem and propagating an
   appropriate error value or exception.

   Detection is usually easy to arrange using a Haskell proxy which
   records whether the finalizer has been called.
   Reporting the problem can usually be done using the Maybe type,
   Haskell 98 style IOErrors or Asynchronous Exceptions.
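
A minimal sketch of the proxy idea in point 3, assuming the resource is
guarded by a mutable flag.  The names Guarded, newGuarded, release,
useGuarded and demo are hypothetical, made up for illustration; they
are not from any library.  Use after release is reported with the
Maybe type.

```haskell
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)

-- Hypothetical proxy: the resource, a flag recording whether the
-- finalizer has been called, and the finalizer itself.
data Guarded r = Guarded
  { guardedResource  :: r
  , guardedReleased  :: IORef Bool
  , guardedFinalizer :: IO ()
  }

newGuarded :: r -> IO () -> IO (Guarded r)
newGuarded r fin = do
  flag <- newIORef False
  return (Guarded r flag fin)

-- Explicit release.  The flag is set atomically, so calling release
-- twice runs the finalizer only once.
release :: Guarded r -> IO ()
release g = do
  alreadyReleased <- atomicModifyIORef' (guardedReleased g) (\b -> (True, b))
  if alreadyReleased then return () else guardedFinalizer g

-- Use the resource.  Returns Nothing if it was released too early.
useGuarded :: Guarded r -> (r -> IO a) -> IO (Maybe a)
useGuarded g act = do
  gone <- readIORef (guardedReleased g)
  if gone then return Nothing else Just <$> act (guardedResource g)

-- Usage: the second use happens after release and is detected.
demo :: IO (Maybe Int, Maybe Int)
demo = do
  g      <- newGuarded (42 :: Int) (putStrLn "resource released")
  before <- useGuarded g return
  release g
  after  <- useGuarded g return
  return (before, after)
```

The check in useGuarded is not atomic with respect to a concurrent
release, so a multi-threaded program would need a lock (an MVar, say)
around both.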

The first form is better when any of these conditions does not hold.  The best
example where automatic finalization is appropriate is the
hGetContents function (which your function 'f' seems to resemble).
Laziness makes it hard to predict the lifetime of the stream.  The
code consuming the stream tends to be pure so there's no obvious place
to call the finalizer from.  And, finally, the code consuming the
stream is unlikely to respond well to an exception being raised
because, more than likely, it was written for use in a context where
exceptions meant complete failure, not failure of an IO operation.
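
As a small illustration of the hGetContents case (a sketch only;
countLines is a hypothetical name), the stream below is consumed by
pure code, and nothing in that code marks the last use of the handle:

```haskell
import System.IO (IOMode (ReadMode), hGetContents, openFile)

-- Count the lines of a file through a lazy stream.
countLines :: FilePath -> IO Int
countLines path = do
  h <- openFile path ReadMode
  s <- hGetContents h          -- h is now semi-closed; s is read lazily
  let n = length (lines s)     -- pure consumer of the stream
  -- Forcing n reads the whole stream.  The handle is closed once the
  -- entire contents have been read, without an explicit hClose here.
  n `seq` return n
```

If the consumer stopped part way through the stream, the handle would
stay open until it was closed explicitly or finalized.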

Nick Name says that a problem with the first form is:
> The problem is that if no memory is allocated, no garbage collection
> happens; of course finalization is not guaranteed, as the manual
> states.

Haskell code tends to allocate memory at a fairly constant rate so, as
long as your program is not blocked waiting for input, it should be
allocating memory.  You then need to tweak the configuration of the
garbage collector (+RTS -h... ...-RTS in GHC) to make the GC trigger
at the desired frequency.

You can also use the garbageCollect function (part of the FFI
specification) to invoke the garbage collector explicitly.
You might call this immediately before any blocking IO operations.
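
A sketch of that pattern, assuming GHC, where the corresponding call
is performGC from System.Mem (I am treating it as the counterpart of
garbageCollect above; that is an assumption).  gcThenBlock is a
hypothetical helper name.

```haskell
import System.Mem (performGC)

-- Force a collection, then run a (possibly blocking) IO action.
-- In GHC, finalizers for objects found to be unreachable are scheduled
-- by the collection; they are not guaranteed to have finished by the
-- time performGC returns.
gcThenBlock :: IO a -> IO a
gcThenBlock blockingAction = performGC >> blockingAction
```

For example, `line <- gcThenBlock getLine` collects just before the
program blocks waiting for input.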

Calling the GC too often can be expensive.  In an image processing
system at Yale, we built a tiny 'model' of the resource usage to try
to call the GC only when it might plausibly release resources (or when
we were very short of resources).  The idea was to track how many
resources were allocated and to try to estimate (based on past
behaviour) how many of those are likely to be released if we were to
call GC now.  We'd then only call the GC if the overhead (ratio of
collectable resources to required resources) exceeded some threshold.
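
The model can be sketched roughly as follows.  This is a
reconstruction under assumptions, not the code we used: the names
ResourceModel, estimatedCollectable, shouldCollect and
recordCollection are hypothetical, "required" is taken to mean
allocated minus estimated collectable, and the equal-weight averaging
of past behaviour is an arbitrary choice.

```haskell
-- Hypothetical model of resource usage.
data ResourceModel = ResourceModel
  { allocated       :: Int     -- resources currently allocated
  , releaseFraction :: Double  -- estimated fraction a GC would release
  } deriving Show

-- How many allocated resources a GC is estimated to release right now.
estimatedCollectable :: ResourceModel -> Double
estimatedCollectable m = fromIntegral (allocated m) * releaseFraction m

-- Call the GC only if collectable / required exceeds the threshold.
-- Written without division so that required == 0 is handled.
shouldCollect :: Double -> ResourceModel -> Bool
shouldCollect threshold m = collectable > threshold * required
  where
    collectable = estimatedCollectable m
    required    = fromIntegral (allocated m) - collectable

-- After a GC that released 'freed' resources, update the count and
-- blend the observed fraction into the estimate (assumed 50/50 weights).
recordCollection :: Int -> ResourceModel -> ResourceModel
recordCollection freed m = ResourceModel
  { allocated       = allocated m - freed
  , releaseFraction = 0.5 * releaseFraction m + 0.5 * observed
  }
  where
    observed
      | allocated m <= 0 = 0
      | otherwise        = fromIntegral freed / fromIntegral (allocated m)
```

With a threshold of 0.5, a model holding 100 resources of which half
are expected to be collectable triggers a GC; one where only a tenth
are expected to be collectable does not.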

Hope this is of some help.

Alastair Reid
Reid Consulting (UK) Limited