[Haskell-cafe] Clarifying a mis-understanding about regions (and iteratees)

Thu Feb 23 08:17:28 CET 2012

I have just come across the reddit discussion about regions and
iteratees:
http://www.reddit.com/r/haskell/comments/orh4f/combining_regions_and_iteratees/

There is an unfortunate misunderstanding about regions which I'd like to
correct. I'm posting to Cafe in the hope that some participants of
that discussion read Haskell-Cafe (I'm not a redditor).

The user Faucelme wrote:
> Would it be possible to implement a region-based iteratee which opened some
> file "A" and wrote to it, then opened some file "B" and wrote to it, while
> ensuring that file A is closed before file B is opened?

To which the user tibbe replied
> You can typically explicitly close the files as well.

and the user dobryak commented

> Regions that Oleg refers to started out with region-based memory allocation,
> which is effectively a stack allocation strategy, in which resource
> deallocation is in LIFO order. So I think your assumption is correct.

Regretfully, these comments are incorrect. First of all, memory
regions were invented precisely to get around the LIFO strategy of
stack allocation. If stack allocation sufficed, why would we need a
heap? We have a heap specifically because memory allocation patterns
are complex: a function may allocate memory that outlives it. Regions
let the programmer create arbitrarily many nested regions; everything
in a parent region is available to a child. Crucially, a child may
request any of its ancestors to allocate memory in their
regions. Therefore, although regions are nested, memory does not have
to be allocated and freed in LIFO order.
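
To make the point concrete, here is a minimal sketch of a toy model of
nested regions, using only base. It is not how any real region
allocator or the RegionIO library is implemented; the names Event, Op,
step, run and example are made up for this illustration. A child
region may allocate in an ancestor, and leaving a region frees only
what that region owns.

```haskell
import Data.List (foldl')

-- Events recorded by the toy model (hypothetical type).
data Event = Alloc String | Free String
  deriving (Eq, Show)

-- The stack of live regions, innermost first; each region lists the
-- resources it owns, most recently allocated first.
type Regions = [[String]]

data Op
  = Enter               -- enter a new child region
  | AllocIn Int String  -- allocate in the ancestor n levels up (0 = current)
  | Exit                -- leave the current region, freeing what it owns

step :: (Regions, [Event]) -> Op -> (Regions, [Event])
step (rs, ev) Enter = ([] : rs, ev)
step (rs, ev) (AllocIn n name) =
  case splitAt n rs of
    (inner, r : outer) -> (inner ++ (name : r) : outer, ev ++ [Alloc name])
    _                  -> error "no such ancestor region"
step (r : rs, ev) Exit = (rs, ev ++ map Free r)
step ([], _) Exit = error "no region to exit"

-- Run a sequence of operations and return the trace of events.
run :: [Op] -> [Event]
run = snd . foldl' step ([], [])

example :: [Event]
example = run
  [ Enter            -- parent region
  , AllocIn 0 "a"    -- a lives in the parent
  , Enter            -- child region
  , AllocIn 0 "b"    -- b lives in the child
  , AllocIn 1 "c"    -- the child asks the parent to allocate c
  , Exit             -- child ends: only b is freed
  , Exit             -- parent ends: c and a are freed
  ]
```

Here example evaluates to
[Alloc "a",Alloc "b",Alloc "c",Free "b",Free "c",Free "a"]:
c is allocated after b yet outlives it, so the order is not LIFO even
though the regions themselves are properly nested.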

Lightweight monadic regions implement all these patterns for files
or other resources (plus the dynamic bequest).
	http://okmij.org/ftp/Computation/resource-aware-prog/region-io.pdf

The running example of the Haskell Symposium 2008 paper is the
following (see Section 1.1):

1. open two files for reading, one of them a configuration file;
2. read the name of an output file (such as the log file) from the
   configuration file;
3. open the output file and zip the contents of both input files into
   the output file;
4. close the configuration file;
5. copy the rest, if any, of the other input file to the output file.

As you can see, the pattern of opening and closing is non-LIFO: the
output file has to be opened after the configuration file and is
also closed after the configuration file. Therefore, the user Faucelme
can find the solution to his question in the code accompanying the
Monadic Regions paper.
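
The five steps above can be sketched with a small, dynamically checked
stand-in for regions built on plain System.IO. This is not the
RegionIO API and it gives none of the static guarantees of the paper's
library: nothing here stops a handle from escaping its region. The
names Region, withRegion, openIn, zipThenCopy, zipLines and copyRest
are hypothetical, and the sketch assumes the first line of the
configuration file is the output file name.

```haskell
import Control.Exception (bracket)
import Control.Monad (unless)
import Data.IORef (IORef, newIORef, modifyIORef', readIORef)
import System.IO

-- A region is modelled as a mutable list of cleanup actions.
newtype Region = Region (IORef [IO ()])

-- Run a body with a fresh region; when the body ends (normally or by
-- an exception), run the registered cleanups, most recent first.
withRegion :: (Region -> IO a) -> IO a
withRegion = bracket (Region <$> newIORef []) release
  where release (Region ref) = readIORef ref >>= sequence_

-- Open a file whose handle is owned by the given region.
openIn :: Region -> FilePath -> IOMode -> IO Handle
openIn (Region ref) path mode = do
  h <- openFile path mode
  modifyIORef' ref (hClose h :)
  pure h

-- Alternate lines from a and b into out until either input runs out.
zipLines :: Handle -> Handle -> Handle -> IO ()
zipLines a b out = do
  done <- (||) <$> hIsEOF a <*> hIsEOF b
  unless done $ do
    hGetLine a >>= hPutStrLn out
    hGetLine b >>= hPutStrLn out
    zipLines a b out

-- Copy whatever is left of one handle to another, line by line.
copyRest :: Handle -> Handle -> IO ()
copyRest from out = do
  eof <- hIsEOF from
  unless eof $ hGetLine from >>= hPutStrLn out >> copyRest from out

zipThenCopy :: FilePath -> FilePath -> IO ()
zipThenCopy conf input =
  withRegion $ \outer -> do
    hIn <- openIn outer input ReadMode           -- step 1
    hOut <- withRegion $ \inner -> do
      hConf <- openIn inner conf ReadMode        -- step 1
      outName <- hGetLine hConf                  -- step 2
      o <- openIn outer outName WriteMode        -- step 3: owned by outer
      zipLines hConf hIn o
      pure o                                     -- step 4: inner ends here
    copyRest hIn hOut                            -- step 5
```

The output file is opened inside the inner region but registered with
the outer one, so it survives the closing of the configuration file.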

Section 5 of the paper describes an even more complex example:

1. open a configuration file;
2. read the names of two log files from the configuration file;
3. open the two log files and read a dated entry from each;
4. close the configuration file and the newer log file;
5. continue processing the older log file;
6. close the older log file.

where the pattern of opening and closing is not statically known:
it depends on values read from the files.

So, Faucelme's question can be answered in the affirmative using the
existing RegionIO library (which, as has been shown, integrates well
with Iteratees). There is already a region library with the desired
functionality.