[Haskell] Simple IO Regions

oleg at pobox.com oleg at pobox.com
Tue Jan 17 05:13:04 EST 2006


This message shows a very simple implementation of Monadic Regions
(for the particular case of IO and reading the file). The technique
*statically* guarantees that neither a file handle nor any computation
involving the handle can leak outside of the region that created
it. Therefore, the handle can be safely closed (and its resources
disposed of) whenever control leaves the corresponding 'withFile'
block. Many handles can be open simultaneously, the type system
enforces the proper nesting of their regions. The technique has no
run-time overhead and induces no run-time errors. Unlike the previous
implementation of monadic regions, only the basic extensions
(higher-ranked types and one two-parameter type class) are used. No
undecidable instances, no functional dependencies (let alone
overlapping instances) are required. In fact, the implementation uses
only one trivial typeclass and one trivial instance. 

Perhaps such an approach to File IO can be more widely used? It
trivially generalizes to database IO and other kinds of IO.

The motivation for monadic regions has been best explained by:

Brandon Moore wrote on Haskell Cafe:
> I'm assuming you understand how the type on runST and the STRef
> operations ensure that, even though you *can* smuggle out an STRef in
> the result from runST, you will never be able to use it again.
>
> The idea was to do the equivalent thing with databases: use fancy types
> to ensure that handle can only be used inside to origination withDB or
> withCursor or whatever, and the bracketing function can release the
> resource on the way out, without worrying about it being used again.

Benjamin Franksen wrote:

> I think this is an extremely good idea. I have been very frustrated
> with finalizers because of their limitations (can't rely on them being
> called at all), so have (reluctantly) been using the unsafe bracket
> version. Making it safe via a type system trick is really the way to
> go.


Let us start with the tests

> {-# OPTIONS -fglasgow-exts #-}
>
> module IORegionsTest where
>
> import IORegions -- see below
>
> test0 = withFile "/etc/motd" (const $ return True)
>
> reader q = do
> 	     c1 <- qGetChar q
> 	     c2 <- qGetChar q
> 	     return [c1,c2]
>
> test1 = withFile "/etc/motd" reader
> test1r = runIOM test1 >>= print

Instead of handles, we have Qs -- marked handles. The are created by
the function withFile and used similar to regular handles. A special
IOM monad is a newtype away from the regular IO. The phantom type
parameter of the IOM monad maintains the marks of the regions.

	*IORegionsTest> :t reader
	reader :: (Monad (IOM marks), IORegions.IN mark marks) =>
		  Q mark -> IOM marks [Char]

the type of the reader shows that it takes a marked handle and yields
a marked IO computation. The constraint IN assures that the
computation must be marked with the mark of the handle.

If we attempt to leak the handle:
*> test2 = withFile "/tmp/i.hs" (\q -> return q)

we get
    Inferred type is less polymorphic than expected
      Quantified type variable `mark' escapes
    In the second argument of `withFile', namely `(\ q -> return q)'

The following is OK: we perform the computation and return its result:
> test3 = withFile "/etc/motd" (\q ->  (qGetChar q))

If we attempt to return the unperformed computation itself:
*> test4 = withFile "/tmp/i.hs" (\q ->  return (qGetChar q))

we get
    Could not deduce (IORegions.IN mark marks1)
      from the context (IORegions.IN mark marks)
      arising from use of `qGetChar' at IORegionsTest.h...

As we said earlier, more than one handle can be at play at the same
time:

> reader2 q1 q2 = do
> 	   c1 <- qGetChar q1
> 	   c2 <- qGetChar q2
> 	   return [c1,c2]
> test5 = withFile "/etc/motd" (\q1 -> 
> 			        withFile "/etc/motd" (\q2 -> reader2 q1 q2))
>
> test5r = runIOM test5 >>= print

Incidentally, the inferred type of reader2 is

	*IORegionsTest> :t reader2
	reader2 :: (Monad (IOM marks),
		    IORegions.IN mark1 marks,
		    IORegions.IN mark marks) =>
		   Q mark -> Q mark1 -> IOM marks [Char]

Obviously, the resulting computation is marked with the marks of both
argument handles.

With two handles, we can actually return a handle -- provided we
return an outermost handle from the innermost region (but not the
other way around). For example, the following is wrong

*> test6 = withFile "/etc/motd" 
*> 	    (\q2 -> 
*> 	     do
*> 	     q' <- withFile "/etc/motd" (\q -> return q)
*> 	     qGetChar q')


but the following is OK:
> test7 = withFile "/etc/motd" 
> 	   (\q2 -> 
> 	    do
> 	    q' <- withFile "/etc/motd" (\q -> return q2)
> 	    qGetChar q')

Ditto for the computation:

The following is the improper leakage and leads to a type error:

*> test8 = withFile "/etc/motd" 
*> 	    (\q2 -> 
*> 	     do
*> 	     a <- withFile "/etc/motd" (\q -> return (qGetChar q))
*> 	     a)

But the following is fine:

> test9 = withFile "/etc/motd" 
>          (\q2 -> 
>           do
>           a <- withFile "/etc/motd" (\q -> return (qGetChar q2))
>           a)
>
> test9r = runIOM test9 >>= print


-- The file IORegions.hs follows.

{-# OPTIONS -fglasgow-exts #-}

-- Simple IO Regions

module IORegions (runIOM, qGetChar, withFile, 
		  -- Only types are exported, not their data constructors!
		  Q, IOM) where

import Control.Exception
import System.IO

-- The marked IO monad. The data constructor is not exported.
-- The type 'marks' is purely phantom (and is never instantiated, actually)
newtype IOM marks a = IOM (IO a) deriving Monad
unIOM (IOM x) = x

-- The marked IO handle. The data constructor is not exported.
newtype Q mark = Q Handle

-- |IN mark marks| asserts that |mark| is a member of the mark set |marks|
-- The mark set is really a set, and the best of all, it's typechecker
-- that maintains it. We don't need to do anything at all.
class IN a b
instance IN () b

-- Reading from a marked handle. The mark must be within the marks
-- associated with the IOM monad
qGetChar :: (IN mark marks) => Q mark -> IOM marks Char
qGetChar (Q h) = IOM $ hGetChar h

-- There must not be an operation to close a marked handle!
-- withFile takes care of opening and closing (and disposing) of 
-- handles.

-- Open the file, add the markset constraint for the duration of the body,
-- and make sure the marked handle does not escape.
-- The marked handle is closed on normal or abnormal exit from the
-- body
-- The type system guarantees the strong lexical scoping of
-- withFile. That is, we can assuredly close all the handles after
-- we leave withFile because we are assured that no computations with
-- marked handles can occur after we leave withFile.

withFile :: FilePath -> (forall mark. IN mark marks => Q mark -> IOM marks a)
	 -> IOM marks a
withFile filename proc = 
 IOM(
     bracket
     (openFile filename ReadMode)
     (hClose)
     (\handle -> unIOM $ proc ((Q handle) :: Q ())))

-- Running the IOM monad
runIOM :: (forall mark. IOM mark a) -> IO a
runIOM = unIOM



More information about the Haskell mailing list