[Haskell-cafe] Monadic tunnelling: the art of threading one monad through another

Wed Jul 11 19:49:26 EDT 2007

What you're actually showing is that these effects can be -embedded- in
IO (i.e. that IO already supports them).  I noticed you didn't try to
make an instance for the Cont monad.  Actually, if we added
continuations to IO, we'd be set.  We wouldn't even need your typeclass.

On Wed, 2007-07-11 at 22:41 +0100, Jules Bean wrote:
> A programming style which is encouraged in haskell is to write your
> program using a special monad, rather than the IO monad. Normally some
> mixture of reader and state, possibly writer, the point of these
> monads is to help the type system to help you; help you be specific
> about which parts of your state can be modified by which parts of your
> program.
> 
> One problem with this approach comes with what an imperative programmer
> would call 'callbacks'. It's quite common with C library interfaces to
> take as a parameter an IO action (a callback) which the library will
> call with certain parameters. It's even more common with haskell
> libraries, although the type then is normally more general than IO;
> any higher order function can be thought of as using callbacks.
> 
> Now, supposing we have a library function which takes a callback. As
> an example, suppose we have a library function which reads in a file
> and calls our callback once on each line. A typical type might look
> like:
> 
> forEachLine :: Handle -> (String -> IO ()) -> IO ()
> 
> We have to provide a callback in the IO monad. But what if we don't
> want to work in the IO monad? What if we are working in, for example,
> a custom state monad of our own?
> 
> More generally, is there any sensible way to pass callbacks in one
> monad, to an action which runs in a different monad?  What does it
> mean to want to do this? In general the 'outer' action can call the
> callback as many times as it once, something like this (monospace font
> needed!):
> 
> m:             <- cb ->           <- cb ->
>                 |      |           |      |
> n: -- outer --->      <-- outer -->      <-- outer ---
> 
> Now, to embed 'm' into 'n' in this way, we have to somehow 'freeze'
> the 'monadiness' (you may prefer to use the term 'the warmth and
> fuzziness') of m at the beginning, run the first callback in this
> correct environment, then 'freeze' it again just after the callback
> runs, and restore this for the second 'cb'. Finally we have to somehow
> ensure this 'monadiness' is returned to the caller.
> 
> This procedure is not possible for every pair of Monads m and n. They
> need to have special properties. It is however possible (for example)
> when 'n' is IO, and when 'm' is one of several common custom monads,
> as I will now show.
> 
> Boring imports give a clue as to which monads I'm going to attempt.
> 
>  > {-# OPTIONS -fglasgow-exts #-}
> 
> Although I have -fglasgow-exts on here, it's actually only necessary
> for the MPTC tricks towards the end of the exposition. The initial
> discussion is haskell 98 as far as I know.
> 
>  >
>  > import Control.Monad.State
>  > import Control.Monad.Reader
>  > import Control.Monad.Writer
>  > import Control.Monad.Error
>  > import Control.Exception
>  >
>  > import Data.Typeable
>  >
>  > import Data.IORef
> 
> We begin with a callback which is not in the IO monad, but in StateT
> Int IO (). StateT monads are very common in real programs (or
> hand-rolled equivalents). The simple callback just prints a message so
> we know it has happened, and increments the counter in the state so we
> can prove state is being correctly threaded.
> 
>  > -- example small action in the custom monad
>  > stateioact :: StateT Int IO ()
>  > stateioact = do
>  >   x <- get
>  >   liftIO $ putStrLn ("stateioact called, with state as "++
>  >                      show x)
>  >   put (x+1)
>  >
> 
> The main action is just a boostrap to call a 'real' main action with
> some initial state:
> 
>  > -- Main action has type IO () as standard
>  > -- This example main action just defers to another action
>  > -- which is written in a custom monad
>  >
>  > main :: IO ()
>  > main =  do
>  >   putStrLn "main starting"
>  >   evalStateT mainAction 42
>  >   putStrLn "main exiting"
>  >
> 
> The real main action is a StateT Int IO () action. It calls stateioact
> once, then calls a "library function" which is known to use callbacks,
> then it calls stateioact again.
> 
>  > -- mainAction is written in the custom monad
>  > mainAction :: StateT Int IO ()
>  > mainAction = do stateioact
>  >                 embedIO $ \x -> usesCB (makeCallback stateioact x)
>  >                 stateioact
>  >
> 
> The library function is "usesCB". makeCallback and embedIO are the
> keys which make it all work.
> 
> Here is the library function. Note that it has plain type IO () -> IO
> (), just like any library function (e.g. a C FFI function) which takes
> a single callback. It prints diagnostic messages and takes care to
> call its callback twice, to demonstrate the monad-threading.
> 
>  > -- a 'library function' in the IO monad, which has a callback as one
>  > -- of its parameters. The library function has no knowledge of the
>  > -- custom monad being used by the main action here
>  > usesCB :: IO () -> IO ()
>  > usesCB f = do putStrLn "usesCB starting"
>  >               f
>  >               putStrLn "usesCB middle"
>  >               f
>  >               putStrLn "usesCB exiting"
>  >
> 
> The 'threading' of the monad internals is handled by the dual
> functions, embedIO and makeCallback. embedIO turns an IO action
> (usesCB in the example above) and makes it into a StateT action. In
> that sense, it is doing the same job as 'liftIO'. However, it also
> encapsulates the internals of the StateT monad into a package, from
> where they can be retrieved by the callback later:
> 
>  > -- the glue which makes it work:
>  > embedIO :: (IORef s -> IO a) -> StateT s IO a
>  > embedIO a = do s <- get
>  >                x <-  liftIO $ newIORef s
>  >                r <-  liftIO $ a x
>  >                s' <- liftIO $ readIORef x
>  >                put s'
>  >                return r
>  >
> 
> (You can infer a much more general type for embedIO, but it's not more
> useful in most cases.) In fact what embedIO does, in this particular
> case, is store the state of the StateT monad in an IORef. Then
> makeCallback arranges for that to be accessible to the callbacks:
> 
>  > makeCallback :: StateT s IO a -> IORef s -> IO a
>  > makeCallback act x = do s <- readIORef x
>  >                         (res,s') <- runStateT act s
>  >                         writeIORef x s'
>  >                         return res
>  >
> 
> So, in the critical call
> 
> embedIO $ \x -> usesCB (makeCallback stateioact x)
> 
> the 'trick' is that embed IO makes a new IORef to store the state,
> populates it with the current state, binds it to that \x, and
> makeCallback takes the IORef and forms a new IO callback 'around'
> stateioact which contains the plumbing to read from, and write to,
> that IORef.
> 
> Or in other words, we use the IO monad's ability to store unlimited
> state, stealing away a little private (lexically bound) portion for
> the use of our callback.
> 
> And, if you're skeptical, it does work:
> 
> *Main> main
> main starting
> stateioact called, with state as 42
> usesCB starting
> stateioact called, with state as 43
> usesCB middle
> stateioact called, with state as 44
> usesCB exiting
> stateioact called, with state as 45
> main exiting
> 
> Just to refresh what we have done. We have managed to arrange that a
> totally normal IO action taking an IO callback, usesCB :: IO () -> IO
> (), can be run instead with a callback in a custom monad without
> modification.
> 
> We can do an exactly similar operation on several other common monad
> transformers. In order to highlight the uniformity in the approaches,
> we leave haskell 98 and create a multi-parameter type class:
> 
>  > class InterleavableIO m a | m -> a where
>  >     embed :: (a -> IO b) -> m b
>  >     callback :: m b -> a -> IO b
>  >
> 
> A monad 'm' is 'Interleavable' over IO if there is some type 'a' which
> represents the monad's internals sufficiently that you can write
> 'embed' which strips the internals from the monad and feeds them to
> the callbacks, and 'callback' which given an action in m and the
> internals in an 'a', makes a pure IO callback.
> 
> Here is our state instance, and two more rather similar instances:
> 
>  > class InterleavableIO m a | m -> a where
>  >     embed :: (a -> IO b) -> m b
>  >     callback :: m b -> a -> IO b
>  >
>  > instance InterleavableIO (StateT s IO) (IORef s) where
>  >     embed = embedIO
>  >     callback = makeCallback
>  >
>  > instance InterleavableIO (ReaderT r IO) r where
>  >     embed a = do s <- ask
>  >                  liftIO $ a s
>  >     callback act s = runReaderT act s
>  >
>  > instance Monoid w => InterleavableIO (WriterT w IO) (IORef w) where
>  >     embed a = do x <- liftIO $ newIORef mempty
>  >                  r <- liftIO $ a x
>  >                  w <- liftIO $ readIORef x
>  >                  tell w
>  >                  return r
>  >     callback a x = do (r,w) <- runWriterT a
>  >                       modifyIORef x (mappend w)
>  >                       return r
>  >
> 
> The Writer instance is really very similar to the state one. Writer is
> like a state monad with 'write-only' state. (Append-only state, to be
> precise). The Reader instance is extremely simple; it really just is a
> funny way of writing partial application, after all. It is interesting
> to note, at least, that the Reader instance doesn't need IORefs, so
> IORefs aren't the only possibility for 'internals'.
> 
> I didn't think the Reader instance was interesting enough to
> demonstrate, but here's a demo of the Writer instance, proving that
> the writing all happens in the correct order:
> 
>  > writeact :: WriterT [String] IO ()
>  > writeact = do liftIO $ putStrLn "writeact called!"
>  >               tell ["writeact"]
>  >
>  > wtest :: WriterT [String] IO ()
>  > wtest = do liftIO $ putStrLn "Testing WriterT callbacks"
>  >            tell ["start"]
>  >            embed $ \x -> usesCB (callback writeact x)
>  >            tell ["end"]
>  >            liftIO $ putStrLn $ "Finished"
>  >
> 
> A simple writer action which logs something and prints a debugging
> message, and a test action which calls our old friend 'usesCB' now
> with a WriterT callback. The output:
> 
> *Main> runWriterT wtest
> Testing WriterT callbacks
> usesCB starting
> writeact called!
> usesCB middle
> writeact called!
> usesCB exiting
> Finished
> ((),["start","writeact","writeact","end"])
> 
> 
> So. we've demonstrated interleaving for StateT IO, ReaderT IO and
> WriterT IO. What else can we do? Sometimes your callback is actually
> pure: it doesn't do any IO at all. In these cases, it's nice to have
> the type system 'know' it's pure. Fortunately this is easy to patch in
> to our existing system, since you can promote any pure State, Reader,
> Writer action into xyzT IO. In fact you can do something slightly more
> general:
> 
>  > promoteState :: MonadState s m => State s a -> m a
>  > promoteState act = do s <- get
>  >                       let (r,s') = runState act s
>  >                       put s'
>  >                       return r
>  >
>  > promoteReader :: MonadReader s m => Reader s a -> m a
>  > promoteReader act = do s <- ask
>  >                        return (runReader act s)
>  >
>  > promoteWriter :: MonadWriter w m => Writer w a -> m a
>  > promoteWriter act = do let (r,w) = runWriter act
>  >                        tell w
>  >                        return r
>  >
> 
> Then, given a pure state action:
> 
>  > stateact :: State Int ()
>  > stateact = do x <- get
>  >               put (x*2)
> 
> and a slight specialisation of the promoteState function, to help type
> inference along:
> 
>  > promoteState' :: State s a -> StateT s IO a
>  > promoteState' = promoteState
> 
> we can use a totally pure callback in our same usesCB:
> 
>  > purestatedemo :: StateT Int IO ()
>  > purestatedemo = do
>  >   liftIO $ putStrLn "Pure state demo"
>  >   liftIO . putStrLn . ("state is " ++) . show =<< get
>  >   embed $ \x -> usesCB (callback (promoteState' stateact) x)
>  >   liftIO . putStrLn . ("state is " ++) . show =<< get
> 
> and run it like this:
> 
> *Main> runStateT purestatedemo 1
> Pure state demo
> state is 1
> usesCB starting
> usesCB middle
> usesCB exiting
> state is 4
> ((),4)
> 
> I haven't yet experimented with the type class inference needed to get
> a correct 'embed' instance for a tower of compatible monad
> transformers, but I don't think that would be all that tricky.
> 
> There is one more instance, though, which is a little different. The
> three we have seen so far are all rather similar, in that one way or
> another they're all really a kind of state. Here is a rather different
> instance:
> 
>  > data InterleaveErrorTException e = InterleaveErrorTException e
>  >       deriving (Typeable)
>  >
>  > instance (Error e, Typeable e) =>
>  >          InterleavableIO (ErrorT e IO) () where
>  >     embed act = do
>  >       x <- liftIO $ catchDyn (Right `fmap` act ())
>  >            (\(InterleaveErrorTException e) -> return (Left e))
>  >       case x of Right r -> return r
>  >                 Left e -> throwError e
>  >     callback a () = do
>  >       x <- runErrorT a
>  >       case x of Right r -> return r
>  >                 Left e -> throwDyn
>  >                           (InterleaveErrorTException e)
> 
> Here the trick is to convert between the 'ErrorT' way of signalling
> error conditions, with a Left-value, and the IO way of signalling them
> using exceptions.
> 
> This enables our actions to have custom error handling and still
> interoperate with IO's exception throwing:
> 
>  > failer :: ErrorT String IO ()
>  > failer = do liftIO $ putStrLn "failer called"
>  >             throwError "failer"
>  >             liftIO $ putStrLn "failer exiting"
>  >
>  > succeeder :: ErrorT String IO ()
>  > succeeder = do liftIO $ putStrLn "succeeder called"
>  >                liftIO $ putStrLn "succeeder exiting"
>  >
>  > errtest :: ErrorT String IO ()
>  > errtest = do liftIO $ putStrLn "errtest starting"
>  >              embed $ \x -> usesCB (callback failer x)
>  >              liftIO $ putStrLn "errtest ending"
>  >
>  > errtest2 :: ErrorT String IO ()
>  > errtest2 = do liftIO $ putStrLn "errtest starting"
>  >               embed $ \x -> usesCB (callback succeeder x)
>  >               liftIO $ putStrLn "errtest ending"
>  >
> 
> 
> *Main> runErrorT errtest
> errtest starting
> usesCB starting
> failer called
> Left "failer"
> *Main> runErrorT errtest2
> errtest starting
> usesCB starting
> succeeder called
> succeeder exiting
> usesCB middle
> succeeder called
> succeeder exiting
> usesCB exiting
> errtest ending
> Right ()
> 
> Note how the first test successfully 'short-circuits' out of the IO
> action.
> 
> One important thing to note about this kind of threading is that it
> doesn't attempt to solve the asynchrony problem. It is explicitly for
> callbacks which are used synchronously (in the sense that they are
> never called after the main library routine 'returns') so they
> couldn't be used, as is, for event handlers; also they must be used
> single-threadedly; two parallel StateT callbacks would fight over the
> IORef.
> 
> This last point is actually very easy to solve, however! We can just
> replace the IORef with the obvious MVar operations, and we immediately
> get 'intelligently parallelisable' callbacks: the callbacks would
> serialise w.r.t. each other, using takeMVar, but would not otherwise
> restrict the multi-threaded behaviour of the enclosing action.
> 
> It's a little harder, but not impossible, to extend this so that event
> callbacks, which are called long after the initial action completes,
> can also access state safely; you need a "MVar-aware" state monad at
> the outermost level, but that's easy to write in a way which complies
> with MonadState and thus doesn't require global code changes.
> 
> Of course, this probably extends outside library callbacks. There's
> not currently any way for a StateT action to safely 'forkIO': or,
> rather, of course there is, but the subsidiary thread is not an equal
> citizen. The subsidiary thread is 'merely' an IO action instead of a
> StateT s IO action, but these ideas suggest a resolution to that.
> 
> This email has turned out about five times as long as I intended, so
> I'm going to stop there, although there are still things I wanted to
> bring up that I haven't got around to. I hope some people out there
> recognise the problem I'm trying to address, and I'm interested in all
> comments or corrections.
> 
> Jules
> 
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe