[Haskell-cafe] Wrapping the IO monad to get safe,
self-describing imperative APIs
Brian Hulley
brianh at metamilk.com
Thu Mar 30 15:51:20 EST 2006
Hi -
In a discussion started on the GHC mailing list
http://www.haskell.org//pipermail/glasgow-haskell-users/2006-March/009923.html I
discovered an idea for typing imperative API functions that may be of
interest to other people, and which makes use of Haskell's type system to
achieve a level of self-description and static bad-usage detection
impossible in C/C++ APIs.
A nicely formatted (by trac) description of the advantages and how to write
such APIs is available at http://hackage.haskell.org/trac/ghc/ticket/736 (a
feature request I submitted to the GHC team to make it easier to write such
APIs) and the plain text is included at the end of this post.
The basic idea is that different monads, which are just newtypes of the IO
monad, can be used to prevent API functions being called in the wrong
context. For example, consider the following C function which implements a
render callback using a simplified version of DirectX to draw a square on
the screen:
void Render(int width, int height){
Clear();
BeginScene();
DrawSquare();
EndScene();
}
In the C code, the fact that DrawSquare() can only be called between
Begin/EndScene, and the fact that Clear(), BeginScene(), EndScene() can only
be called in a render callback (as opposed to a keypress callback for
example) are completely implicit, and must be borne in mind by the user who
has to wade through heaps of documentation to guess at this understanding.
By using different monads in Haskell, the above function could be written as
follows:
newtype RenderM a = RenderM (IO a) deriving (Functor, Monad, MonadIO)
newtype DrawM a = DrawM (IO a) deriving (Functor, Monad, MonadIO)
type RenderCallback = Int -> Int -> RenderM ()
onRender :: RenderCallback -> IO ()
clear :: RenderM ()
scene :: DrawM () -> RenderM ()
drawSquare :: DrawM ()
render :: RenderCallback
render w h = do
clear
scene $ do
drawSquare
making it impossible to call drawSquare in any situation that the API did
not intend, and also making it easy to see that in order to draw something,
the drawing will need to be an argument of some function which makes use of
DrawM () eg scene, which in turn needs to be an arg of some function which
takes a RenderM () thus leading from inside out: drawSquare --> scene -->
RenderCallback --> onRender.
A consequence of all this is that it would be necessary (to completey
enforce API correct usage) to have an extra optional entry point for Haskell
programs, to prevent APIs being re-started by lifting their init functions
into a callback monad (more details at the end of this post and the trac
report).
Regards, Brian.
#736: Allowing any newtype of the IO monad to be used in FFI and extra
optional
entry point
------------------------------------+---------------------------------------
Reporter: brianh at metamilk.com | Owner:
Type: feature request | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 6.4.1
Severity: normal | Keywords: FFI foreign monad entry
point
Os: Multiple | Difficulty: Unknown
Architecture: Multiple |
------------------------------------+---------------------------------------
Hi -
When designing an API it is desirable to be able to encode the correct
usage patterns for functions in the API in the type of the functions
themselves, rather than relying on the user understanding the
documentation and having to use runtime checks to ensure correct usage.
Consider the following callback which uses a typical C API (DirectX) to
draw something to the screen:
{{{
void Render(int width, int height){
Clear();
BeginScene();
DrawSquare();
EndScene();
}
}}}
In Haskell with the FFI at present, we can define an equivalent API and
use it as follows:
{{{
type RenderCallback = Int -> Int -> IO ()
clear :: IO ()
scene :: IO () -> IO ()
drawSquare :: IO ()
onRender :: RenderCallback -> IO ()
runGraphicsWindow :: IO () -> IO ()
render :: RenderCallback
render w h = do
clear
scene $ do
drawSquare
main = runGraphicsWindow $ do
onRender render
}}}
This is all very well, but just like the C equivalent, it doesn't encode
the fact that drawSquare can only be called between BeginScene and
EndScene. For example the following render callback would result in a
runtime error or at least an unexpected result for the user:
{{{
badRender w h = drawSquare
}}}
To allow the type checker to enforce correct usage, we can use different
monads which just wrap the IO monad as follows:
{{{
newtype RenderM a = RenderM (IO a) deriving (Functor, Monad, MonadIO)
newtype DrawM a = DrawM (IO a) deriving (Functor, Monad, MonadIO)
type RenderCallback = Int -> Int -> RenderM ()
clear :: RenderM ()
scene :: DrawM () -> RenderM ()
drawSquare :: DrawM ()
}}}
Now the good render function is well typed and the badRender function is
ill typed.
With the current GHC implementation, it is possible to provide the
interface above by using some fiddly wrapper functions to remove the
wrapper monads and replace them with the IO monad, for example:
{{{
type RenderCallbackIO = Int -> Int -> IO ()
foreign import ccall "wrapper" mkRenderCallbackIO ::
RenderCallbackIO -> IO (FunPtr RenderCallbackIO)
dropRenderM :: RenderCallback -> RenderCallbackIO
dropRenderM f x y = let RenderM io = f x y in io
foreign import ccall api_onRender :: FunPtr RenderCallbackIO -> IO ()
onRender :: RenderCallback -> IO ()
onRender f = mkRenderCallbackIO (dropRenderM f) >>= api_onRender
foreign import ccall api_clear :: IO ()
clear :: RenderM ()
clear = liftIO $ api_clear
}}}
As far as I can tell, GHC currently optimizes out all the overhead
involved in converting between RenderM and IO. However the extra
marshalling functions are fiddly to write, in particular since different
versions of dropRenderM would be needed for different numbers of arguments
in whatever function returns something in RenderM, and all these extra
functions also obscure the simplicity of the original design.
Therefore I propose that for any monad M defined by:
{{{
newtype M a = M (IO a) deriving (Functor, Monad, MonadIO)
}}}
M a should be able to appear in place of IO a anywhere in a foreign
function definition since all 'M' does is to enforce typing on the Haskell
side and has no relevance to the foreign language API, just as IO has no
relevance to the foreign language either. This would mean we'd no longer
have to write extra wrapper functions and rely on the compiler optimizing
them out.
A related point is that the "API-safety == type correctness" gained by
using different monads can at the moment be subverted because the entry
point into a Haskell program is the main function which returns a value of
type IO (). This means that initialization code for any API must be able
to run in the IO monad. However every monad discussed above allows you to
lift IO operations into it, so there is nothing to stop someone trying to
make a nested re-initialization of the API in the middle of a callback...
It is necessary to allow IO actions to be lifted into the callback monads
so the callbacks can make use of IORefs etc. However it is undesirable to
allow the API to be re-initialized (in such a nested way).
Therefore I propose (perhaps this should have been a separate ticket but I
don't know how to link two tickets together so I've bundled both issues in
this ticket) that there should be an alternative entry point into a
Haskell program with the following type:
{{{
newtype MainM a = MainM (IO a) deriving (Functor, Monad, MonadIO)
_main :: MainM ()
}}}
with this default implementation:
{{{
_main = liftIO $ main
}}}
so that all existing programs will still work.
If _main is explicitly defined by the user, the user's definition should
be used instead, and any definition of "main" will have no special
significance. This would allow the API's initialization function to be
safely typed as:
{{{
runGraphicsWindow :: IO () -> MainM ()
_main = runGraphicsWindow $ do
onRender render
}}}
Thus the user would be prevented from making nested calls to the
initialization function.
More information about the Haskell-Cafe
mailing list