Possible runtime overhead of wrapping the IO monad?

Brian Hulley brianh at metamilk.com
Wed Mar 29 21:50:06 EST 2006

Brian Hulley wrote:
> With -O2 enabled, __ccall_GC duma_vertex3f is indeed called directly
> instead of vertex3f, from a different module, so that proves that
> different monads can indeed be used to wrap IO operations without any
> performance penalty at all.

However I've just discovered there *is* a penalty for converting between 
callback functions that return a different monad from the IO monad. For 
example, if I have a RenderM monad that allows primitives to be drawn to the 
screen, and a callback:

       newtype RenderM a = RenderM (IO a) deriving (Functor, Monad, MonadIO)

       type RenderCallback = Int -> Int -> RenderM ()

where the intention is that the callback will take the width and height of 
the window and return a RenderM action, the problem is that because the FFI 
does not allow RenderM to appear in a foreign type, the actual render 
function has to be converted into a function which returns an IO action 
instead of a RenderM action eg by:

       type RenderCallbackIO = Int -> Int -> IO ()

       dropRenderM :: RenderCallback -> RenderCallbackIO
       dropRenderM f x y = let RenderM io = f x y in io

       foreign import ccall duma_onRender :: FunPtr RenderCallbackIO -> IO 

       foreign import ccall "wrapper" mkRenderCallbackIO
            :: RenderCallbackIO -> IO (FunPtr RenderCallbackIO)

       onRender :: RenderCallback -> IO ()
       onRender f = mkRenderCallbackIO (dropRenderM f) >>= duma_onRender

With -O2 optimization, GHC does not seem to be able to optimize out the call 
to dropRenderM even though this just changes the return value of f from 
RenderM (IO a) to IO a, so RenderM is not transparent after all:

Duma.onRender = \ (f :: Duma.RenderCallback)
    (eta :: GHC.Prim.State# GHC.Prim.RealWorld) ->
    case (# GHC.Prim.State# GHC.Prim.RealWorld, () #) 
              (Duma.dropRenderM f) eta
    of wild { (# new_s, a86 #) ->
    case (# GHC.Prim.State# GHC.Prim.RealWorld, () #) a86
    of ds { GHC.Ptr.FunPtr ds1 ->
    case (# GHC.Prim.State# GHC.Prim.RealWorld,
     () #) {__ccall_GC duma_onRender GHC.Prim.Addr#
     -> GHC.Prim.State# GHC.Prim.RealWorld
     -> (# GHC.Prim.State# GHC.Prim.RealWorld #)}
      ds1 new_s
    of wild1 { (# ds2 #) ->
    (# ds2, GHC.Base.() #)

I must admit I'm not at all clear how to read the -ddump-simpl output so I 
may have got this wrong, but since Duma.dropRenderM is mentioned, I think 
this means this has not been optimized out.

Therefore there does seem to be an overhead for using different monads at 
the moment (?)

Regards, Brian. 

More information about the Glasgow-haskell-users mailing list