Possible runtime overhead of wrapping the IO monad?
Brian Hulley
brianh at metamilk.com
Wed Mar 29 21:50:06 EST 2006
Brian Hulley wrote:
> With -O2 enabled, __ccall_GC duma_vertex3f is indeed called directly
> instead of vertex3f, from a different module, so that proves that
> different monads can indeed be used to wrap IO operations without any
> performance penalty at all.
However I've just discovered there *is* a penalty for converting between
callback functions that return a different monad from the IO monad. For
example, if I have a RenderM monad that allows primitives to be drawn to the
screen, and a callback:
newtype RenderM a = RenderM (IO a) deriving (Functor, Monad, MonadIO)
type RenderCallback = Int -> Int -> RenderM ()
where the intention is that the callback will take the width and height of
the window and return a RenderM action, the problem is that because the FFI
does not allow RenderM to appear in a foreign type, the actual render
function has to be converted into a function which returns an IO action
instead of a RenderM action eg by:
type RenderCallbackIO = Int -> Int -> IO ()
dropRenderM :: RenderCallback -> RenderCallbackIO
dropRenderM f x y = let RenderM io = f x y in io
foreign import ccall duma_onRender :: FunPtr RenderCallbackIO -> IO
()
foreign import ccall "wrapper" mkRenderCallbackIO
:: RenderCallbackIO -> IO (FunPtr RenderCallbackIO)
onRender :: RenderCallback -> IO ()
onRender f = mkRenderCallbackIO (dropRenderM f) >>= duma_onRender
With -O2 optimization, GHC does not seem to be able to optimize out the call
to dropRenderM even though this just changes the return value of f from
RenderM (IO a) to IO a, so RenderM is not transparent after all:
Duma.onRender = \ (f :: Duma.RenderCallback)
(eta :: GHC.Prim.State# GHC.Prim.RealWorld) ->
case (# GHC.Prim.State# GHC.Prim.RealWorld, () #)
Duma.mkRenderCallbackIO
(Duma.dropRenderM f) eta
of wild { (# new_s, a86 #) ->
case (# GHC.Prim.State# GHC.Prim.RealWorld, () #) a86
of ds { GHC.Ptr.FunPtr ds1 ->
case (# GHC.Prim.State# GHC.Prim.RealWorld,
() #) {__ccall_GC duma_onRender GHC.Prim.Addr#
-> GHC.Prim.State# GHC.Prim.RealWorld
-> (# GHC.Prim.State# GHC.Prim.RealWorld #)}
ds1 new_s
of wild1 { (# ds2 #) ->
(# ds2, GHC.Base.() #)
}
}
}
I must admit I'm not at all clear how to read the -ddump-simpl output so I
may have got this wrong, but since Duma.dropRenderM is mentioned, I think
this means this has not been optimized out.
Therefore there does seem to be an overhead for using different monads at
the moment (?)
Regards, Brian.
More information about the Glasgow-haskell-users
mailing list