Avoiding CAF's
Neil Mitchell
ndmitchell at gmail.com
Fri May 18 09:38:34 EDT 2007
Hi Ian and Simon,
> Ian said:
> Does the boxing not get optimised out?
> Is the FFI imported function exported from the module?
http://hpaste.org/1882 (replicated at the end of this message in case
the hpaste is not around forever, though clearly without the layout
and syntax colouring)
That's the main branch, which is the bit I want to make go faster, if
at all possible. The FFI call is not exported, I have module
Main(main) at the top. From what I can see, the function is being
called, then:
case Main.$wccall GHC.Prim.realWorld# of wild_X28 { (# ds_d2ad, ds1_d2ac #) ->
i.e. it has had an artificial box put around the answer. It may be
impossible to eliminate this, but if it is possible, I'd like to try.
The motivation for all this is:
http://neilmitchell.blogspot.com/2007/05/13-faster-than-ghc.html
> Simon said:
> That is indeed scary. Would you like to give a small example of such a program?
From the above example, you can note that the first argument to
Main.$sprelude_942_ll107 is an Int (v2_aVr), which is entirely ignored
on the recursive branch, and then on the terminating branch is case'd
in a pointless way (this case comes from a seq). If this parameter
could be removed, I suspect a speedup would result.
The reason this parameter is introduced comes from the code:
overlay_get_char h = inlinePerformIO (getCharIO h)

foreign import ccall unsafe "stdio.h getchar" getchar :: IO CInt

{-# NOINLINE getCharIO #-}
getCharIO h = do
    c <- getchar
    return $ if c == (-1) then h `seq` (-1) else fromIntegral c
I have artificially threaded h through getCharIO, and deliberately
added a pointless seq, to ensure that the definition inside is not
floated up. If I remove the h `seq` then GHC removes the argument from
overlay_get_char, which turns that into a CAF, which then breaks the
required semantics.
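
To make the CAF problem concrete, here is a minimal sketch of the same
effect in isolation. It is not the benchmark code: a mutable counter
stands in for getchar, unsafePerformIO from System.IO.Unsafe is swapped
in for inlinePerformIO, and the names counter, nextIO, nextCaf and
nextFun are all made up for illustration. As with the original trick,
the behaviour of the function version depends on what the optimiser
chooses to do, so treat it as an illustration rather than a guarantee.

```haskell
module Main (main) where

import Data.IORef (IORef, newIORef, modifyIORef', readIORef)
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical stand-in for the C input stream: a global counter.
{-# NOINLINE counter #-}
counter :: IORef Int
counter = unsafePerformIO (newIORef 0)

-- Stand-in for getCharIO: bumps the counter and returns its new value.
-- The dummy argument h is only touched on a branch that is never taken
-- here, mirroring the pointless seq in the original code.
{-# NOINLINE nextIO #-}
nextIO :: Int -> IO Int
nextIO h = do
    modifyIORef' counter (+ 1)
    n <- readIORef counter
    return $ if n < 0 then h `seq` n else n

-- No argument: this is a CAF, so the effect runs at most once and the
-- result is shared by every use.
{-# NOINLINE nextCaf #-}
nextCaf :: Int
nextCaf = unsafePerformIO (nextIO 0)

-- With the dummy argument: each call is a separate application, so the
-- effect is normally re-run per call (not guaranteed by the language).
{-# NOINLINE nextFun #-}
nextFun :: Int -> Int
nextFun h = unsafePerformIO (nextIO h)

main :: IO ()
main = do
    -- Both components are the same number: the CAF was evaluated once.
    print (nextCaf, nextCaf)
    -- The two components normally differ, since each call bumps the counter.
    print (nextFun 0, nextFun 1)
```

If the argument were dropped from nextFun, it would behave like nextCaf,
which is exactly what would happen to overlay_get_char: every "read"
would return the first character.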
I realise all of this trickery is against the spirit of a pure
functional language, and is making assumptions that are not required
to remain true. Right now I just want the fastest possible benchmarks
though.
Thanks
Neil
More information about the Glasgow-haskell-users
mailing list