[Haskell-cafe] Much faster complex monad stack based on CPS state
Nicu Ionita
nicu.ionita at acons.at
Wed Sep 28 22:55:07 CEST 2011
Am 28.09.2011 02:35, schrieb Ryan Ingram:
> My guess is that Cont plays really nicely with GHC's inliner, so
> things that end up looking like
>
> return x >>= \y -> ...
>
> get optimized really well
>
> return x >>= f
> -- inline >>=
> = ContState $ \s0 k -> runCS (return x) s0 $ \a s1 -> runCS (f a) s1 k
> -- inline return
> = ContState $ \s0 k -> runCS (ContState $ \s2 k2 -> k2 x s2) s0 $
> \a s1 -> runCS (f a) s1 k
> -- runCS record selector
> = ContState $ \s0 k -> (\s2 k2 -> k2 x s2) s0 $ \a s1 -> runCS (f
> a) s1 k
> -- beta
> = ContState $ \s0 k -> (\k2 -> k2 x s0) $ \a s1 -> runCS (f a) s1 k
> -- beta
> = ContState $ \s0 k -> (\a s1 -> runCS (f a) s1 k) x s0
> -- beta
> = ContState $ \s0 k -> runCS (f x) s0 k
>
> and then further inlining of f can take place.
I was even thinking - and this would have been the next idea to try if I
couldn't get your example code to run so fast - to define some rules for
the state monad (transformer) to "fuse" such expressions like
m >>= f >>= g = ...
or even
modify f >>= modify g = modify (g . f)
and perhaps other variations, so that it would perhaps end up in some
nice combination of f and g, avoiding the intermediate tuples, hopefully
with better performance. But then I did not follow it, and I want to
concentrate on further improvements with the new code. The way is still
long, because the top engines (written in C or C++) can do about 10 mil
nps on my machine :-)
Nicu
More information about the Haskell-Cafe
mailing list