[Haskell-cafe] Much faster complex monad stack based on CPS state

Nicu Ionita nicu.ionita at acons.at
Wed Sep 28 22:41:09 CEST 2011


Am 28.09.2011 14:05, schrieb Yves Parès:
> Interesting, so what have you used to get that speedup?
> A monad stack of ContT and State (*)? Just the Cont monad?
>
This is a module with a state monad transformer that I used before (the 
name STPlus is misleading - and sorry for the long email):

{-# LANGUAGE RankNTypes, MultiParamTypeClasses, FlexibleInstances #-}

module Search.SearchMonad (
     STPlus,
     return, (>>=),
     get, put, gets, modify,
     lift, liftIO,
     runSearch, execSearch
     ) where

import Control.Monad
import Control.Monad.State hiding (lift, gets, modify)

newtype STPlus s m a = STPlus { runSTPlus :: s -> m (a, s) }
{-# INLINE runSTPlus #-}

instance Monad m => Monad (STPlus s m) where
     {-# INLINE return #-}
     return v = STPlus (\s -> return (v, s))
     {-# INLINE (>>=) #-}
     (>>=)    = bindSTPlus

{-# INLINE bindSTPlus #-}
bindSTPlus :: Monad m => STPlus s m a -> (a -> STPlus s m b) -> STPlus s m b
bindSTPlus ms f = STPlus $ \s -> case runSTPlus ms s of
                                      m -> m >>= \(v', s') -> case f v' of
                                                                 fv -> 
runSTPlus fv s'

instance Monad m => MonadState s (STPlus s m) where
     {-# INLINE get #-}
     get   = STPlus $ \s -> return (s,  s)
     {-# INLINE put #-}
     put s = STPlus $ \_ -> return ((), s)

instance MonadIO m => MonadIO (STPlus s m) where
     {-# INLINE liftIO #-}
     liftIO = lift . liftIO

runSearch :: Monad m => STPlus s m a -> s -> m (a, s)
runSearch = runSTPlus

execSearch ms s = liftM snd $ runSearch ms s

{-# INLINE lift #-}
lift :: Monad m => m a -> STPlus s m a
lift m = STPlus $ \s -> m >>= \v -> return (v, s)

{-# INLINE gets #-}
gets :: Monad m => (s -> a) -> STPlus s m a
-- gets f = STPlus $ \s -> return (f s, s)
gets f = STPlus $ \s -> case f s of fs -> return (fs, s)

{-# INLINE modify #-}
modify :: Monad m => (s -> s) -> STPlus s m ()
modify f = STPlus $ \s -> case f s of fs -> return ((), fs)

And this is how the module looks now:

{-# LANGUAGE RankNTypes, MultiParamTypeClasses, FlexibleInstances #-}

module Search.SearchMonadCPS (
     STPlus,
     return, (>>=),
     get, put, gets, modify,
     lift, liftIO,
     runSearch, execSearch
     ) where

import Control.Monad
import Control.Monad.State hiding (lift, gets, modify)

newtype STPlus r s m a = STPlus { runSTPlus :: s -> (a -> s -> m r) -> m r }

instance Monad (STPlus r s m) where
     return a = STPlus $ \s k -> k a s
     c >>= f  = STPlus $ \s0 k -> runSTPlus c s0 $ \a s1 -> runSTPlus (f 
a) s1 k

instance MonadState s (STPlus r s m) where
     get   = STPlus $ \s k -> k s  s
     put s = STPlus $ \_ k -> k () s

instance MonadIO m => MonadIO (STPlus r s m) where
     {-# INLINE liftIO #-}
     liftIO = lift . liftIO

runSearch :: Monad m => STPlus (a, s) s m a -> s -> m (a, s)
runSearch c s = runSTPlus c s $ \a s0 -> return (a, s0)

execSearch ms s = liftM snd $ runSearch ms s

{-# INLINE lift #-}
lift :: Monad m => m a -> STPlus r s m a
lift m = STPlus $ \s k -> m >>= \a -> k a s

{-# INLINE gets #-}
gets :: Monad m => (s -> a) -> STPlus r s m a
gets f = STPlus $ \s k -> k (f s) s

{-# INLINE modify #-}
modify :: Monad m => (s -> s) -> STPlus r s m ()
modify f = STPlus $ \s k -> k () (f s)

And then I have (in different modules):

Client code (starting an PV search to a given depth):

type CtxIO = ReaderT Context IO

bestMoveCont :: Int -> MyState -> Maybe Int -> [Move] -> [Move] -> CtxIO 
IterResult
bestMoveCont ... = do
      ...
      ((sc, path, rmvsf), statf) <- runSearch (alphaBeta abc) stati
      ...

Search framework:

class Monad m => Node m where
     staticVal :: m Int  -- static evaluation of a node
     materVal  :: m Int  -- material evaluation (for prune purpose)
     genEdges :: Int -> Int -> Bool -> m ([Move], [Move])  -- generate 
all legal edges
     genTactEdges :: m [Move]  -- generate all edges in tactical positions
...
type Search m a = forall r. STPlus r PVState m a

alphaBeta :: Node m => ABControl -> m (Int, [Move], [Move])
alphaBeta abc = do
     let !d = maxdepth abc
         rmvs = Alt $ rootmvs abc
         lpv  = Seq $ lastpv abc
         searchReduced a b = pvRootSearch a       b      d lpv rmvs True
         searchFull        = pvRootSearch salpha0 sbeta0 d lpv rmvs False
     r <- if useAspirWin
...
pvRootSearch :: Node m => Int -> Int -> Int -> Seq Move -> Alt Move -> Bool
              -> Search m (Int, Seq Move, Alt Move)
...

And then the chess specific implementation of the game state in another 
module:

type Game r m = STPlus r MyState m
...
instance CtxMon m => Node (Game r m) where
     staticVal = staticVal0
     materVal  = materVal0
     genEdges = genMoves
...
genMoves :: CtxMon m => Int -> Int -> Bool -> Game r m ([Move], [Move])
genMoves depth absdp pv = do

Nicu
> (*) If so, were you using the strict version of State?
>
> Would it be possible to see the differences between the 2 versions of 
> you code?
>
> 2011/9/27 Nicu Ionita <nicu.ionita at acons.at <mailto:nicu.ionita at acons.at>>
>
>     Hello list,
>
>     Starting from this emails
>     (http://web.archiveorange.com/archive/v/nDNOvSM4JT3GJRSjOm9P) I
>     could refactor my code (a UCI chess engine, with complex
>     functions, in which the search has a complex monad stack) to run
>     twice as fast as with even some hand unroled state transformer! So
>     from 23-24 kilo nodes per second it does now 45 to 50 kNps! And it
>     looks like there is still some improvement room (I have to play a
>     little bit with strictness annotations and so on).
>
>     (Previously I tried specializations, then I removed a lot of
>     polimorphism, but nothing helped, it was like hitting a wall.)
>
>     Even more amazingly is that I could program it although I cannot
>     really understand the Cont & ContT, but just taking the code
>     example from Ryan Ingram (newtype ContState r s a = ...) and
>     looking a bit at the code from ContT (from the transformers
>     library), and after fixing some compilation errors, it worked and
>     was so fast.
>
>     I wonder why the transformers library does not use this kind of
>     state monad definition. Or does it, and what I got is just because
>     of the unrolling? Are there monad (transformers) libraries which
>     are faster? I saw the library kan-extensions but I did not
>     understand (yet) how to use it.
>
>     Nicu
>
>     _______________________________________________
>     Haskell-Cafe mailing list
>     Haskell-Cafe at haskell.org <mailto:Haskell-Cafe at haskell.org>
>     http://www.haskell.org/mailman/listinfo/haskell-cafe
>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/haskell-cafe/attachments/20110928/5f8c4d2d/attachment.htm>


More information about the Haskell-Cafe mailing list