[Haskell-cafe] Fun with the ST monad

Fri Feb 25 03:16:23 CET 2011

On 2/24/11 3:45 PM, Andrew Coppin wrote:
> OK, so I had a function that looks like
>
> transform :: [Word8] -> [Word16]
>
> It works nicely, but I'd like to use mutable state inside. No problem!
> Use the ST monad. Something like
>
> transform :: [Word8] -> [Word16]
> transform xs = runST (work xs)
> where
> work :: [Word8] -> ST s [Word16]
>
> Ah, yes, well there is one *small* problem... If you do that, the
> function becomes too strict.

Given only this specification, the problem is overconstrained, which is 
why you get too much strictness. That is, your types are too general to 
allow you to do what you want (e.g., they allow the first Word16 to 
depend on the last Word8). What is it that transform is supposed to do?

As for figuring out how to do it, first consider the following:

     -- | @fix (PreList a) == [a]@ modulo extra bottoms.
     type PreList a b = Maybe (a,b)

     fmap_PreList :: (b -> c) -> PreList a b -> PreList a c
     fmap_PreList f Nothing     = Nothing
     fmap_PreList f (Just(a,b)) = Just (a, f b)

     enlist :: PreList a [a] -> [a]
     enlist Nothing       = []
     enlist (Just (x,xs)) = x:xs

     prelist :: [a] -> PreList a [a]
     prelist []     = Nothing
     prelist (x:xs) = Just (x,xs)

     -- | Monadic version of @Data.List.unfoldr at .
     unfoldM :: (Monad m) => (b -> m (PreList a b)) -> (b -> m [a])
     unfoldM coalgM b = do
         m <- coalgM b
         case m of
             Nothing     -> return []
             Just (a,b') -> (a:) `liftM` unfoldM coalgM b'

Assuming that we can generate the elements of [Word16] incrementally, 
then this function almost gives us what we need. The problem is that 
even though the (a:) part is pure by the time we reach it, we can't see 
that fact because of the liftM pushing it down into the monad again. To 
put this a different way, consider the following distributive law:

     distList :: (Monad m) => m (PreList a (m [a])) -> m [a]
     distList mx_mxs = do
         maybe_x_mxs <- mx_mxs
         case maybe_x_mxs of
             Nothing      -> return []
             Just (x,mxs) -> (x:) `liftM` mxs

     {- N.B.,
     unfoldM coalgM == distList . mfmap (unfoldM coalgM) . coalgM
         where
         mfmap :: (b -> c) -> m (PreList a b) -> m (PreList a c)
         mfmap = liftM . fmap_PreList
     -}

In order to factor out the (a:) constructor we need to find some way of 
*not* using distList in unfoldM. That way, the monadic effects 
associated with the head of the list can be separated from the effects 
associated with the tail of the list. Unfortunately, the obvious attempt 
doesn't typecheck.

     unfoldM'
         :: (Monad m)
         => (b -> m (PreList a b))
         -> b -> fix (\rec -> m (PreList a rec))
     unfoldM' coalgM = mfmap (unfoldM' coalgM) . coalgM

One problem is the fact that we can't write infinite types, though we 
can get around that easily by using a newtype. The other problem is that 
we need a function for running ST in a way that allows nested ST to be 
run at some later time. Something like,

     semirunST :: (Functor f)
               => (forall s. ST s (f (ST s a))) -> f (ST s a)

You can't do that in ST, since allowing this would mean that multiple 
evaluations of the (ST s a) embedded in the result could return 
different answers and communicate with one another[1]. However, if you 
use another monad for encapsulating memory regions (e.g., ST RealWorld, 
STM, IO) then you can probably get away with it.

But you're probably better off using State[2] instead of ST. Or 
converting the whole thing to an iteratee-style computation which is 
more explicit about the type of stream processing involved and thus what 
kinds of laziness are possible.

[1] Though it would be safe to combine it with the newtype:

     newtype Compose f g x = Compose (f (g x))
     newtype Fix f = Fix (f (Fix f))
     interleaveST :: (Functor f) => Fix (Compose (ST s) f) -> Fix f

But given the API for ST, you can't define interleaveST in a way that 
actually interleaves evaluation instead of using a distributive law for 
pulling the (ST s) up over f and then running everything at once.

[2] State is easy:

     runfoldState :: (b -> State s (PreList a b)) -> b -> s -> [a]
     runfoldState coalgM = evalState . rec
         where
         rec b = do
             m <- coalgM b
             case m of
                 Nothing     -> return []
                 Just (a,b') -> do
                     s <- get
                     return (a : evalState (rec b') s)

-- 
Live well,
~wren