[Haskell-beginners] How Best to Deal with Nested Monads?

Thu Sep 15 19:59:30 CEST 2011

> However, I think you are abusing MaybeT
> here.  MaybeT is a transformer which adds failure capability to a
> monad m: a MaybeT m computation can now fail at any point, causing the
> entire computation to fail.

MaybeT m values fail monadically in the same sense that Maybe fails, and the
failure action of the inner m monad is accessed by using runMaybeT. For
example: say we've got a function that opens a file and deserializes an
object of type a from inside:

f :: FilePath -> MaybeT IO a
f filePath = MaybeT $ do
    fileContents <- readFile filePath
    return $ deserialize fileContents

deserialize :: String -> Maybe a
deserialize str = ...

So f can fail either in Maybe (via deserialize) or in IO (via readFile). If
it fails by Maybe, then we get the monadic behavior we expect from Maybe. If
it fails by IO, then the next time we pass it to runMaybeT we'll get the IO
exception.

The Monad instance for MaybeT m is also enlightening:

newtype MaybeT m a = MaybeT { runMaybeT :: m (Maybe a) }

instance (Monad m) => Monad (MaybeT m) where
    fail _ = MaybeT (return Nothing)
    return = lift . return
    x >>= f = MaybeT $ do
        v <- runMaybeT x
        case v of
            Nothing -> return Nothing
            Just y  -> runMaybeT (f y)

So we see that (MaybeT m a) is just a wrapper around (m (Maybe a)). When we
do anything monadic with a MaybeT m value, it gets cracked open and the m
monad is evaluated first.

> However, your Maybe values do *not*
> indicate failure of the entire computation: they just indicate that
> individual values are to be filtered out. The result is that you have
> to be very careful to leave the MaybeT computations inside the list,
> and then manually run them one by one.

They indicate a possibility of failure of the kind that can be filtered out,
as opposed to a failed IO, which indicates some external problem. It's all
gravy because we have these two ways to fail, with the desired Maybe
behavior as the "outer" monad.

> From the types one might
> expect that you could simply do away with sequenceMTAll and use 'mapM'
> in place of 'map', but this would have the wrong semantics.

You're right, and that's the reason for my writing sequenceMTAll.

> Do you have any concrete
> examples where your approach above is either more efficient or more
> incremental than the version I originally gave?

I don't grok Haskell complexity analysis very intuitively yet, so I can't
say anything concrete about efficiency without writing it out like a proof.
But regarding incremental processing: let's look at the mapM solution:

lst >>= (mapM getMB >=> (return . catMaybes) >=> mapM getC)

Written out in do notation (and with the mapM's expanded a bit), it's:

f :: m [a] -> m [c]
f lst = do
    lstA <- lst
    lstMB <- sequence $ map getMB lstA
    let lstB = catMaybes lstMB
    lstC <- sequence $ map getC lstB
    return lstC

We've got two calls to sequence here, so we're evaluating getMB x for every
x in lstA before we ever think about evaluating
getC on any of those results. So we can't get a final result for the first
element of the list until we've computed partial results for all the
elements.

In contrast, using MaybeT lets us get rid of catMaybes and actually compose
getMB with getC. Then we can just sequence the list once (using
sequenceMTAll which mashes up catMaybes semantics with sequence
functionality).

Best,
Mike S Craig
(908) 328 8030

On Thu, Sep 15, 2011 at 12:41 PM, Brent Yorgey <byorgey at seas.upenn.edu>wrote:

> On Thu, Sep 15, 2011 at 01:49:55AM -0400, Michael Craig wrote:
> > Alright, I return from the land of transformers with a solution:
> >
> >
> > import Control.Monad.Trans.Class
> > import Control.Monad.Trans.Maybe
> >
> > main = sequenceMTAll (lst' >>= map (getB' >=> getC'))
> >
> > lst' = lift lst :: MaybeT m [a]
> > getB' = MaybeT . getMB
> > getC' = lift . getC
> >
> > sequenceMTAll :: (Monad m) => [MaybeT m a] -> m [a]
> > sequenceMTAll (x:xs) = do
> >     y <- runMaybeT x
> >     case y of
> >       Nothing -> sequenceMTAll xs
> >       Just z -> sequenceMTAll xs >>= return . (z:)
> > sequenceMTAll [] = return []
>
> This is clever: moving the Maybe processing into the monad lets the
> two maps fuse into one.  However, I think you are abusing MaybeT
> here.  MaybeT is a transformer which adds failure capability to a
> monad m: a MaybeT m computation can now fail at any point, causing the
> entire computation to fail.  However, your Maybe values do *not*
> indicate failure of the entire computation: they just indicate that
> individual values are to be filtered out.  The result is that you have
> to be very careful to leave the MaybeT computations inside the list,
> and then manually run them one by one.  From the types one might
> expect that you could simply do away with sequenceMTAll and use 'mapM'
> in place of 'map', but this would have the wrong semantics.
>
> I am also still not convinced that this actually yields any sort of
> benefit, although I must confess that I often get confused thinking
> about strictness of monadic computations.  Do you have any concrete
> examples where your approach above is either more efficient or more
> incremental than the version I originally gave?
>
> -Brent
>
> >
> >
> > (Of course in real code I'd just modify lst, getMB, getC, etc. to fit the
> > new types. The crux here is sequenceMTAll.)
> >
> > Am I abusing Maybe too much?
> >
> > Mike S Craig
> > (908) 328 8030
> >
> >
> > On Thu, Sep 15, 2011 at 1:15 AM, Brent Yorgey <byorgey at seas.upenn.edu
> >wrote:
> >
> > > On Wed, Sep 14, 2011 at 09:21:04PM -0400, Michael Craig wrote:
> > > > Brent: Thanks for reminding me about (>=>). Far more readable! But
> > > regarding
> > > > the sequence thing: I can think of all sorts of reasons why we'd want
> to
> > > do
> > > > a single traversal. How about when lst is long or infinite? In
> general,
> > > it's
> > > > more useful to produce output incrementally than all at once at the
> > > > end.
> > >
> > > Yes, producing output incrementally is great!  My point is that
> > > usually laziness will take care of it for you, without having to
> > > worry about it specifically.
> > >
> > > In this particular case, most monads will not actually allow
> > > incremental processing anyway.  For example, suppose m = Maybe.  Then
> > > when mapping getMB over lst, any particular element could cause the
> > > whole computation to fail.  So we cannot output anything based on the
> > > first elements in the list until we have processed the entire list,
> > > because until we get to the very end of the list we do not know
> > > whether to begin by outputting 'Just' or 'Nothing'.
> > >
> > > -Brent
> > >
> > > >
> > > > Mike S Craig
> > > > (908) 328 8030
> > > >
> > > >
> > > > On Wed, Sep 14, 2011 at 8:18 PM, Brent Yorgey <
> byorgey at seas.upenn.edu
> > > >wrote:
> > > >
> > > > > On Wed, Sep 14, 2011 at 06:48:29PM -0400, Michael Craig wrote:
> > > > > > Say we've got these types
> > > > > >
> > > > > > lst :: m [a]
> > > > > > getMB :: a -> m (Maybe b)
> > > > > > getC :: b -> m c
> > > > > >
> > > > > > and we want to map getMB and getC over the elements of lst, all
> the
> > > while
> > > > > > discarding elements x where getMB x == Nothing.
> > > > > >
> > > > > > (This could be generalized more by replacing Maybe with some
> monad
> > > m',
> > > > > but
> > > > > > let's run with Maybe because it's easy to talk about.)
> > > > > >
> > > > > > The best I've got (after some help on IRC) is this
> > > not-so-easy-to-read
> > > > > > oneliner:
> > > > > >
> > > > > > lst >>= (\x -> mapM (liftM (liftM getC) (getMB x)) >>= sequence
> > > > > > . catMaybes
> > > > >
> > > > > How about this:
> > > > >
> > > > >  lst >>= (mapM getMB >=> (return . catMaybes) >=> mapM getC)
> > > > >
> > > > > Everyone always forgets about (>=>).
> > > > >
> > > > > > This is hard to read, but it's also bad because we run sequence
> twice
> > > > > (once
> > > > > > inside of mapM). If we want to do multiple things to each element
> of
> > > lst,
> > > > > it
> > > > > > would be nice to process each element completely before moving on
> to
> > > the
> > > > > > next.
> > > > >
> > > > > I wouldn't worry about running sequence twice.  Processing things
> by
> > > > > chaining whole-structure transformations is the Haskell Way (tm).
>  All
> > > > > that business about "doing only one traversal" is for people
> > > > > programming in strict languages to worry about. The compiler can
> often
> > > > > turn a chain of wholesale transformations into a single traversal
> > > > > anyway.  In short, I see no particular reason why it is "nice" to
> > > > > process each element completely before moving on.  Isn't it nicer
> to
> > > > > be able to think in a more modular style?
> > > > >
> > > > > -Brent
> > > > >
> > > > > _______________________________________________
> > > > > Beginners mailing list
> > > > > Beginners at haskell.org
> > > > > http://www.haskell.org/mailman/listinfo/beginners
> > > > >
> > >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/beginners/attachments/20110915/5986fc33/attachment-0001.htm>