[Haskell-cafe] Observations about foldM
Jason McCarty
jmccarty at sent.com
Wed Aug 19 00:14:24 EDT 2009
[This message is literate Haskell]
Hi -cafe,
I was trying to use Monad.foldM in the State monad recently and ran into some
performance issues. They were eventually fixed with seq, but along the way I
made some discoveries, which I thought I would share.
The Report defines foldM as
foldM :: (Monad m) => (a -> b -> m a) -> a -> [b] -> m a
foldM f z [] = return z
foldM f z (x:xs) = f z x >>= \z' -> foldM f z' xs
It turns out that foldM is essentially a right fold. Define
f' = \h t -> flip f h >=> t. The operator (>=>) is (reverse) Kleisli
composition, defined in Control.Monad as f >=> g = \x -> f x >>= g. Now
foldM f z xs = foldr f' return xs z.
Proof. On the empty list,
foldM f z [] = return z
[definition of foldM]
= foldr f' return [] z
[definition of foldr]
Now fix a list xs and inductively assume that
foldM f z xs = foldr f' return xs z for all z. Then for any z and x,
foldM f z (x:xs) = f z x >>= \z' -> foldM f z' xs
[definition of foldM]
= f z x >>= \z' -> foldr f' return xs z'
[inductive hypothesis]
= f z x >>= foldr f' return xs
[eta-conversion(*)]
= flip f x z >>= foldr f' return xs
[definition of flip]
= (\z' -> flip f x z' >>= foldr f' return xs) z
[beta-reduction]
= (flip f x >=> foldr f' return xs) z
[definition of >=>]
= (f' x (foldr f' return xs)) z
[definition of f']
= foldr f' return (x:xs) z
[definition of foldr]
(*) The eta-conversion preserves strictness as long as return /= _|_, since
f >=> g is in WHNF.
Interestingly, foldM can also be written as a left fold. To see this, note that
it is a theorem that foldr f z xs = foldl f z xs as long as f is associative
and z is a unit for f. Since (>=>) is associative with unit return, we have
foldr (\h t -> flip f h >=> t) return = foldr (>=>) return . map (flip f)
[foldr/map fusion]
= foldl (>=>) return . map (flip f)
[by the theorem]
= foldl (\t h -> t >=> flip f h) return
[foldl/map fusion]
Therefore foldM f z xs = foldr f' return xs z = foldl f'' return xs z, where
f'' = \t h -> t >=> flip f h. But this doesn't mean these all have the same
performance characteristics.
Exercise for the reader: find examples where the foldr version performs better
than the foldl version and vice-versa. I've noticed this distinction between
State and [] in particular. They both may require use of seq in the function f
to prevent stack overflow.
Here is a module implementing these versions of foldM. It seems reasonable to
have versions with <=< and with f unflipped as well, but those could be derived
from the functions below.
> module FoldM (foldrM, foldr1M, foldlM, foldl1M) where
> import Control.Monad((>=>))
> -- foldrM nests to the right:
> -- foldrM f z [x1, ..., xn] = (flip f x1 >=> (... >=> (flip f xn)...)) $ z
> foldrM :: Monad m => (a -> b -> m a) -> a -> [b] -> m a
> foldrM f z xs = foldr (\h t -> flip f h >=> t) return xs z
Surprisingly, this may or may not be faster than the equivalent
foldrM f z = return z >>= foldr (\h t -> flip f h >=> t) return xs
depending on the monad. But they're both slower than foldM.
> foldr1M f (x:xs) = foldrM f x xs
> -- foldlM nests to the left:
> -- foldlM f z [x1, ..., xn] = (...(flip f x1) >=> ...) >=> flip f xn) $ z
> foldlM :: Monad m => (a -> b -> m a) -> a -> [b] -> m a
> foldlM f z xs = foldl (\t h -> t >=> flip f h) return xs z
Again, this is apparently faster or slower than either of
foldlM f z xs = return z >>= foldl (\t h -> t >=> flip f h) return xs
foldlM f = foldl (\t h -> t >>= flip f h) . return
depending on the monad.
> foldl1M f (x:xs) = foldlM f x xs
--
Jason McCarty <jmccarty at sent.com>
More information about the Haskell-Cafe
mailing list