Proposal: Add strictly filtering filterM' to Control.Monad

Wed Sep 3 23:45:48 UTC 2014

We have filterM, that looks like this (although I'm trying to get it
changed to use foldr):

filterM :: (Monad m) => (a -> m Bool) -> [a] -> m [a]
filterM _ [] = return []
filterM p (x:xs) = do
  flg <- p x
  ys <- filterM p xs
  return (if flg then x:ys else ys)

This can cause a problem in some cases when used with IO, strict ST, or
similar. In particular, it doesn't actually discard any elements of the
list until all the predicate computations have run. Rather, it accumulates
a chain of closures containing elements of the input and (possibly
unevaluated) results of the (p x) computations. If the result of (p x) is
usually False, so the input list is much longer than the output list, this
can lead a program to run out of memory unnecessarily. This behavior cannot
be avoided without changing semantics in some cases, as Dan Doel pointed
out to me. In particular, a computation (p x) may run successfully to
completion but produce a bottom value.

PROPOSAL

Add a function like this:

filterM' :: (Monad m) => (a -> m Bool) -> [a] -> m [a]
filterM' p = go []
  where
    go _   [] = return []
    go acc (x:xs) =
      do
        flq <- p x
        ys <- if flq then go (x:acc) xs else go acc xs
        return (reverse acc)

This can be twisted somewhat harder into a right fold if we want:

filterM' :: (Monad m) => (a -> m Bool) -> [a] -> m [a]
filterM' p xs = foldr go return xs []
  where
    go x r = \acc -> do
             flq <- p x
             ys <- if flq then r (x:acc) else r acc
             return (reverse acc)

The advantage of this function is that instead of building a chain of
closures proportional to the size of its *input* (but likely larger), it
builds a list equal in size to its (possibly much smaller) *output*.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/libraries/attachments/20140903/ff98e27a/attachment-0001.html>