[Haskell-cafe] generalized, tail-recursive left fold that can

Wed Feb 20 05:23:34 CET 2013

> > That said, to express foldl via foldr, we need a higher-order
> > fold. There are various problems with higher-order folds, related to
> > the cost of building closures. The problems are especially severe 
> > in strict languages or strict contexts. Indeed,
> > 
> > foldl_via_foldr f z l = foldr (\e a z -> a (f z e)) id l z
> > 
> > first constructs the closure and then applies it to z. The closure has
> > the same structure as the list -- it is isomorphic to the
> > list. However, the closure representation of a list takes typically
> > quite more space than the list. So, in strict languages, expressing
> > foldl via foldr is a really bad idea. It won't work for big lists.
>
> If we unroll foldr once (assuming l is not empty), we'll get
>
>   \z -> foldr (\e a z -> a (f z e)) id (tail l) (f z (head l))
>
> which is a (shallow) closure. In order to observe what you describe (a
> closure isomorphic to the list) we'd need a language which does
> reductions inside closures.

I should've elaborated this point.

Let us consider monadic versions of foldr and foldl. First, monads,
sort of emulate strict contexts, making it easier to see when
closures are constructed. Second, we can easily add tracing.

import Control.Monad.Trans

-- The following is just the ordinary foldr, with a specialized
-- type for the seed: m z
foldrM :: Monad m =>
          (a -> m z -> m z) -> m z -> [a] -> m z
-- The code below is identical to that of foldr
foldrM f z [] = z
foldrM f z (h:t) = f h (foldrM f z t)

-- foldlM is identical Control.Monad.foldM 
-- Its code is shown below for reference.
foldlM, foldlM' :: Monad m =>
          (z -> a -> m z) -> z -> [a] -> m z
foldlM f z []    = return z
foldlM f z (h:t) = f z h >>= \z' -> foldlM f z' t

t1 = foldlM (\z a -> putStrLn ("foldlM: " ++ show a) >>
                     return (a:z)) [] [1,2,3]

{-
foldlM: 1
foldlM: 2
foldlM: 3
[3,2,1]
-}

-- foldlM' is foldlM expressed via foldrM
foldlM' f z l = 
    foldrM (\e am -> am >>= \k -> return $ \z -> f z e >>= k)
           (return return) l >>= \f -> f z

-- foldrM'' is foldlM' with trace printing
foldlM'' :: (MonadIO m, Show a) =>
          (z -> a -> m z) -> z -> [a] -> m z
foldlM'' f z l = 
    foldrM (\e am -> liftIO (putStrLn $ "foldR: " ++ show e) >>
            am >>= \k -> return $ \z -> f z e >>= k)
           (return return) l >>= \f -> f z

t2 = foldlM'' (\z a -> putStrLn ("foldlM: " ++ show a) >>
                       return (a:z)) [] [1,2,3]

{-
foldR: 1
foldR: 2
foldR: 3
foldlM: 1
foldlM: 2
foldlM: 3
[3,2,1]
-}

As we can see from the trace printing, first the whole list is
traversed by foldR and the closure is constructed. Only after foldr
has finished, the closure is applied to z ([] in our case), and
foldl's function f gets a chance to work. The list is effectively
traversed twice, which means the `copy' of the list has to be
allocated -- that is, the closure that incorporates the calls to
f e1, f e2, etc.