Recursive functions and constant parameter closures (inlining/strictness analyzer question)

Tyson Whitehead twhitehead at gmail.com
Thu May 29 23:48:11 EDT 2008


main = print $ foldl' (+) 0 [1..]

with

foldl' f y xs = foldl' y xs
    where foldl' y []     = y
          foldl' y (x:xs) = foldl' (f y x) xs

runs indefinitely with very little memory consumption, while

foldl' f y [] = y
foldl' f y (x:xs) = foldl' f (f y x) xs

rapidly consumes all the machine's memory and dies.


Running ghc with  -ddump-stranal shows the outer foldl' of the first 
gets inlined into main as a call to the following specialized version of 
the inner foldl':

foldl'_sSY [ALWAYS LoopBreaker Nothing] :: GHC.Num.Integer
                                           -> [GHC.Num.Integer]
                                           -> GHC.Num.Integer
[Arity 2
 Str: DmdType SS]
foldl'_sSY =
  \ (y_aj7 [ALWAYS Just S] :: GHC.Num.Integer)
    (ds_dQl [ALWAYS Just S] :: [GHC.Num.Integer]) ->
    case ds_dQl of wild_XH [ALWAYS Just A] {
      [] -> y_aj7;
      : x_aja [ALWAYS Just S] xs_ajb [ALWAYS Just S] ->
        foldl'_sSY (GHC.Num.plusInteger y_aj7 x_aja) xs_ajb
    }


Doing the same with the second foldl' shows it to remains non-inlined 
and fully polymorphic:

foldl'_sQN [ALWAYS LoopBreaker Nothing] :: forall t_auW t_av2.
                                           (t_av2 -> t_auW -> t_av2) -> 
t_av2 -> [t_auW] -> t_av2
[Arity 3
 Str: DmdType LLS]
foldl'_sQN =
  \ (@ t_auW)
    (@ t_av2)
    (f_aj0 [ALWAYS Just L] :: t_av2 -> t_auW -> t_av2)
    (y_aj1 [ALWAYS Just L] :: t_av2)
    (ds_dQg [ALWAYS Just S] :: [t_auW]) ->
    case ds_dQg of wild_XK [ALWAYS Just A] {
      [] -> y_aj1;
      : x_aj5 [ALWAYS Just L] xs_aj6 [ALWAYS Just S] ->
        foldl'_sQN @ t_auW @ t_av2 f_aj0 (f_aj0 y_aj1 x_aj5) xs_aj6
    }


Forcing it inline with {-# INLINE foldl' #-} just specialized it:

foldl'_sSS [ALWAYS LoopBreaker Nothing] :: (GHC.Num.Integer
                                            -> GHC.Num.Integer
                                            -> GHC.Num.Integer)
                                           -> GHC.Num.Integer
                                           -> [GHC.Num.Integer]
                                           -> GHC.Num.Integer
[Arity 3
 Str: DmdType LLS]
foldl'_sSS =
  \ (f_aj0 [ALWAYS Just L] :: GHC.Num.Integer
                              -> GHC.Num.Integer
                              -> GHC.Num.Integer)
    (y_aj1 [ALWAYS Just L] :: GHC.Num.Integer)
    (ds_dQg [ALWAYS Just S] :: [GHC.Num.Integer]) ->
    case ds_dQg of wild_XI [ALWAYS Just A] {
      [] -> y_aj1;
      : x_aj5 [ALWAYS Just L] xs_aj6 [ALWAYS Just S] ->
        foldl'_sSS f_aj0 (f_aj0 y_aj1 x_aj5) xs_aj6
    }


I thought this was interesting.  Is it to be expected?  Am I right in 
interpreting this to mean it was just too much for the strictness 
analyzer.  I believe the first ultimately produces significantly 
superior code, so should one always write their recursive functions such 
that the constant (functional?) parameters are first captured in a closure?

In that vein, would it be useful if the compiler automatically 
transformed the second into the first?

Thanks!  -Tyson


More information about the Glasgow-haskell-users mailing list