Unwanted eta-expansion

Sat Oct 8 18:32:18 CEST 2011

On Tue, Oct 4, 2011 at 2:39 AM, Roman Cheplyaka <roma at ro-che.info> wrote:
> Suppose I want a foldl which is evaluated many times on the same
> list but with different folding functions.
>
> I would write something like this, to perform pattern-matching on the
> list only once:
>
>    module F where
>    myFoldl :: [a] -> (b -> a -> b) -> b -> b
>    myFoldl [] = \f a -> a
>    myFoldl (x:xs) = let y = myFoldl xs in \f a -> y f (f a x)
>
> However, for some reason GHC eta-expands it back.

It seems to be a common misconception that eta-abstracting your
functions in this way will speed up or otherwise improve your code.

Simon PJ has already provided a good explanation of why GHC eta
expands.  Let me take another tack and describe why the code you wrote
without eta expansion probably doesn't provide you with any actual
benefit.  Roughly speaking, you're creating a chain of closures whose
contents exactly describe the contents of your list (ie you've created
something that's isomorphic to your original list structure), and so
you should expect no benefit at all.

Let's consider a specific concrete call to myFoldl:

myFoldl (1:2:3:4:[]) = r1
  where
    rt = \ f a -> a
    x4 = 4
    r4 = \ f a -> rt f (f a x4)
    x3 = 3
    r3 = \ f a -> r4 f (f a x3)
    x2 = 2
    r2 = \ f a -> r3 f (f a x2)
    x1 = 1
    r1 = \ f a -> r2 f (f a x1)

Each of the bindings above is a separate heap-allocated object.  Now
consider the data representation for the function closures r1..r4.
Each such closure has two free variables: the previous closure (r2..r4
or rt) and the value of the current list element (x1..x4).  If you
write that out schematically in box-and-pointer notation, you'll
quickly see that the result has the exact same memory structure as
your original list.

Now, you'll probably want to point out that transforming your list
into a string of closures like this eliminates the need to pattern
match on the list structure.  This is true, but that's because you've
replaced that pattern match with a call to an unknown closure.  That's
because in most circumstances we'll get just one copy of the code for
r1..r4.  GHC will actually generate code a bit like the following [*]:

myFoldl (1:2:3:4:[]) = r1
  where
    closure x r = \ f a -> r f (f a x)
    rt = \ f a -> a
    x4 = 4
    r4 = closure x4 rt
    x3 = 3
    r3 = closure x3 r4
    x2 = 2
    r2 = closure x2 r3
    x1 = 1
    r1 = closure x1 r2

So there are two different kinds of closure that get passed to r in closure x r:
1) rt
2) a call to closure xn r(n+1)

Distinguishing these two cases (even if just by branching to a closure
code pointer) leads to overheads comparable to (and generally larger
than) those of pattern matching.  GHC used to distinguish (:) and []
by branching to an unknown function pointer (exactly as is happening
here) and switched to pointer tagging instead because it was faster.

All this assumes everything has been completely evaluated already.
Laziness drowns out most of the relative advantages and disadvantages
here (and there's an even-more-involved explanation of why you might
lose strictness information in your eta-abstracted function, making
things worse still).

It also assumes you are not able to specialize your code to a
particular higher-order function.  Any time you can do that, it's
potentially very beneficial.  For example, the following *might* be a
worthwhile definition of foldl:

{-# INLINE foldl #-}
foldl f = loop
  where loop a [] = a
            loop a (x:xs) = loop (f a x) xs

-Jan-Willem Maessen

[*] Note that GHC actually treats the free variables of the closure (x
and r in this case) a little bit specially, so the code isn't
necessarily *literally* equivalent to what I've shown here, but it's
pretty close.