# Recursive functions and constant parameter closures (inlining/strictness analyzer question)

Dan Doel dan.doel at gmail.com
Fri Jun 20 21:34:14 EDT 2008

```On Friday 20 June 2008, Max Bolingbroke wrote:
> Of course, if you have any suggestions for good heuristics based on
> your benchmarking experience then we would like to hear them! There
> was some discussion of this in the original ticket, and when implementing
> SAT I tried out the suggestions made there without good results
> (though to be perfectly honest I didn't really understand the
> motivations behind some of the suggestions made).

Well, as I said, I was apparently being over-aggressive in my manual
application of the transform, so I'm not yet the best person to ask, I
suppose. :)

However, if I had to pick something out of the air, I'd say this: always do
SAT when the argument in question is a function. This is actually the reason
I started doing it in the first place. I was working on sorting uvectors, and
had what I thought was pretty good performance, and then I did SAT on the
comparison function, and suddenly my code was running in half the time. Going
back to my sorting (testing on introsort), SAT for the array doesn't seem to
do much, but undoing the SAT for the comparison function causes heap
allocation to balloon, and everything gets at least a factor of 2 slower.
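
-- To make the comparison-function case concrete, here is an illustrative
-- sketch (insertion into a sorted list stands in for the introsort; the
-- names `insertPlain`/`insertSAT` are made up, and this is not the actual
-- uvector code). Before SAT, `cmp` rides along on every recursive call:

insertPlain :: (a -> a -> Ordering) -> a -> [a] -> [a]
insertPlain _   x []     = [x]
insertPlain cmp x (y:ys) = case cmp x y of
  GT -> y : insertPlain cmp x ys
  _  -> x : y : ys

-- After SAT, `cmp` is captured once by the local worker `go`, so each
-- recursive call passes one argument fewer, and GHC has more opportunity
-- to inline and specialise `go` for the particular `cmp`:

insertSAT :: (a -> a -> Ordering) -> a -> [a] -> [a]
insertSAT cmp = go
  where go x []     = [x]
        go x (y:ys) = case cmp x y of
          GT -> y : go x ys
          _  -> x : y : ys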

This also seems to match the other manual uses of SAT I've seen in the past.
For instance, in the GHC libraries you'll see:

foldr f = go
  where go z (x:xs) = f x (go z xs)
        go z []     = z

because it's significantly faster, while earlier in this thread, SPJ said
that:

xs ++ ys = let go []     = ys
               go (z:zs) = z : go zs
           in go xs

isn't much of a win. I don't have much data currently, but the places I *know*
it's been a win have been SAT on functions, while the places I *know* it's
been a loss have been SAT on unboxed data. Whether this is because SAT allows
more opportunity for inlining/specialization on functions, or because copying
functions is inherently more expensive, though, I don't know.

I'll keep an eye open from now on, though.

(On an additional tentative note, undoing SAT in the first example of my last
mail seemed to eliminate the difference between the native code generator
and -fvia-c -optc-O3. So whatever SAT was doing to make the code slower (no
extra heap allocation, just slower code), GCC may have been fixing it. My
off-the-top-of-the-head guess was that passing the arguments along kept them
in registers, and maybe GCC was putting the SATed arguments back into
registers, but as I said, I don't currently have the expertise to verify
that.)

Cheers,
-- Dan
```
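
The `foldr` definition quoted in the message is the SATed form; the before/after pair can be written out as a minimal, self-contained sketch (the names `foldrPlain` and `foldrSAT` are illustrative, not from the thread):

```haskell
-- Before SAT: the static argument `f` is passed unchanged on every
-- recursive call.
foldrPlain :: (a -> b -> b) -> b -> [a] -> b
foldrPlain f z (x:xs) = f x (foldrPlain f z xs)
foldrPlain _ z []     = z

-- After SAT: `f` is captured once by the local worker `go`, matching the
-- GHC-library version quoted above; each recursive call carries one
-- argument fewer.
foldrSAT :: (a -> b -> b) -> b -> [a] -> b
foldrSAT f = go
  where go z (x:xs) = f x (go z xs)
        go z []     = z
```

Both give the same results, e.g. `foldrSAT (+) 0 [1..10] == 55`; the difference is only in what the recursive worker has to pass along, which is what the discussion of inlining and specialisation turns on.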