# Recursive functions and constant parameter closures (inlining/strictness analyzer question)

Dan Doel dan.doel at gmail.com
Fri Jun 20 21:34:14 EDT 2008

```On Friday 20 June 2008, Max Bolingbroke wrote:
> Of course, if you have any suggestions for good heuristics based on
> your benchmarking experience then we would like to hear them! There
> was some discussion of this in the original ticket, and when implementing
> SAT I tried out the suggestions made there without good results
> (though to be perfectly honest I didn't really understand the
> motivations behind some of the suggestions made).

Well, as I said, I was apparently being over-aggressive in my manual
application of the transform, so I'm not yet the best person to ask, I
suppose. :)

However, if I had to pick something out of the air, I'd say this: always do
SAT when the argument in question is a function. This is actually the reason
I started doing it in the first place. I was working on sorting uvectors, and
had what I thought was pretty good performance, and then I did SAT on the
comparison function, and suddenly my code was running in half the time. Going
back to my sorting (testing on introsort), SAT for the array doesn't seem to
do much, but undoing the SAT for the comparison function causes heap
allocation to balloon, and everything gets at least a factor of 2 slower.
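
-- To make the comparison-function case concrete, here is an illustrative
-- sketch (insertion into a sorted list stands in for the introsort; the
-- names `insertPlain`/`insertSAT` are made up, and this is not the actual
-- uvector code). Before SAT, `cmp` rides along on every recursive call:

insertPlain :: (a -> a -> Ordering) -> a -> [a] -> [a]
insertPlain _   x []     = [x]
insertPlain cmp x (y:ys) = case cmp x y of
  GT -> y : insertPlain cmp x ys
  _  -> x : y : ys

-- After SAT, `cmp` is captured once by the local worker `go`, so each
-- recursive call passes one argument fewer, and GHC has more opportunity
-- to inline and specialise `go` for the particular `cmp`:

insertSAT :: (a -> a -> Ordering) -> a -> [a] -> [a]
insertSAT cmp = go
  where go x []     = [x]
        go x (y:ys) = case cmp x y of
          GT -> y : go x ys
          _  -> x : y : ys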

This also seems to match the other manual uses of SAT I've seen in the past.
For instance, in the GHC libraries you'll see:

foldr f = go
  where go z (x:xs) = f x (go z xs)
        go z []     = z

because it's significantly faster, while earlier in this thread, SPJ said
that:

xs ++ ys = let go []     = ys
               go (z:zs) = z : go zs
           in go xs

isn't much of a win. I don't have much data currently, but the places I *know*
it's been a win have been SAT on functions, while the places I *know* it's
been a loss have been SAT on unboxed data. Whether this is because SAT allows
more opportunity for inlining/specialization on functions, or because copying
functions is inherently more expensive, though, I don't know.

I'll keep an eye open from now on, though.

(On an additional tentative note, undoing SAT in the first example of my last
mail seemed to eliminate the difference between the native code generator
and -fvia-c -optc-O3. So whatever SAT was doing to make the code slower (no
extra heap allocation, just slower code), GCC may have been fixing it. My
off-the-top-of-the-head guess was that passing the arguments along kept them
in registers, and maybe GCC was putting the SATed arguments back into
registers, but as I said, I don't currently have the expertise to verify
that.)

Cheers,
-- Dan
```
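
The `foldr` definition quoted in the message is the SATed form; the before/after pair can be written out as a minimal, self-contained sketch (the names `foldrPlain` and `foldrSAT` are illustrative, not from the thread):

```haskell
-- Before SAT: the static argument `f` is passed unchanged on every
-- recursive call.
foldrPlain :: (a -> b -> b) -> b -> [a] -> b
foldrPlain f z (x:xs) = f x (foldrPlain f z xs)
foldrPlain _ z []     = z

-- After SAT: `f` is captured once by the local worker `go`, matching the
-- GHC-library version quoted above; each recursive call carries one
-- argument fewer.
foldrSAT :: (a -> b -> b) -> b -> [a] -> b
foldrSAT f = go
  where go z (x:xs) = f x (go z xs)
        go z []     = z
```

Both give the same results, e.g. `foldrSAT (+) 0 [1..10] == 55`; the difference is only in what the recursive worker has to pass along, which is what the discussion of inlining and specialisation turns on.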