[Haskell-cafe] gcc as a portable assembler: tail calls vs trampoline
Loup Vaillant
loup.vaillant at gmail.com
Tue Jul 1 10:57:38 EDT 2008
Hello, everyone,
I'm not sure this is the good list for this posting: I plan to write
yet another lazy graph reduction machine, using gcc, mainly for fun.
My concern right now is about one particular aspect of gcc: the tail
calls vs trampoline deathmatch.
First some clarifications:
-> By tail call, I mean a properly optimized one, which doesn't grow
the stack. Basically a jump.
-> By trampoline, I mean the infinite loop referred to as a "tiny
interpreter" in the STG paper [1]. Instead of (tail) calling a
function, we return its pointer, which will then be used for the
actual call.
I've read that trampoline calls are about 3 times as expensive as tail
calls. Instead of a (direct) jump, you have a return, followed by an
indirect call. However, testing this with gcc, I didn't manage to show
any measurable difference. What am I missing? I used -O2 on both
benchmarks, and turned off inlining (using the __noinline__
attribute). I'm running the program on a core2Duo in 32 bits mode,
using GNU/Linux Ubuntu. Should I also make separate modules to prevent
whole program analysis from gcc?
Anyway, I send you the code below, so you can test and review it. Enjoy!
[1]: http://research.microsoft.com/Users/simonpj/Papers/spineless-tagless-gmachine.ps.gz
/* This program tests the performance difference between trampoline
* calls and tail calls. (In the STG and related papers, trampoline
* is referred to as "tiny interpreter")
*
* depending on the mode (tail call or trampoline), some code changes:
* the return type of the code blocks, and the calls to other blocks
* (direct tail calls or return to trampoline). Hence the macros ENTER
* and RETURN.
*
* To compile it,type one of the two following commands:
* gcc -O2 sibcall.c -D SIBCALL
* gcc -O2 sibcall.c
*
* The first one will make the program use sibling calls (enabled at
* O2), and the second one will fall back to trampoline calls
*/
#ifdef SIBCALL // tail call mode
#define ENTER(f) (f())
#define RETURN void
#define MAIN(start) start()
#else // trampoline call mode
#define ENTER(f) return f
#define RETURN void*
#define MAIN(start) do{ void * (*f)(void) = start; \
while (1) \
f = (*f)(); \
} while (0)
#endif
// Now, we can begin the generic code
#include <stdlib.h>
#define LIMIT 1000000000 // one billion
int counter = 0;
RETURN closure0(void);
RETURN closure1(void);
int main(int argc, char ** argv)
{
MAIN(closure0);
return 0;
}
__attribute__ ((__noinline__))
RETURN closure0(void)
{
if (counter >= LIMIT) exit(0);
counter++;
ENTER(closure1);
}
__attribute__ ((__noinline__))
RETURN closure1(void)
{
if (counter >= LIMIT) exit(0);
counter++;
ENTER(closure0);
}
More information about the Haskell-Cafe
mailing list