[Haskell-cafe] gcc as a portable assembler: tail calls vs trampoline

Loup Vaillant loup.vaillant at gmail.com
Tue Jul 1 11:43:52 EDT 2008


2008/7/1 Robin Green <greenrd at greenrd.org>:
> I think you should use the -march flag to gcc. Otherwise you are tying
> the compiler to an ancient version of the x86 instruction set (386).

OK, let's try... Done. I used -march=nocona. I couldn't do core2 (not
available on my gcc 4.1.2 yet). Did not notice any difference...


> On Tue, 1 Jul 2008 16:57:38 +0200
> "Loup Vaillant" <loup.vaillant at gmail.com> wrote:
>
>> Hello, everyone,
>> I'm not sure this is the good list for this posting: I plan to write
>> yet another lazy graph reduction machine, using gcc, mainly for fun.
>> My concern right now is about one particular aspect of gcc: the tail
>> calls vs trampoline deathmatch.
>>
>> First some clarifications:
>> -> By tail call, I mean a properly optimized one, which doesn't grow
>> the stack. Basically a jump.
>> -> By trampoline, I mean the infinite loop referred to as a "tiny
>> interpreter" in the STG paper [1]. Instead of (tail) calling a
>> function, we return its pointer, which will then be used for the
>> actual call.
>>
>> I've read that trampoline calls are about 3 times as expensive as tail
>> calls. Instead of a (direct) jump, you have a return, followed by an
>> indirect call. However, testing this with gcc, I didn't manage to show
>> any measurable difference. What am I missing? I used -O2 on both
>> benchmarks, and turned off inlining (using the __noinline__
>> attribute). I'm running the program on a core2Duo in 32 bits mode,
>> using GNU/Linux Ubuntu. Should I also make separate modules to prevent
>> whole program analysis from gcc?
>>
>> Anyway, I send you the code below, so you can test and review it.
>> Enjoy!
>>
>> [1]:
>> http://research.microsoft.com/Users/simonpj/Papers/spineless-tagless-gmachine.ps.gz
>>
>>
>>
>> /* This program tests the performance difference between trampoline
>>  * calls and tail calls.  (In the STG and related papers, trampoline
>>  * is referred to as "tiny interpreter")
>>  *
>>  * depending on the mode (tail call or trampoline), some code changes:
>>  * the return type of the code blocks, and the calls to other blocks
>>  * (direct tail calls or return to trampoline). Hence the macros ENTER
>>  * and RETURN.
>>  *
>>  * To compile it,type one of the two following commands:
>>  * gcc -O2 sibcall.c -D SIBCALL
>>  * gcc -O2 sibcall.c
>>  *
>>  * The first one will make the program use sibling calls (enabled at
>>  * O2), and the second one will fall back to trampoline calls
>>  */
>>
>> #ifdef SIBCALL // tail call mode
>>
>> #define ENTER(f) (f())
>> #define RETURN void
>> #define MAIN(start) start()
>>
>> #else // trampoline call mode
>>
>> #define ENTER(f) return f
>> #define RETURN void*
>> #define MAIN(start)  do{ void * (*f)(void) = start; \
>>                          while (1) \
>>                            f = (*f)(); \
>>                        } while (0)
>> #endif
>>
>> // Now, we can begin the generic code
>> #include <stdlib.h>
>> #define LIMIT 1000000000 // one billion
>>
>> int counter = 0;
>>
>> RETURN closure0(void);
>> RETURN closure1(void);
>>
>>
>> int main(int argc, char ** argv)
>> {
>>   MAIN(closure0);
>>   return 0;
>> }
>>
>> __attribute__ ((__noinline__))
>> RETURN closure0(void)
>> {
>>   if (counter >= LIMIT) exit(0);
>>   counter++;
>>   ENTER(closure1);
>> }
>> __attribute__ ((__noinline__))
>> RETURN closure1(void)
>> {
>>   if (counter >= LIMIT) exit(0);
>>   counter++;
>>   ENTER(closure0);
>> }
>> _______________________________________________
>> Haskell-Cafe mailing list
>> Haskell-Cafe at haskell.org
>> http://www.haskell.org/mailman/listinfo/haskell-cafe
>


More information about the Haskell-Cafe mailing list