Performance on amd64

Tue Jul 5 12:08:47 EDT 2005

On 05 July 2005 16:25, John Skaller wrote:

> On Tue, 2005-07-05 at 12:39 +0100, Simon Marlow wrote:
>> On 05 July 2005 10:38, John Skaller wrote:
>> 
>>>         Can someone comment on the Debian package for Ubuntu Hoary
>>>         providing ghc-6.2.2 with binary for amd64?
>> 
>> You're probably running an unregisterised build, which is going to be
>> generating code at least a factor of 2 slower than a registerised
>> build. You can get an up to date snapshot of 6.4.1 for Linux/x86_64
>> here: 
>> 
>> http://www.haskell.org/ghc/dist/stable/dist/ghc-6.4.1.20050704-x86_64-unknown-linux.tar.bz2
> 
> 
> Thanks, downloading it now.. will try. What exactly is
> a 'registered' build?

An "unregisterised" build generates plain C which is compiled with a C
compiler.  The term "registerised" refers to a set of optimisations
which require post-processing the assembly generated by the C compiler
using a Perl script (affectionately known as the Evil Mangler).  In
particular, registerised code does real tail-calls and uses real machine
registers to store the Haskell stack and heap pointers.  An
"unregisterised" build is usually the first step when porting GHC to a
new architecture, before support for the "registerised" optimisations is
added.

GHC's native code generators also generate "registerised" code.
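
To illustrate the tail-call point: when real tail calls are not
available, a common workaround is for each piece of code to return
"what to run next" to a small driver loop instead of jumping there
directly.  The following is only a rough Python sketch of that
driver-loop (trampoline) pattern, not GHC's generated code; the names
trampoline and count_down are made up for this example.

```python
def trampoline(step, *args):
    """Driver loop: call `step`, then keep calling whatever it returns
    until the result is no longer callable."""
    result = step(*args)
    while callable(result):
        result = result()
    return result


def count_down(n, acc=0):
    """Sum of 1..n in tail-recursive form, written for the trampoline."""
    if n == 0:
        return acc
    # Rather than calling ourselves directly (which would grow the
    # stack by one frame per iteration), hand the next step back to
    # the driver loop as a zero-argument function.
    return lambda: count_down(n - 1, acc + n)


# 100000 iterations run in constant stack depth, at the cost of one
# extra return-and-call through the driver loop per step.
total = trampoline(count_down, 100000)
```

With a real tail call the jump to the next step happens directly, so
the round trip through the driver loop disappears.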

>> This build is registerised, but doesn't have the native code
>> generator. 
> 
> Which would generate the best code?

-fvia-C has traditionally produced slightly better code than -fasm, at
least on x86.  On other platforms it might be the other way around.

>> I hope you're not going to conclude *anything* based on the
>> performance of ackermann and tak! :-)
> 
> Ackermann is a good test of optimisation of a recursive
> function, which primarily requires the smallest possible
> stack frame. Of course it is only one function, more need
> to be tested.
>
> In fact, this one test has been very good helping me get
> the Felix optimiser to work well -- the raw code generated
> without optimisation creates a heap closure for every function,
> including ones synthesised for conditionals and matches, etc.
> If I remember rightly, it took 2 hours to calculate Ack(3,6),
> and I needed a friend to use a big PPC to get the result
> in two hours for Ack(3,7).
> 
> So ... you could say the Felix optimiser has improved a bit... :)

Sure, it's good to look at these small benchmarks to improve aspects of
our compilers, but we should never claim that results on microbenchmarks
are in any way an indicator of performance on programs that people
actually write.
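
For anyone who has not seen the benchmark under discussion, the
Ackermann function can be sketched as below.  This is a plain Python
illustration of the usual two-argument definition, not the Shootout,
Felix or Haskell source; the name ack is just a convenient label.

```python
def ack(m, n):
    """Two-argument Ackermann function (Ackermann-Peter form)."""
    if m == 0:
        return n + 1
    if n == 0:
        # Tail call: nothing is left to do after it returns.
        return ack(m - 1, 1)
    # The inner call is NOT a tail call: its frame must stay alive
    # until the result is known, so recursion depth grows quickly.
    return ack(m - 1, ack(m, n - 1))


small = ack(2, 3)
```

Nearly all of the work is call and return overhead, which is why the
function stresses stack-frame size and says little about anything
else.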

> If you would like to suggest other tests I'd be quite interested.
> At the moment I'm using code from the Alioth Shootout,
> simply because I can -- saves writing things in languages
> I don't know (which includes Haskell unfortunately).

The shootout has lots of good benchmarks, for sure.  Don't restrict
yourself to the small programs, though.

It's still hard to get a big picture from the results - there are too
many variables. I believe many of the Haskell programs in the suite can
go several times faster with the right tweaks, and using the right
libraries (such as a decent PackedString library).

Cheers,
	Simon