jhc vs ghc and the surprising result involving ghc generated assembly.
Simon Marlow
simonmar at microsoft.com
Wed Nov 2 05:42:50 EST 2005
On 01 November 2005 16:32, Florian Weimer wrote:
> * Simon Marlow:
>
>> gcc started generating this rubbish around version 3.4, if I recall
>> correctly. I've tried disabling various optimisations, but can't
>> seem to convince gcc not to generate the extra jump. You don't get
>> this from the native code generator, BTW.
>
> But the comparison is present in the C code. What do you want GCC to
> do?
I didn't mean to sound overly critical of gcc. But here's what I was
complaining about - the code generated by gcc (3.4.x) is as follows:
Main_zdwfac_info:
        .text
        .align 8
        .text
        movq    (%rbp), %rdx
        cmpq    $1, %rdx
        jne     .L2
        movq    8(%rbp), %r13
        leaq    16(%rbp), %rbp
        movq    (%rbp), %rax
.L4:
        jmp     *%rax
.L2:
        movq    %rdx, %rax
        imulq   8(%rbp), %rax
        movq    %rax, 8(%rbp)
        leaq    -1(%rdx), %rax
        movq    %rax, (%rbp)
        movl    $Main_zdwfac_info, %eax
        jmp     .L4
There's an obvious simplification: the last two instructions should be
replaced by just
        jmp     Main_zdwfac_info
eliminating one branch and a mov. This occurs all over the place in our
code. Whenever a function has more than one computed goto, gcc insists
on commoning up the jmp instructions even when it's a really bad idea,
like above.
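To make the pattern concrete, here is a rough C sketch of a code block
shaped like the fac worker above. It is not GHC's actual output: the
names (Word, sp, result, fac_entry, finish, run_fac) are made up for
illustration, and because portable C cannot jump to a function, each
block returns the address of its successor to a small driver loop
instead of using a computed goto. The two return statements correspond
to the two jumps in the listing: one target is only known at run time,
the other is fixed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types: a code block returns the address of the next block. */
typedef void (*Generic)(void);      /* opaque code address */
typedef Generic (*Code)(void);      /* a block that yields its successor */

/* One stack slot holds either an integer or a return address. */
typedef union { int64_t i; Generic code; } Word;

static Word *sp;          /* stands in for %rbp in the listing */
static int64_t result;    /* stands in for %r13 in the listing */

/* Final continuation: returning NULL stops the driver loop. */
static Generic finish(void) { return NULL; }

/* Expects sp[0] = n (n >= 1), sp[1] = accumulator, sp[2] = return address. */
static Generic fac_entry(void)
{
    int64_t n = sp[0].i;
    if (n == 1) {
        result = sp[1].i;           /* movq 8(%rbp), %r13 */
        sp += 2;                    /* pop the two argument words */
        return sp[0].code;          /* target unknown until run time */
    }
    sp[1].i = n * sp[1].i;
    sp[0].i = n - 1;
    return (Generic)fac_entry;      /* target is a compile-time constant */
}

/* Driver loop: keeps calling whatever block the previous one returned. */
static int64_t run_fac(int64_t n)
{
    Word stack[3];
    stack[0].i = n;
    stack[1].i = 1;
    stack[2].code = (Generic)finish;
    sp = stack;

    Code next = fac_entry;
    while (next != NULL)
        next = (Code)next();
    return result;
}
```

Calling run_fac(5) enters fac_entry five times and yields 120. Only the
first return genuinely needs an indirect jump; the second always goes
to the same place, which is why a direct jmp would do in the listing.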
Actually, if I add -O2 then I get better code, so perhaps this isn't a
real problem, although gcc still generates this:
Fac_zdwfac_info:
        .text
        .align 8
        movq    (%rbp), %rdx
        testq   %rdx, %rdx
        jne     .L3
        movq    8(%rbp), %r13
        addq    $16, %rbp
        movq    (%rbp), %rax
        jmp     *%rax
        .p2align 4,,7
.L3:
        movq    8(%rbp), %rax
        imulq   %rdx, %rax
        decq    %rdx
        movq    %rdx, (%rbp)
        movq    %rax, 8(%rbp)
        movl    $Fac_zdwfac_info, %eax
        jmp     *%rax
and fails to combine the final movl with the jmp instruction (we do this
simplification ourselves when post-processing the assembly code). I'll
compile up gcc 4 and see what happens with that.
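The mov/jmp fusion mentioned above can be sketched as a two-line
peephole rule. This is only an illustration in C, not the
post-processor GHC actually uses; fuse_mov_jmp is a hypothetical
helper, it matches only the exact spelling of the two instructions
shown in the listing, and it assumes that nothing after the jump reads
%rax.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical peephole rule. If `first` is "movl $SYM, %eax" and
 * `second` is "jmp *%rax", write "jmp SYM" into `out` and return 1.
 * Otherwise leave `out` untouched and return 0.
 */
static int fuse_mov_jmp(const char *first, const char *second,
                        char *out, size_t out_size)
{
    char sym[128];
    int end = -1;

    /* %n is only stored if everything before it matched. */
    if (sscanf(first, " movl $%127[^,], %%eax%n", sym, &end) != 1
        || end < 0 || first[end] != '\0')
        return 0;

    /* Reject numeric immediates such as "$1": those are not labels. */
    if (sym[0] == '-' || (sym[0] >= '0' && sym[0] <= '9'))
        return 0;

    end = -1;
    sscanf(second, " jmp *%%rax%n", &end);
    if (end < 0 || second[end] != '\0')
        return 0;

    snprintf(out, out_size, "jmp %s", sym);
    return 1;
}
```

Given the last two lines of the -O2 listing, the helper produces
"jmp Fac_zdwfac_info"; any other pair of lines is left alone.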
Cheers,
Simon
More information about the Glasgow-haskell-users
mailing list