Removing/deprecating -fvia-c
Don Stewart
dons at galois.com
Tue Feb 16 12:51:06 EST 2010
marlowsd:
>
> I managed to improve this:
>
> Main_mainzuzdszdwfold_info:
> .Lc1lP:
> addq $32,%r12
> cmpq 144(%r13),%r12
> ja .Lc1lS
> movq %r14,%rax
> cmpq $1000000000,%rax
> jne .Lc1lV
> movq $ghczmprim_GHCziTypes_Dzh_con_info,-24(%r12)
> movsd %xmm6,-16(%r12)
> movq $ghczmprim_GHCziTypes_Dzh_con_info,-8(%r12)
> movsd %xmm5,(%r12)
> leaq -7(%r12),%rbx
> leaq -23(%r12),%r14
> jmp *(%rbp)
> .Lc1lS:
> movq $32,184(%r13)
> movl $Main_mainzuzdszdwfold_closure,%ebx
> addq $-24,%rbp
> movsd %xmm5,(%rbp)
> movsd %xmm6,8(%rbp)
> movq %r14,16(%rbp)
> jmp *-8(%r13)
> .Lc1lV:
> addsd .Ln1m2(%rip),%xmm5
> addsd .Ln1m3(%rip),%xmm6
> leaq 1(%rax),%r14
> addq $-32,%r12
> jmp Main_mainzuzdszdwfold_info
>
>
> from 9 instructions in the last block down to 5 (one instruction fewer
> than gcc). I haven't commoned up the two constant 1's though, that'd
> mean doing some CSE.
>
> On my machine with GHC HEAD and gcc 4.3.0, the gcc version runs in 2.0s,
> with the NCG at 2.3s. I put the difference down to a bit of instruction
> scheduling done by gcc, and that extra constant load.
>
> But let's face it, all of this code is crappy. It should be a tiny
> little loop rather than a tail-call with argument passing, and that's
> what we'll get with the new backend (eventually). LLVM probably won't
> turn it into a loop on its own, that needs to be done before the code
> gets passed to LLVM.

Agreed. Ideally the new backend would be (starting to be?) usable about
the time -fvia-C dies? Otherwise there's always going to be something
that gcc spots that the current codegen won't.

Then again, killing perl from the ghc toolchain, and having a
funeral/dancing on its grave, would be satisfying in itself :-)

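
For concreteness, the "tiny little loop" corresponds to source roughly
like the sketch below. This is a guess reconstructed from the quoted
assembly, not the program from the thread: the name countLoop, its
argument order, and the limit parameter are all made up for
illustration. It assumes an Int counter compared against a limit and two
Double accumulators that each get a constant 1 added per iteration.

```haskell
-- Hypothetical reconstruction; not the program discussed in the thread.
-- countLoop limit n x y steps the counter n up to limit, adding 1.0 to
-- each accumulator per step, and returns both accumulators.
-- Like the `jne` in the quoted assembly, it only stops when n == limit,
-- so it does not terminate if n starts above limit.
countLoop :: Int -> Int -> Double -> Double -> (Double, Double)
countLoop limit = go
  where
    go n x y
      | n == limit = (x, y)
      -- Force both accumulators so thunks do not pile up across iterations.
      | otherwise  = x `seq` y `seq` go (n + 1) (x + 1) (y + 1)
```

In GHCi, countLoop 10 0 0 0 evaluates to (10.0,10.0). The self-recursive
call in go is the tail call that the quoted code implements as
"jmp Main_mainzuzdszdwfold_info" with the arguments passed in registers.
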
> Have you looked at this example on x86? It's *far* worse and runs about
> 5 times slower.
x86 scares me.. :)