[Haskell-cafe] redundant loads and saves in code generated for recursive functions?

Lemmih lemmih at gmail.com
Wed Jul 30 18:08:49 UTC 2014


Interestingly enough, if you run ‘opt’ on the llvm code before compiling it with ‘llc’, you get the optimised code you would expect.

On Thursday 31 July 2014 at 01:37, Jyotirmoy Bhattacharya wrote:

> I have filed a bug
> https://ghc.haskell.org/trac/ghc/ticket/9387
>  
>  
> On Wed, Jul 30, 2014 at 6:03 PM, Johan Tibell <johan.tibell at gmail.com (mailto:johan.tibell at gmail.com)> wrote:
> > I see. I think this is worthwhile to file a bug for. Please include
> > the cmm, asm, and llvm as attachedment (and perhaps a short Haskell
> > program that show the issue).
> >  
> > On Wed, Jul 30, 2014 at 2:10 PM, Jyotirmoy Bhattacharya
> > <jyotirmoy at jyotirmoy.net (mailto:jyotirmoy at jyotirmoy.net)> wrote:
> > > Hi,
> > >
> > > It doesn't seem the same to me.
> > >
> > > Unlike the bug you point to, the C-- does not have any extra stores. The
> > > stores and loads appear first in the LLVM. I am attaching the C--, LLVM and
> > > assembly codes for the function.
> > >
> > > The real missed opportunity seems to me the absence of a recognition that we
> > > are in fact making a tail call to ourselves. Recognizing that might allow
> > > jumping to some point after the initial stores.
> > >
> > > Jyotirmoy Bhattacharya
> > >
> > >
> > > On Wed, Jul 30, 2014 at 4:14 PM, Johan Tibell <johan.tibell at gmail.com (mailto:johan.tibell at gmail.com)>
> > > wrote:
> > >>
> > >> Hi Jyotirmoy,
> > >>
> > >> I didn't read your assembly carefully, but it sounds similar to
> > >> https://ghc.haskell.org/trac/ghc/ticket/8905, which is not fixed yet.
> > >>
> > >> On Wed, Jul 30, 2014 at 12:03 PM, Jyotirmoy Bhattacharya
> > >> <jyotirmoy at jyotirmoy.net (mailto:jyotirmoy at jyotirmoy.net)> wrote:
> > >> > On reading this again I realise that I got the order of loads and stores
> > >> > wrong. The arguments are being stored on entering the function and
> > >> > loaded
> > >> > before the call. But still, is there a chance of eliminating this
> > >> > redundancy?
> > >> >
> > >> > Jyotirmoy
> > >> >
> > >> >
> > >> > On Wed, Jul 30, 2014 at 1:54 PM, Jyotirmoy Bhattacharya
> > >> > <jyotirmoy at jyotirmoy.net (mailto:jyotirmoy at jyotirmoy.net)> wrote:
> > >> >>
> > >> >> Dear All,
> > >> >>
> > >> >> I am new to Haskell so please forgive me if I am asking about something
> > >> >> already well-understood.
> > >> >>
> > >> >> I was trying to understand the performance of my Haskell program
> > >> >> compiled
> > >> >> with the LLVM backend. I used -ddump-llvm to dump the LLVM assembly and
> > >> >> then
> > >> >> ran llc -O3 on the resulting file to look at the native assembly.
> > >> >>
> > >> >> One of the generated function starts off with
> > >> >> s5BH_info:                              # @s5BH_info
> > >> >> # BB#0:
> > >> >>         subq    $208, %rsp
> > >> >>         movq    %r13, 200(%rsp)
> > >> >>         movq    %rbp, 192(%rsp)
> > >> >>         movq    %r12, 184(%rsp)
> > >> >>         movq    %rbx, 176(%rsp)
> > >> >>         movq    %r14, 168(%rsp)
> > >> >>         movq    %rsi, 160(%rsp)
> > >> >>         movq    %rdi, 152(%rsp)
> > >> >>         movq    %r8, 144(%rsp)
> > >> >>         movq    %r9, 136(%rsp)
> > >> >>         movq    %r15, 128(%rsp)
> > >> >>         movss   %xmm1, 124(%rsp)
> > >> >>         movss   %xmm2, 120(%rsp)
> > >> >>         movss   %xmm3, 116(%rsp)
> > >> >>         movss   %xmm4, 112(%rsp)
> > >> >>         movsd   %xmm5, 104(%rsp)
> > >> >>         movsd   %xmm6, 96(%rsp)
> > >> >>
> > >> >> At some point down the line the function makes a tail call to itself
> > >> >> and
> > >> >> this is the code generated
> > >> >>         movq    %r14, 168(%rsp)
> > >> >>         movq    200(%rsp), %r13
> > >> >>         movq    192(%rsp), %rbp
> > >> >>         movq    184(%rsp), %r12
> > >> >>         movq    176(%rsp), %rbx
> > >> >>         movq    128(%rsp), %r15
> > >> >>         movsd   104(%rsp), %xmm5
> > >> >>         addq    $208, %rsp
> > >> >>         jmp     s5BH_info
> > >> >>
> > >> >> So it looks like some values are being moved from registers to the
> > >> >> stack
> > >> >> only to be immediately moved from the stack to the register on entry to
> > >> >> the
> > >> >> function. It should be possible to eliminate both the load and the
> > >> >> stores.
> > >> >>
> > >> >> Is this behaviour due to LLVM or GHC? If it is GHC, it this an
> > >> >> optimization a newcomer can attempt to implement or are there deep
> > >> >> issues
> > >> >> here?
> > >> >>
> > >> >> Jyotirmoy Bhattacharya
> > >> >>
> > >> >>
> > >> >
> > >> >
> > >> > _______________________________________________
> > >> > Haskell-Cafe mailing list
> > >> > Haskell-Cafe at haskell.org (mailto:Haskell-Cafe at haskell.org)
> > >> > http://www.haskell.org/mailman/listinfo/haskell-cafe
> > >> >
> > >
> > >
>  
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org (mailto:Haskell-Cafe at haskell.org)
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>  
>  


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/haskell-cafe/attachments/20140731/dc011c9e/attachment.html>


More information about the Haskell-Cafe mailing list