Loop optimisation with identical counters
dmp at rice.edu
Fri Nov 5 20:28:29 EDT 2010
On Nov 5, 2010, at 6:44 PM, Roman Leshchinskiy wrote:
> On 05/11/2010, at 23:22, David Peixotto wrote:
>> I spent some time looking at the code generated for llvm and the optimizations
>> it can apply. There were quite a bit of details to examine and I wrote it up
>> as blog post here:
> Nice! Thanks a lot for doing that!
My pleasure :)
>> To summarize, I found that it is possible to get LLVM to do this
>> transformation through a combination of tail-call elimination, inlining,
>> induction variable optimization, and global value numbering. This works fine
>> on x86_64 where we can pass parameters in registers, but fails to fully fire
>> in i386 back end because LLVM gets caught up by aliasing problems because
>> function parameters are passed on the stack. The possible aliasing between the
>> stack pointer (Sp) and the function argument pointer (R1) prevented the full
>> transformation, but it was still able to reduce the f loop to straight line code.
> Hmm... IIRC we agreed that Sp is never aliased in GHC-generated code and David Terei (I'm cc'ing here, not sure if he reads the list) made sure to include appropriate annotations in Haskell code. In fact, in your post Sp is passed as i32* noalias nocapture %Sp_Arg. Isn't that enough for LLVM to know that Sp isn't aliased?
Yes, the LLVM code has Sp, Hp, Base all annotated as noalias. I believe that Sp, Hp, and Base should never alias, but a (boxed) R1 should always alias with either Sp or Hp. I had a hard time determining exactly how LLVM uses the noalias annotation, but playing with opt -print-alias-sets I saw that Sp was a MayAlias with the pointers derived from R1. I would guess that casting an int to a pointer (like we do for R1) makes that pointer MayAlias with everything regardless of the noalias annotation.
>> 1. The ability of LLVM to optimize Haskell functions is limited by the calling
>> convention. Particularly for i386, function arguments are passed on a stack
>> that LLVM knows nothing about. The reads and writes to the stack look like
>> arbitrary loads and stores. It has no notion of popping elements from the
>> stack which makes it difficult to know when it is ok to eliminate stores to
>> the stack.
> But shouldn't it just promote stack locations to registers?
Yes, LLVM can and will promote the stack locations to registers, but since it doesn't know that Sp is really a stack, it is difficult for it to tell when it can avoid the writes back to the stack even though *we* know they will not be visible once the function call returns.
>> 2. The possible aliasing introduced by casting integer arguments
>> (R1-R6) to pointers limits the effectiveness of its optimizations.
> Yes, that's a big problem. David tried to solve some of it by including noalias annotations but it's not clear what to do with, say, newly allocated ByteArrays which can't be aliased by anything.
It may profitable to write our own alias analysis pass for LLVM that encodes our knowledge of what can alias in the GHC world view. It wouldn't be useful for other LLVM clients, but might be a good option for us.
> Anyway, it's great to know that there are things we can improve to make LLVM optimise better.
Yeah, I'm generally very impressed with what LLVM is able to do with the code from GHC. Any help we can give it will just make it that much better!
More information about the Glasgow-haskell-users