simd branch ready for review

Geoffrey Mainland mainland at
Thu Jan 31 18:52:18 CET 2013

On 01/31/2013 12:56 PM, Simon Marlow wrote:
> On 31/01/13 11:38, Geoffrey Mainland wrote:
>> I've pushed my simd branch to Everything has been
>> rebased against HEAD. Simon PJ and I looked over the changes together
>> already, but I wanted to give you (and everyone on ghc-devs) the
>> opportunity to look things over before I merge to HEAD. Simon PJ and I
>> came up with a few questions/notes for you, but hopefully nothing that
>> should delay a merge.
> I'm happy for these to go in - we've already discussed the design a
> few times, and you've incorporated changes we agreed before, so as far
> as I'm concerned it's all good. Go for it!


>> * Win32 issues
>> Modern 32-bit x86 *NIX systems align the stack to 16 bytes, but Win32
>> aligns only to 4 bytes. LLVM does not assume 16-byte stack
>> alignment. Instead, on platforms where 16-byte stack alignment is not
>> guaranteed, it 1) always outputs a function prologue that 2) aligns
>> the stack to a 16-byte boundary with an "and" instruction, and it
>> also 3) disables tail calls. Because LLVM aligns the stack for a
>> function that has SSE register spills, it also generates movaps
>> instructions (aligned SSE moves) for the spills.
> I must be misunderstanding your use of "always" above, because that
> would imply that the LLVM backend doesn't work on Win32 at all. Maybe
> LLVM only aligns the stack when it needs to store SSE values?

You are correct---the stack-aligning prologue is only added by LLVM when
SSE values are written to the stack, so this wasn't a problem before we
had SSE support.
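
For illustration, the effect of the aligning "and" can be sketched in
Haskell as plain bit arithmetic. This is only a model of the address
computation, not code from LLVM or GHC; the names alignDown16 and
isAligned16 are hypothetical, and a 32-bit stack pointer is assumed.

```haskell
import Data.Bits (complement, (.&.))
import Data.Word (Word32)

-- | Round an address down to the nearest 16-byte boundary by clearing
-- its low four bits. This models what an "and" with -16 does to the
-- stack pointer (hypothetical helper, not part of GHC or LLVM).
alignDown16 :: Word32 -> Word32
alignDown16 sp = sp .&. complement 15

-- | True when an address is already 16-byte aligned.
isAligned16 :: Word32 -> Bool
isAligned16 sp = sp .&. 15 == 0
```

A stack pointer that is only 4-byte aligned, as Win32 guarantees, can
fail isAligned16, which is why an aligned move such as movaps is only
safe after the prologue has adjusted the stack.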

>> This makes SSE support on Win32 difficult, and in my opinion not
>> worth worrying about.
>> The alternative is to 1) patch LLVM to disable the stack-alignment
>> code so that we recover the ability to use tail calls and so that ebp
>> is not scribbled over by the prologue and 2) patch the mangler to
>> rewrite LLVM's movaps (move aligned) instructions to movups (move
>> unaligned) instructions. I have these patches, but they are not
>> included in the simd branch.
> I don't have an opinion here - maybe ask David T what he'd prefer.

Requiring an LLVM hack seems pretty bad, and David yelled when I changed
the mangler since he wants to get rid of it eventually. My patches are
still around, so if we decide Win32 support is important, I can always
add the changes.
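
As a rough sketch of what the movaps-to-movups rewrite amounts to, the
following Haskell operates on assembly text line by line. It is not the
actual mangler patch; rewriteLine and rewriteAsm are hypothetical names,
and it assumes the mnemonic is the first token on a line.

```haskell
import Data.List (isPrefixOf)

-- | Rewrite one line of assembly: if its first token is exactly
-- "movaps", replace it with "movups"; leave every other line untouched.
-- (Hypothetical sketch; assumes the mnemonic starts the line.)
rewriteLine :: String -> String
rewriteLine line =
  let (indent, rest) = span (`elem` " \t") line
  in if "movaps" `isPrefixOf` rest && boundary (drop 6 rest)
       then indent ++ "movups" ++ drop 6 rest
       else line
  where
    -- The mnemonic must end at whitespace or end of line, so that
    -- longer identifiers starting with "movaps" are not rewritten.
    boundary []    = True
    boundary (c:_) = c `elem` " \t"

-- | Apply the rewrite to every line of an assembly file.
rewriteAsm :: String -> String
rewriteAsm = unlines . map rewriteLine . lines
```

The rewrite is purely textual, which is part of why it counts as a hack:
it trades the faster aligned move for one that tolerates a 4-byte-aligned
stack, without the compiler knowing about the change.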

>> * Could we add a CmmType field to GlobalReg's constructors? You'll see
>> that I added a new XmmReg constructor to GlobalReg, but because I
>> don't know the type of an XmmReg, I have to bitcast everywhere in the
>> generated LLVM code because LLVM wants to know not just that a value
>> is a 16-byte vector, but that it is, e.g., a 16-byte vector containing
>> 2 64-bit doubles. Having a CmmType attached to a GlobalReg---or
>> pairing a GlobalReg with a CmmType when assigning registers---would
>> let me avoid all these casts.
> We already have a function
> globalRegType :: DynFlags -> GlobalReg -> CmmType
> so I see that you're guessing in the case of XmmReg. Why not just add
> the necessary information to XmmReg so that you don't have to guess in
> globalRegType?

There doesn't seem to be a clear best choice for this extra info. A
CmmType seems reasonable, and if I'm adding a CmmType to XmmReg, why not
add it everywhere and simplify globalRegType? I'll go ahead and stick
with what I have now.
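
To make the suggestion concrete, here is a minimal sketch of an XmmReg
that carries its own type. The data types below are simplified stand-ins
and not GHC's actual definitions; ElemType and the Vec constructor are
assumptions made for the example, and the DynFlags argument of the real
globalRegType is omitted.

```haskell
-- Simplified stand-ins; these are NOT GHC's actual definitions.
data ElemType = F32 | F64 | I32 | I64
  deriving (Eq, Show)

data CmmType
  = Scalar ElemType      -- a single scalar value
  | Vec Int ElemType     -- a vector of n elements
  deriving (Eq, Show)

data GlobalReg
  = VanillaReg Int
  | DoubleReg Int
  | XmmReg Int CmmType   -- the register now records what it holds
  deriving (Eq, Show)

-- | With the type stored in XmmReg there is nothing left to guess.
globalRegType :: GlobalReg -> CmmType
globalRegType (VanillaReg _) = Scalar I64   -- assumes a 64-bit target
globalRegType (DoubleReg _)  = Scalar F64
globalRegType (XmmReg _ ty)  = ty
```

With this shape, XmmReg 1 (Vec 2 F64) and XmmReg 1 (Vec 4 F32) name the
same physical register at different types, so a backend could emit the
right vector type directly instead of bitcasting.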

Thanks for all your answers.

