simd branch ready for review

Geoffrey Mainland mainland at
Thu Jan 31 12:38:55 CET 2013

Hi Simon,

I've pushed my simd branch to Everything has been
rebased against HEAD. Simon PJ and I looked over the changes together
already, but I wanted to give you (and everyone on ghc-devs) the
opportunity to look things over before I merge to HEAD. Simon PJ and I
came up with a few questions/notes for you, but hopefully nothing that
should delay a merge.

* Win32 issues

  Modern 32-bit x86 *NIX systems align the stack to 16 bytes, but Win32
  aligns only to 4 bytes. LLVM does not assume 16-byte stack
  alignment. Instead, on platforms where 16-byte stack alignment is not
  guaranteed, it 1) always outputs a function prologue that 2) aligns
  the stack to a 16-byte boundary with an "and" instruction, and it
  also 3) disables tail calls. Because LLVM aligns the stack for a
  function that has SSE register spills, it also generates movaps
  instructions (aligned SSE moves) for the spills.

  This makes SSE support on Win32 difficult, and in my opinion not
  worth worrying about.

  The alternative is to 1) patch LLVM to disable the stack-alignment
  code so that we recover the ability to use tail calls and so that ebp
  is not scribbled over by the prologue and 2) patch the mangler to
  rewrite LLVM's movaps (move aligned) instructions to movups (move
  unaligned) instructions. I have these patches, but they are not
  included in the simd branch.
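
To make the second step concrete, here is a minimal sketch of a
movaps-to-movups rewrite over textual assembly. It is not the patch
mentioned above and not the mangler's actual code; the names rewriteLine
and rewriteMovaps are made up for illustration, and it assumes AT&T-syntax
output with one instruction per line.

```haskell
import Data.Char (isSpace)
import Data.List (stripPrefix)

-- | Rewrite one line of assembly: if its mnemonic is exactly "movaps",
-- replace it with "movups"; leave every other line untouched.
-- (Hypothetical helper, for illustration only.)
rewriteLine :: String -> String
rewriteLine line =
  let (indent, rest) = span isSpace line
  in case stripPrefix "movaps" rest of
       Just operands
         | startsOperand operands -> indent ++ "movups" ++ operands
       _ -> line
  where
    -- Require whitespace (or end of line) after the mnemonic so that a
    -- label or symbol that merely begins with "movaps" is not rewritten.
    startsOperand []      = True
    startsOperand (c : _) = isSpace c

-- | Apply the rewrite to a whole assembly file held as a String.
rewriteMovaps :: String -> String
rewriteMovaps = unlines . map rewriteLine . lines
```

For example, rewriteLine "\tmovaps\t%xmm0, 16(%esp)" yields
"\tmovups\t%xmm0, 16(%esp)", while a line such as "\tmovapd\t..." passes
through unchanged. A real pass would presumably work on ByteStrings and
only touch stack spills; this sketch rewrites every movaps it sees.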

* How hard would it be to dump ArgRep for PrimRep? It looks
  straightforward. Is it worth doing?

* How hard would it be to track bit width in PrimRep? I recall chatting
  with you once about adding explicit support for, e.g., 8- and 16-bit
  Word/Int primops instead of relying on narrowing. Since SIMD vectors
  need to know the exact bit-width of their elements, I've had to create
  a PrimElemRep data type in compiler/types/TyCon.lhs, but I'd really
  like to be able to re-use PrimRep instead.
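
As a rough illustration of why element width matters, the sketch below
models an element representation that carries its exact bit width, plus a
vector representation built from it. The types ElemRep and VecRep are
simplified stand-ins invented for this example; they are not the
definitions of PrimElemRep or PrimRep in the branch.

```haskell
-- Hypothetical, simplified element representation: each constructor
-- fixes both the kind of element and its exact width.
data ElemRep
  = Int8Elem | Int16Elem | Int32Elem | Int64Elem
  | Word8Elem | Word16Elem | Word32Elem | Word64Elem
  | FloatElem | DoubleElem
  deriving (Eq, Show)

-- | Exact width of one element, in bits.
elemBits :: ElemRep -> Int
elemBits e = case e of
  Int8Elem   -> 8
  Int16Elem  -> 16
  Int32Elem  -> 32
  Int64Elem  -> 64
  Word8Elem  -> 8
  Word16Elem -> 16
  Word32Elem -> 32
  Word64Elem -> 64
  FloatElem  -> 32
  DoubleElem -> 64

-- | Hypothetical vector representation: lane count and element rep.
data VecRep = VecRep Int ElemRep
  deriving (Eq, Show)

-- | Total width of the vector in bits, derived from the element width.
vecBits :: VecRep -> Int
vecBits (VecRep lanes e) = lanes * elemBits e
```

With these definitions, both VecRep 2 DoubleElem and VecRep 4 FloatElem
come out at 128 bits, which is only computable because the element width
is explicit. If PrimRep tracked bit width itself, a separate element type
like this would not be needed.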

* If we replaced all old-style C-- code, could we get rid of the
  explicit STG registers completely? Simon PJ suggested that we use real
  machine registers directly, so, for example, GlobalReg's constructors
  would have FastString fields instead of Int fields.

* Could we add a CmmType field to GlobalReg's constructors? You'll see
  that I added a new XmmReg constructor to GlobalReg, but because I
  don't know the type of an XmmReg, I have to bitcast everywhere in the
  generated LLVM code because LLVM wants to know not just that a value
  is a 16-byte vector, but that it is, e.g., a 16-byte vector containing
  2 64-bit doubles. Having a CmmType attached to a GlobalReg---or
  pairing a GlobalReg with a CmmType when assigning registers---would
  let me avoid all these casts.
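
A small sketch of the pairing idea, using invented stand-in types rather
than the real GlobalReg and CmmType: once a register carries a type, the
LLVM vector type can be printed directly, and a bitcast is needed only
when the requested type differs from the recorded one.

```haskell
-- Hypothetical stand-ins for the scalar and vector parts of CmmType.
data ScalarTy = F32 | F64 | I32 | I64
  deriving (Eq, Show)

data Ty = VecTy Int ScalarTy   -- lane count and element type
  deriving (Eq, Show)

-- Hypothetical untyped register, mirroring an XmmReg with no type.
newtype Reg = XmmReg Int
  deriving (Eq, Show)

-- | A register paired with the type of the value it currently holds.
data TypedReg = TypedReg Reg Ty
  deriving (Eq, Show)

-- | Render a vector type in LLVM IR syntax, e.g. "<2 x double>".
llvmType :: Ty -> String
llvmType (VecTy lanes s) = "<" ++ show lanes ++ " x " ++ scalar s ++ ">"
  where
    scalar F32 = "float"
    scalar F64 = "double"
    scalar I32 = "i32"
    scalar I64 = "i64"

-- | A bitcast is needed only when the wanted type differs from the
-- type recorded for the register.
needsBitcast :: TypedReg -> Ty -> Bool
needsBitcast (TypedReg _ have) want = have /= want
```

Here llvmType (VecTy 2 F64) gives "<2 x double>", and
needsBitcast (TypedReg (XmmReg 1) (VecTy 2 F64)) (VecTy 2 F64) is False.
With an untyped XmmReg there is no recorded type to compare against, so
every use has to be cast.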

