simd branch ready for review
mainland at apeiron.net
Thu Jan 31 12:38:55 CET 2013
I've pushed my simd branch to darcs.haskell.org. Everything has been
rebased against HEAD. Simon PJ and I looked over the changes together
already, but I wanted to give you (and everyone on ghc-devs) the
opportunity to look things over before I merge to HEAD. Simon PJ and I
came up with a few questions/notes for you, but hopefully nothing that
should delay a merge.
* Win32 issues
Modern 32-bit x86 *NIX systems align the stack to 16 bytes, but Win32
aligns only to 4 bytes. LLVM does not assume 16-byte stack
alignment. Instead, on platforms where 16-byte stack alignment is not
guaranteed, it 1) always outputs a function prologue that 2) aligns
the stack to a 16-byte boundary with an "and" instruction, and it
also 3) disables tail calls. Because LLVM aligns the stack for a
function that has SSE register spills, it also generates movaps
instructions (aligned SSE moves) for the spills.
This makes SSE support on Win32 difficult, and in my opinion not
worth worrying about.
The alternative is to 1) patch LLVM to disable the stack-alignment
code so that we recover the ability to use tail calls and so that ebp
is not scribbled over by the prologue and 2) patch the mangler to rewrite
LLVM's movaps (move aligned) instructions to movups (move unaligned)
instructions. I have these patches, but they are not included in the
* How hard would it be to dump ArgRep for PrimRep? It looks
straightforward. Is it worth doing?
* How hard would it be to track bit width in PrimRep? I recall chatting
with you once about adding explicit support for, e.g., 8- and 16-bit
Word/Int primops instead of relying on narrowing. Since SIMD vectors
need to know the exact bit-width of their elements, I've had to create
a PrimElemRep data type in compiler/types/TyCon.lhs, but I'd really
like to be able to re-use PrimRep instead.
* If we replaced all old-style C-- code, could we get rid of the
explicit STG registers completely? Simon PJ suggested that we use real
machine registers directly, so, for example, GlobalReg's constructors
would have FastString fields instead of Int fields.
* Could we add a CmmType field to GlobalReg's constructors? You'll see
that I added a new XmmReg constructor to GlobalReg, but because I
don't know the type of an XmmReg, I have to bitcast everywhere in the
generated LLVM code because LLVM wants to know not just that a value
is a 16-byte vector, but that it is, e.g., a 16-byte vector containing
2 64-bit doubles. Having a CmmType attached to a GlobalReg---or
pairing a GlobalReg with a CmmType when assigning registers---would
let me avoid all these casts.
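
To make the last point concrete, here is a minimal sketch of the idea. It
is not the code in the branch: ElemType, the two-constructor CmmType, the
cut-down GlobalReg, llvmType and needsBitcast are all simplified,
hypothetical stand-ins for the real definitions. The assumption is that
an XmmReg carries the CmmType of the value it holds, so the LLVM backend
only needs a bitcast when the type it wants differs from the type the
register already has.

```haskell
-- Simplified, hypothetical stand-ins; GHC's real CmmType and GlobalReg differ.
data ElemType = I32 | I64 | F32 | F64
  deriving (Eq, Show)

data CmmType
  = Scalar ElemType      -- a single value
  | Vec Int ElemType     -- lane count and element type
  deriving (Eq, Show)

data GlobalReg
  = VanillaReg Int       -- untyped general-purpose register (simplified)
  | XmmReg Int CmmType   -- hypothetical: register number plus the type it holds
  deriving (Eq, Show)

-- LLVM spelling of an element type.
elemName :: ElemType -> String
elemName I32 = "i32"
elemName I64 = "i64"
elemName F32 = "float"
elemName F64 = "double"

-- LLVM spelling of a (simplified) CmmType, e.g. "<2 x double>".
llvmType :: CmmType -> String
llvmType (Scalar e) = elemName e
llvmType (Vec n e)  = "<" ++ show n ++ " x " ++ elemName e ++ ">"

-- A bitcast is needed only when the type recorded on the register
-- differs from the type the use site wants.
needsBitcast :: GlobalReg -> CmmType -> Bool
needsBitcast (XmmReg _ have) want = have /= want
needsBitcast (VanillaReg _)  _    = False
```

With this shape, reading XmmReg 1 (Vec 2 F64) as Vec 2 F64 needs no cast,
whereas an untyped XmmReg would force the backend to pick some default
vector type and cast at every use.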