simd branch ready for review/merge

Simon Peyton-Jones simonpj at
Fri Sep 20 11:24:06 UTC 2013


I'm too far from this stuff to give it a meaningful review, at least not without sitting beside you.  So I suggest you just merge it!  Simon Marlow may want to look.

The wiki page describes the design, and I think it's up to date with your patches (correct?). Thanks for doing that!

From our previous discussion, the bit I hate is this:

1 there are so many distinct data types (Int16x4, Int32x2, etc)

2 primops.txt.pp therefore has to grow a macro-like mechanism
  to ameliorate the burden of writing out all the zillions of 
  types and primops

Concerning (2), the obvious rejoinder is: well, primops.txt.pp is really a program written in a domain specific language -- and that language is getting more elaborate.  Solution: stop building a new language, and instead make primops.txt.pp into an embedded DSL in Haskell; just a Haskell program that we run to generate the various outputs.  Then all the mechanisms you had to add will be trivial.  
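
To make the embedded-DSL suggestion concrete, here is a minimal sketch, not the actual genprimopcode design. Every name in it (Elem, VecTy, PrimOp, vecTys, render) is hypothetical, and the output format is invented for illustration rather than being the real primops.txt.pp syntax. The point is only that once a primop is an ordinary Haskell value, instantiating it at many vector types is a list comprehension.

```haskell
import Data.List (intercalate)

-- Hypothetical element kinds; a real version would also cover Float and Double.
data Elem = IntE | WordE deriving (Eq, Show)

-- A vector type: element kind, element width in bits, number of lanes.
data VecTy = VecTy Elem Int Int

-- Render a type name in the style used in this thread, e.g. Int16x4.
vecName :: VecTy -> String
vecName (VecTy e bits n) = prefix e ++ show bits ++ "x" ++ show n
  where
    prefix IntE  = "Int"
    prefix WordE = "Word"

-- The vector types to instantiate at (an illustrative subset).
vecTys :: [VecTy]
vecTys = [ VecTy e bits n | e <- [IntE, WordE], (bits, n) <- [(16, 4), (32, 2)] ]

-- A primop description: name, argument types, result type.
data PrimOp = PrimOp String [String] String

-- A binary operation written once, parameterised over the vector type.
binOp :: String -> VecTy -> PrimOp
binOp base t = PrimOp (base ++ vecName t) [vecName t, vecName t] (vecName t)

plusOp, minusOp :: VecTy -> PrimOp
plusOp  = binOp "plus"
minusOp = binOp "minus"

-- Invented output format, one line per primop.
render :: PrimOp -> String
render (PrimOp name args res) =
  "primop " ++ name ++ " :: " ++ intercalate " -> " (args ++ [res])

-- Every operation at every vector type.
entries :: [PrimOp]
entries = [ op t | t <- vecTys, op <- [plusOp, minusOp] ]

main :: IO ()
main = mapM_ (putStrLn . render) entries
```

Adding an operation or a vector type is then a one-line change to a list, with no extension to the generator's input language.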

Concerning (1) what we want is a way to make types Int<16,4> where the parameters 16 and 4 are forced to be static literals, and where you absolutely do not get polymorphism like  f :: Int<a><b> -> blah.  There is some Trac discussion about this.  
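
For comparison, here is roughly how far type-level literals get us today. This is a sketch under assumptions: IntVec is a hypothetical stand-in that uses a list of lanes instead of a real unboxed vector, and it ignores element overflow. It gives one parameterised type in place of many distinct ones, but it does not forbid the polymorphism described above; laneCount below typechecks, and that is exactly the part we would want to rule out.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE ScopedTypeVariables #-}

import Data.Proxy (Proxy (..))
import GHC.TypeLits (KnownNat, Nat, natVal)

-- Hypothetical stand-in for Int<bits,lanes>. Both parameters are phantom;
-- the list merely models the lanes and does not wrap at the element width.
newtype IntVec (bits :: Nat) (lanes :: Nat) = IntVec [Integer]
  deriving (Eq, Show)

-- The literal instantiations recover the familiar names.
type Int16x4 = IntVec 16 4
type Int32x2 = IntVec 32 2

-- A monomorphic operation: both parameters are static literals.
plusInt16x4 :: Int16x4 -> Int16x4 -> Int16x4
plusInt16x4 (IntVec xs) (IntVec ys) = IntVec (zipWith (+) xs ys)

-- The polymorphic form the proposal wants to reject. Ordinary GHC accepts it.
laneCount :: forall bits lanes. KnownNat lanes => IntVec bits lanes -> Integer
laneCount _ = natVal (Proxy :: Proxy lanes)

main :: IO ()
main = do
  print (plusInt16x4 (IntVec [1, 2, 3, 4]) (IntVec [10, 20, 30, 40]))
  print (laneCount (IntVec [0, 0] :: Int32x2))
```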

It can't be that hard.  I'm copying some FC friends!


| -----Original Message-----
| From: Geoffrey Mainland [mailto:mainland at]
| Sent: 16 September 2013 20:17
| To: Simon Peyton-Jones; Simon Marlow; Austin Seipp; ghc-devs at
| Subject: simd branch ready for review/merge
| The SIMD branch, available as wip/simd, is ready for review/merge. It
| could use some review---Simon and Simon, I'd be especially grateful if
| you both had a quick look. Some major points:
| 1) I have added support for AVX 512, although this is necessarily
| untested. AVX and AVX2 are also both supported.
| 2) After the recent churn regarding patching LLVM's GHC calling
| convention, by default only 128-bit wide SIMD vectors are passed in
| registers, and then only on X86_64. There is a "hidden" flag,
| -fllvm-pass-vectors-in-regs, that causes GHC to generate LLVM code that
| assumes all vectors are passed in registers by LLVM. This can be used
| with a suitably patched version of LLVM, and if we get LLVM 3.4 patched,
| we can consider turning it on by default for LLVM 3.4+. This would mean
| that we couldn't mix LLVM <=3.3-compiled object files with LLVM
| >=3.4-compiled object files, but I don't see that as much of a problem.
| 3) utils/genprimopcode has been hacked up to allow us to write vector
| operations once and have them instantiated at multiple vector types. I'm
| not thrilled with this solution, but after discussing with Simon PJ,
| what I've implemented seems to be the minimal reasonable solution to the
| problem of exploding primop boilerplate. The changes are documented in
| compiler/prelude/primops.txt.pp.
| 4) Error handling is sub-optimal. My patch checks to make sure that
| vector primops can be compiled efficiently based on the current set of
| dynamic flags. For example, if -mavx is not specified and the user tries
| to use a primop that adds together two 256-bit wide vectors of
| double-precision elements, the user will see an error message like:
| ghc-stage2: sorry! (unimplemented feature or known bug)
|   (GHC version 7.7.20130916 for x86_64-unknown-linux):
|     256-bit wide floating point SIMD vector instructions require at least -mavx.
| This is because the only good place to check for this kind of error is
| during STG->Cmm translation (in compiler/codeGen/StgCmmPrim.hs), and we
| don't have much of an error handling infrastructure there in contrast to
| when we're working in the typechecking/renaming monad. If there is a
| better way/place to do this, please let me know.
| Thanks,
| Geoff
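
The flag check described in point 4 of the quoted message can be sketched as a pure function that returns an error value instead of panicking, so the caller decides how to report it. This is an illustration under assumptions, not the code in compiler/codeGen/StgCmmPrim.hs: the Flags record, ElemKind, and checkVec are hypothetical names, and only the -mavx message is taken from the message above; the other two messages are invented wording.

```haskell
-- Hypothetical view of the relevant dynamic flags.
data Flags = Flags
  { avx     :: Bool  -- -mavx
  , avx2    :: Bool  -- -mavx2
  , avx512f :: Bool  -- -mavx512f
  }

data ElemKind = IntElem | FloatElem deriving (Eq, Show)

-- Check that a vector of the given total width (in bits) and element kind
-- can be compiled efficiently under the given flags.
checkVec :: Flags -> Int -> ElemKind -> Either String ()
checkVec fl 256 FloatElem
  | not (avx fl) =
      Left "256-bit wide floating point SIMD vector instructions require at least -mavx."
checkVec fl 256 IntElem
  | not (avx2 fl) =
      -- Invented wording, for illustration only.
      Left "256-bit wide integer SIMD vector instructions require at least -mavx2."
checkVec fl 512 _
  | not (avx512f fl) =
      -- Invented wording, for illustration only.
      Left "512-bit wide SIMD vector instructions require -mavx512f."
checkVec _ _ _ = Right ()

main :: IO ()
main = do
  let noFlags = Flags False False False
      avxOnly = Flags True False False
  print (checkVec noFlags 256 FloatElem)  -- rejected: needs -mavx
  print (checkVec avxOnly 256 FloatElem)  -- accepted
  print (checkVec avxOnly 256 IntElem)    -- rejected: needs -mavx2
```

Where such a check should run, and how a Left should reach the user as an ordinary error, is the open question raised above.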

