llvm calling convention matters
mainland at apeiron.net
Wed Sep 11 22:59:08 UTC 2013
Can you provide an example of the kind of ABI change you might want for
7.10? Is it mainly using more registers to pass arguments? We're already
using 6 *mm* registers to pass arguments on x86_64. I don't know for
sure, but I would be very surprised if there is code out there that
would benefit greatly from passing more than 6 Float/Double/SIMD vector
arguments in registers.
Without understanding the ABI design space you have in mind, I can't
comment on how changing the ABI now would or would not make future
exploration more difficult.
I don't see why we should limit ourselves by insisting that the gap
between the LLVM back-end and the native back-end not grow further.
If we want SIMD, the gap is already quite large. Yes it would be nice to
have feature parity, but there are only so many man-hours available, and
we want to invest them wisely. The SIMD primops already do not work on
the native codegen; the user gets an error telling them to use the LLVM
back-end if they use the SIMD primops with the native codegen.
I was not suggesting that we require LLVM 3.4 or later for this or any
future version of GHC. Instead, the ABI would change based on the
version of LLVM used. I think that is unavoidable at this point and not
a huge deal as it would only affect SIMD code.
All this said, I'm not going to push. Changing the ABI just creates more
work for me. I'm very motivated to get the rest of the SIMD patches into
HEAD before I present our SIMD paper at ICFP in a few weeks. However, a
year from now my priorities will likely be very different, so the ball
will be entirely in your (or someone else's, just not my!) court.
On 09/11/2013 06:26 PM, Carter Schonwald wrote:
> hey all,
> first let me preface by saying I am in favor of breaking and
> updating/modernizing the GHC ABI.
> I just think that for a number of reasons, it doesn't make sense to do
> it for the 7.8 release, but rather start work on it in another month
> or so, so we can systematically have a better set of ABI, and keep all
> the code gens as first class citizens. (also work out the type system
> changes needed to be able to correctly use SIMD shuffles, which are
> currently inexpressible correctly with GHC's type system. Simd
> Shuffles are crucial for interesting levels of SIMD performance!)
> the reason I don't want to make the ABI change right now is because
> then we'd have to wait until after llvm 3.4 gets released in like 6
> months before giving them another breaking change!
> (OR start baking a LLVM into GHC, which is a leap we're not 100% on,
> though there are clear good reasons for why!).
> Basically, if we make breaking changes to the ABI now (and thus have
> split ABI for llvm 3.4HEAD vs earlier), and then we do fixups or more
> breakage for 7.10, then when 7.10 rolls around (perhaps late next
> spring or sometime in the summer, perhaps?), the only supported llvm
> version for 7.10 would be LLVM HEAD / 3.5 (which won't be released
> till some time thereafter)! Unless we go ahead and break the 3.4 ABI
> to 7.10 rather than 7.8 abi (whatever that would entail, which would ).
> This is assuming the ~ 7-8 months between major version releases
> cycle that LLVM has done of late
> additionally, as Johan remarked today on a pending patch of mine,
> having operations only work on the llvm backend, and not on the native
> code gen is pretty problematical! see
> tl;dr : Unless we're throwing away native code gen backend next month,
> we probably want to actually not increase their capability gap /
> current ABI incompatibility right before 7.8 release. I am willing to
> help explore modernizing the native code gens so that they have parity
> with the llvm backends. Additionally, boxing ourselves in a corner
> where for 7.10 the only llvm with the right ABI will be llvm 3.5 seems
> totally unacceptable from an end users / distribution package managers
> standpoint, and a huge support headache for the community.
> I've had to help deal with the support headache of the xcode5 clang +
> ghc issues on OS X, A LOT, in the past 2 months, I'm not keen on
> deliberately creating similar support disasters for myself and others.
> that said: I absolutely agree that we should fix up the ABI, have a
> clear story for XMM, YMM, and ZMM registers, and if you've been
> following trac tickets at all, you'll see there's even a type system
> issue in properly handling the SIMD shuffles! i briefly sketch out the
> issue in http://ghc.haskell.org/trac/ghc/ticket/8107 (last comment)
> that said: i'm open to being convinced i'm wrong, and I absolutely
> understand your motivations for wanting it now, but I really believe
> that doing so right now will create a number of problems that are
> better off evaded to begin with
> On Wed, Sep 11, 2013 at 5:49 PM, Geoffrey Mainland
> <mainland at cs.drexel.edu> wrote:
> Hi Carter,
> On 09/06/2013 03:24 PM, Carter Tazio Schonwald wrote:
> > Hey Geoff,
> > I'm leery about doing a calling convention change right before the ghc
> > release (and I'm happy to elaborate more on the phone some time) 1)
> > I'd rather we test the patches on llvm locally ourselves before
> > upstream 2) doing that AVX change on the calling convention now
> > makes it harder to make a more systematic exploration of calling
> > convention changes post 7.8 release, because we would face either
> > breaking the llvm head/3.4 changes, or having to wait till the next
> > llvm release cycle (3.5?!) to upstream any more systematic
> > changes. (such as adding substantially more SIMD registers to the GHC
> > calling convention!)
> > I understand your likely motivation for wanting the calling
> > convention change to land in the 7.8 release, namely it may eke out an
> > easy 2x perf boost in your stream fusion libs, i just worry that the
> > change would
> > cut off our ability to do more aggressive experimentation and
> > improvements (eg more simd registers!) for ghc 7.10 over the next
> > year?
> > on an unrelated note: I will spend some time this weekend giving you
> > the various simd operations I want / think are valuable. the low
> > hanging fruit would be figuring out a good haskell type / analogue of
> > the llvm __builtin_shuffle(a,b,c) primop, because that usually
> > generates decent code. I'll work out the details of this and some
> > examples and send it your way in the next few days
> > -Carter
> Currently, on x86-64 we pass floats, doubles, and 128-bit wide SIMD
> vectors in xmm1-xmm6. I propose that we change the calling conventions
> to pass 256-bit wide SIMD vectors in ymm1-ymm6 and 512-bit wide SIMD
> vectors in zmm1-zmm6. I don't know why GHC doesn't use xmm0 or xmm7, as
> the Linux C calling convention uses xmm0-xmm7. Simon, perhaps you know
> why? I get that we only needed 6 registers originally (F1-F4, D1-D2),
> but why count from one rather than zero?
> On x86-32, we pass floats, doubles, and all SIMD vectors on the stack. I
> propose that we pass 128-bit wide SIMD vectors in xmm0-xmm2, and make
> analogous arrangements for 256- and 512-bit SIMD vectors. We will
> pass floats and doubles on the stack. This matches the Linux x86
> C calling convention.
> I think these are fairly conservative changes. I also don't think we
> should be afraid of revising the calling convention for GHC 7.10.
> The LLVM folks won't be upset if we send them one set of patches a
> year instead of one set of patches every two years.
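
The register assignments proposed in the quoted message above can be
summarized in a short sketch. This is only an illustrative model, not GHC
or LLVM code: the function name and the string labels are made up for the
example, and the ymm0-ymm2 / zmm0-zmm2 choice on x86-32 is an assumption
about what "analogous arrangements" would mean.

```python
# Hypothetical model of the proposed argument-passing rules.
# Names here (proposed_registers, the kind/arch strings) are illustrative
# and do not correspond to identifiers in GHC or LLVM.

VECTOR_PREFIX = {"vec128": "xmm", "vec256": "ymm", "vec512": "zmm"}
KINDS = {"float", "double"} | set(VECTOR_PREFIX)


def proposed_registers(arch, kind):
    """Return the registers the proposal would use for `kind` on `arch`.

    An empty list means the argument is passed on the stack.
    """
    if kind not in KINDS:
        raise ValueError(f"unknown argument kind: {kind}")

    if arch == "x86_64":
        # Floats, doubles and 128-bit vectors already use xmm1-xmm6;
        # the proposal adds ymm1-ymm6 and zmm1-zmm6 for wider vectors.
        prefix = VECTOR_PREFIX.get(kind, "xmm")
        return [f"{prefix}{i}" for i in range(1, 7)]

    if arch == "x86_32":
        # Floats and doubles stay on the stack.
        if kind in ("float", "double"):
            return []
        # 128-bit vectors go in xmm0-xmm2. Using registers 0-2 for the
        # wider vectors as well is an assumption, not stated in the thread.
        return [f"{VECTOR_PREFIX[kind]}{i}" for i in range(0, 3)]

    raise ValueError(f"unknown architecture: {arch}")
```

For example, proposed_registers("x86_64", "vec256") lists ymm1 through
ymm6, while proposed_registers("x86_32", "double") is empty because
doubles would still be passed on the stack.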