possible solution! Re: llvm calling convention matters
Geoffrey Mainland
mainland at apeiron.net
Fri Sep 13 02:55:35 UTC 2013
The plan is as I wrote below:
7.8 will only support passing 128-bit SIMD vectors in registers on x86-64.
Other vector sizes, and all vectors on x86-32, will be passed on the
stack.
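
As a rough illustration of that rule, here is a minimal sketch in Python. It is not GHC's code generator: the names `Arch`, `Passing`, and `simd_arg_passing` are made up for this example, and it ignores how many vector registers are actually free for arguments.

```python
from enum import Enum


class Arch(Enum):
    """Target architectures mentioned in the plan (hypothetical names)."""
    X86_32 = "x86-32"
    X86_64 = "x86-64"


class Passing(Enum):
    """Where a SIMD vector argument ends up at a call."""
    REGISTER = "register"
    STACK = "stack"


def simd_arg_passing(arch: Arch, vector_bits: int) -> Passing:
    """Apply the 7.8 plan as stated: only 128-bit vectors on x86-64
    go in registers; every other case is passed on the stack."""
    if arch is Arch.X86_64 and vector_bits == 128:
        return Passing.REGISTER
    return Passing.STACK


if __name__ == "__main__":
    # Print the decision for each combination discussed in the thread.
    for arch in Arch:
        for bits in (128, 256, 512):
            print(arch.value, bits, simd_arg_passing(arch, bits).value)
```

So under this reading a 256-bit (AVX) vector is still usable with -fllvm, it is just spilled to the stack at calls.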
There is not enough time for anything else at this point.
Geoff
On 09/12/2013 10:40 PM, Carter Schonwald wrote:
> let me know before the weekend starts.... so i can make time to help
> if need be (unless Austin gives breathing room on merge window for
> such a thing)
>
>
> On Thu, Sep 12, 2013 at 3:03 PM, Carter Schonwald
> <carter.schonwald at gmail.com> wrote:
>
> emphasis on "very very clear warning"
>
>
> On Thu, Sep 12, 2013 at 3:00 PM, Carter Schonwald
> <carter.schonwald at gmail.com> wrote:
>
> after a bit more reflection: as long as we provide a clear
> warning that 7.8 may at some point no longer work with llvm
> 3.4, i'm down for the change. We just need to make it very
> very clear, that it may stop working. (and have AVX support
> via passing on the stack with <= 3.3)
>
> before i go and upstream that patch, could we benchmark how
> multivector perf fares with patched llvm? i don't have the
> right hardware for doing the benchmarks you did in your paper...
>
> sorry for being a bit over the top yesterday, i'm just
> juggling a lot right now :)
>
> -Carter
>
>
> On Thu, Sep 12, 2013 at 2:47 PM, Carter Schonwald
> <carter.schonwald at gmail.com> wrote:
>
> oh, i didn't realize you had already done the work! (bah,
> i'm sorry, i feel terrible)
>
> I thought i had communicated ~ a month ago that I was
> worried about the release engineering interaction: because
> of the LLVM release cycle, this could make it impossible to
> make subsequent changes more thoughtfully. This
> concern of mine ballooned a bit after helping triage a huge
> number of problems people were hitting with the Clang
> transition on mac that's underway.
>
> It's actually very easy to package up an llvm with that
> patch, much simpler than "build GHC from source". In fact,
> on OS X, the simplest way to install LLVM by default
> essentially does a build from source.
>
> Geoff, it'd at least be worth running the benchmarks to
> measure the work! (and as I said, i'm happy to help)
>
>
> On Thu, Sep 12, 2013 at 2:30 PM, Geoffrey Mainland
> <mainland at apeiron.net> wrote:
>
> If users have to do a custom llvm build, we might as well ask them to
> build ghc from source too.
>
> Unless I misunderstood ticket #8033, you were originally quite gung-ho
> about changing the LLVM calling conventions to support passing SIMD
> vectors of all widths in registers on both x86-32 and -64, getting these
> patches into LLVM 3.4, and making sure that GHC 7.8 would support all
> this. I spent several days making sure this could happen from the GHC
> side. Now that the plan has changed, I will back out that work, and 7.8
> will only support passing 128-bit SIMD vectors in registers on x86-64.
> Other vector sizes, and all vectors on x86-32, will be passed on the stack.
>
> Geoff
>
> On 9/12/13 1:32 PM, Carter Schonwald wrote:
> > to repeat:
> >
> > I think no one would have objected to having a clearly marked,
> > experimental -fllvmExpermentalAVX flag that requires building LLVM
> > with a specified patch, as a way to showcase your multivector work!
> >
> > that would evade all of my objections (provided avx is still exposed
> > with normal -fllvm, but spilled to stack rather than registers), and
> > i'd actually argue in favor of such.
> >
> > Especially since it would not impose any release cycle constraints on
> > a subsequent, systematic exploration for using XMM / YMM / ZMM in the
> > calling convention going forward.
> >
> > @Geoff, Simons, Johan, and others: does anyone object to that approach?
> >
> > applying such a calling convention patch to llvm is really quite
> > straightforward, and the build process is pretty zippy after that too.
> >
> > cheers
> > -Carter
> >
> >
> > On Thu, Sep 12, 2013 at 2:34 AM, Carter Schonwald
> > <carter.schonwald at gmail.com> wrote:
> >
> > that said it does occur to me that there is an alternative
> > solution that may be acceptable for everyone!
> >
> > what about providing a pseudo compatible way called
> > -fllvm-experimentalAVX (or something), and simply require that for
> > it to be used, the user has an llvm Patched with the YMM simd in
> > register fun call support? internally that could just be an llvm
> > way that trips the logic that puts the first few AVX values in
> > those YMM1-6 slots if they are the first args, so only the stack
> > spilling logic needs be changed?
> >
> > (ie it wouldn't be tied to an llvm version, but rather this pseudo
> > way flag)
> >
> > does that make sense?
> >
> > either way, i'd really like having avx even if it's always spilled
> > to stack at funcalls with standard LLVMs!
> >
> > cheers
> > -carter
> >
> >
> >
> >
> > On Thu, Sep 12, 2013 at 2:28 AM, Carter Schonwald
> > <carter.schonwald at gmail.com> wrote:
> >
> > Geoff,
> >
> > a prosaic reason why there *might* be a fundamentally breaking
> > change would be the following idea nathan howell suggested to
> > me this afternoon: change the Sp and SpLim registers so that
> > the X86/x86_64 target can use the CPU's Push and (maybe) Pop
> > instructions for the stack manipulations, rather than MOV and
> > fam. see http://ghc.haskell.org/trac/ghc/ticket/8272 (which
> > is just what i've said). That's one change that's pretty simple
> > but deep, but likely worth exploring.
> >
> >
> > i'm saying any ABI change for GHC 7.10 would likely entail
> > patching LLVM 3.4, because that's the only LLVM version likely
> > to come out between now and whenever we get 7.10 out (assuming
> > 7.10 lands within the next 8-12 months, which is reasonable
> > since we've got noticeably more (amazing) people helping out
> > lately). Thus, any change there entails either asking the llvm
> > folks to support >1 GHC convention per architecture, or
> > replacing the current one! I'd rather do the latter than the
> > former, when it comes to asking other people to maintain it :)
> > (and llvm engineers do in fact help out maintaining that code)
> >
> >
> > have you run a Nofib, or even benchmarks restricted to your
> > multivector code, for the current calling convention
> > (including the spilling AVX vectors to the stack that's the
> > current plan i gather) VS passing in registers with an LLVM
> > built using the patches i worked out ~ 2 months ago? it'd be
> > really easy to build that custom llvm, then run the
> > benchmarks! (i'm happy to help, and ultimately, benchmarks
> > will reveal if it's worthwhile or not! And if the main goal is
> > for your talk, it's still valid even if it's not in the merge
> > window over the next 4 days).
> >
> > I really think it's not obvious what the "best" abi
> > change would be! It really will require coming up with a list
> > of variants, implementing them, and running nofib with each
> > variant, which i lack the compute/human time resources to do
> > this week. Modern hardware is complex enough that for
> > something like an ABI change, the only healthy attitude can be
> > "let's benchmark it!".
> >
> > i'd really like any change in calling convention to also
> > improve perf on codes that aren't explicitly simd! (and a
> > conservative simd only change blocks/conflicts with that
> > augmentation going forward, and not just for the stack pointer
> > example i mention earlier)
> >
> > Not just scalar floats in simd registers, but perhaps also
> > words/ints!
> >
> > (though that latter bit might be pretty ambitious and subtle,
> > i'll need to investigate that a bit to see how feasible it may
> > be).
> > SIMD has great support for ints/words, and any partial abi
> > change on the llvm backend now would make it hard to support
> > that later well (or at least, that's what it looks like to me).
> > actually effectively using simd for scalar ints and words
> > should be doable, but might force us to be a bit more
> > thoughtful on how GHC internally distinguishes ints used for
> > address arithmetic, vs ints used as data. (interestingly, i'm
> > not sure if any current extant x86 calling convention does that!)
> >
> >
> > That single change would make 7.10 require a completely
> > different llvm and native code gen convention from our current
> > one, plus touch all of the code gen on x86 architectures.
> >
> >
> > basically: we're lucky that everyone builds haskell code from
> > source, so ABI compat across GHC versions is a non issue. BUT,
> > any ABI changes should be backed by benchmarks (at least when
> > the change is performance motivated). Likewise, because we use
> > LLVM as an external dep for the -fllvm backend, we really need
> > to keep in mind how their release cycle interacts with our release
> > cycle, because people use haskell and ghc! which as many like
> > to say, is both a boon and a pain ;).
> >
> > Having people hit ghc acting broken with an llvm that was
> > "supported before" is a risky support problem to deal with.
> > having an LLVM head variant support a modified ABI, and then
> > later needing to break it for 7.10 (for one of the possible
> > exploratory reasons above) would lead to a support headache I
> > don't wish on anyone.
> >
> > pardon the verbose answer, but that's my offhand take
> >
> > cheers
> > -Carter
> >
> >
> > On Wed, Sep 11, 2013 at 10:10 PM, Geoffrey Mainland
> > <mainland at apeiron.net> wrote:
> >
> > We support compiling some code with -fllvm and some not in
> > the same executable. Otherwise how could users of the Haskell
> > Platform link their -fllvm-compiled code with
> > native-codegen-compiled libraries like base, etc.?
> >
> > In other words, the LLVM and native back ends use the same
> > calling convention. With my SIMD work, they still use the same
> > calling conventions, but the native codegen can never generate
> > code that uses SIMD instructions.
> >
> > Geoff
> >
> > On 09/11/2013 10:03 PM, Johan Tibell wrote:
> > > OK. But that doesn't create a problem for the code we
> > > output with the LLVM backend, no? Or do we support
> > > compiling some code with -fllvm and some not in the same
> > > executable?
> > >
> > >
> > > On Wed, Sep 11, 2013 at 6:56 PM, Geoffrey Mainland
> > > <mainland at apeiron.net> wrote:
> > >
> > > We definitely have interop between the native codegen and
> > > the LLVM back end now. Otherwise anyone who wanted to use the
> > > LLVM back end would have to build GHC themselves. Interop
> > > means that users can install the Haskell Platform and still
> > > use -fllvm when it makes a performance difference.
> > >
> > > Geoff
> > >
> > > On 09/11/2013 07:59 PM, Johan Tibell wrote:
> > > > Do nothing different than you're doing for 7.8, we can
> > > > sort it out later. Just put a comment on the primops saying
> > > > they're LLVM-only. See e.g.
> > > >
> > > > https://github.com/ghc/ghc/blob/master/compiler/prelude/primops.txt.pp#L181
> > > >
> > > > for an example of how to add docs to primops.
> > > >
> > > > I don't think we need interop between the native and the
> > > > LLVM backends. We don't have that now, do we (i.e. they use
> > > > different calling conventions)?
> > > >
> > > >
> > > >
> > > > On Wed, Sep 11, 2013 at 4:51 PM, Geoffrey Mainland
> > > > <mainland at apeiron.net> wrote:
> > > >
> > > > On 09/11/2013 07:44 PM, Johan Tibell wrote:
> > > > > On Wed, Sep 11, 2013 at 4:40 PM, Geoffrey Mainland
> > > > > <mainland at apeiron.net> wrote:
> > > > > > Do you mean we need a reasonable emulation of the SIMD
> > > > > > primops for the native codegen?
> > > > >
> > > > > Yes. Reasonable in the sense that it computes the right
> > > > > result. I can see that some code might still want to
> > > > > #ifdef (if the fallback isn't fast enough).
> > > >
> > > > Two implications of this requirement:
> > > >
> > > > 1) There will not be SIMD in 7.8. I just don't have the
> > > > time. In fact, what SIMD support is there already will have
> > > > to be removed if we cannot live with LLVM-only SIMD primops.
> > > >
> > > > 2) If we also require interop between the LLVM back-end and
> > > > the native codegen, then we cannot pass any SIMD vectors in
> > > > registers---they all must be passed on the stack.
> > > >
> > > > My plan, as discussed with Simon PJ, is to not support SIMD
> > > > primops at all with the native codegen. If there is a
> > > > strong feeling that this *is not* the way to go, then I
> > > > need to know ASAP.
> > > >
> > > > Geoff
> > > >
> > > >
> > > >
> > >
> > >
> >
> >
> >
> >
>
>
>
>
>
More information about the ghc-devs
mailing list