possible solution! Re: llvm calling convention matters
Carter Schonwald
carter.schonwald at gmail.com
Thu Sep 12 19:03:48 UTC 2013
emphasis on "very very clear warning"
On Thu, Sep 12, 2013 at 3:00 PM, Carter Schonwald <
carter.schonwald at gmail.com> wrote:
> after a bit more reflection: as long as we provide a clear warning that
> 7.8 may at some point no longer work with llvm 3.4, i'm down for the
> change. We just need to make it very very clear, that it may stop working.
> (and have AVX support via passing on the stack with <= 3.3)
>
> before i go and upstream that patch, could we benchmark how multivector
> perf fares with a patched llvm? i don't have the right hardware for doing
> the benchmarks you did in your paper...
>
> sorry for being a bit over the top yesterday, i'm just juggling a lot
> right now :)
>
> -Carter
>
>
> On Thu, Sep 12, 2013 at 2:47 PM, Carter Schonwald <
> carter.schonwald at gmail.com> wrote:
>
>> oh, i didn't realize you had already done the work! (bah, i'm sorry, i
>> feel terrible)
>>
>> I thought i had communicated ~ a month ago that I was worried about
>> the release engineering implications: the LLVM release cycle could make
>> it impossible to then make subsequent changes more thoughtfully.
>> This concern of mine ballooned a bit after helping triage a huge number of
>> problems people were hitting with the Clang transition on Mac that's
>> underway.
>>
>> It's actually very easy to package up an llvm with that patch, much
>> simpler than "build GHC from source". In fact, on OS X, the simplest way to
>> install LLVM by default essentially does a build from source.
>>
>> Geoff, it'd at least be worth running the benchmarks to measure the work!
>> (and as I said, i'm happy to help)
>>
>>
>> On Thu, Sep 12, 2013 at 2:30 PM, Geoffrey Mainland <mainland at apeiron.net> wrote:
>>
>>> If users have to do a custom llvm build, we might as well ask them to
>>> build ghc from source too.
>>>
>>> Unless I misunderstood ticket #8033, you were originally quite gung-ho
>>> about changing the LLVM calling conventions to support passing SIMD
>>> vectors of all widths in registers on both x86-32 and -64, getting these
>>> patches into LLVM 3.4, and making sure that GHC 7.8 would support all
>>> this. I spent several days making sure this could happen from the GHC
>>> side. Now that the plan has changed, I will back out that work, and 7.8
>>> will only support passing 128-bit SIMD vectors in registers on x86-64.
>>> Other vector sizes, and all vectors on x86-32, will be passed on the
>>> stack.
>>>
>>> Geoff
>>>
>>> On 9/12/13 1:32 PM, Carter Schonwald wrote:
>>> > to repeat:
>>> >
>>> > I think no one would object to having a clearly marked,
>>> > experimental -fllvmExpermentalAVX flag that requires building LLVM
>>> > with a specified patch, as a way to showcase your multivector work!
>>> >
>>> > that would evade all of my objections (provided avx is still exposed
>>> > with normal -fllvm, but spilled to stack rather than registers), and
>>> > i'd actually argue in favor of such.
>>> >
>>> > Especially since it would not impose any release cycle constraints on
>>> > a subsequent, systematic exploration for using XMM / YMM / ZMM in the
>>> > calling convention going forward.
>>> >
>>> > @Geoff, Simons, Johan, and others: does anyone object to that approach?
>>> >
>>> > applying such a calling convention patch to llvm is really quite
>>> > straightforward, and the build process is pretty zippy after that too.
>>> >
>>> > cheers
>>> > -Carter
>>> >
>>> >
>>> > On Thu, Sep 12, 2013 at 2:34 AM, Carter Schonwald
>>> > <carter.schonwald at gmail.com> wrote:
>>> >
>>> > that said it does occur to me that there is an alternative
>>> > solution that may be acceptable for everyone!
>>> >
>>> > what about providing a pseudo compatible way called
>>> > -fllvm-experimentalAVX (or something), and simply require that for
>>> > it to be used, the user has an llvm Patched with the YMM simd in
>>> > register fun call support? internally that could just be an llvm
>>> > way that trips the logic that puts the first few AVX values in
>>> > those YMM1-6 slots if they are the first args, so only the stack
>>> > spilling logic needs be changed?
>>> >
>>> > (ie it wouldn't be tied to an llvm version, but rather this pseudo
>>> > way flag)
>>> >
>>> > does that make sense?
>>> >
>>> > either way, i'd really like having avx even if it's always spilled
>>> > to stack at funcalls with standard LLVMs!
>>> >
>>> > cheers
>>> > -carter
>>> >
>>> >
>>> >
>>> >
>>> > On Thu, Sep 12, 2013 at 2:28 AM, Carter Schonwald
>>> > <carter.schonwald at gmail.com> wrote:
>>> >
>>> > Geoff,
>>> >
>>> > a prosaic reason why there *might* be a fundamentally breaking
>>> > change would be the following idea Nathan Howell suggested to
>>> > me this afternoon: change the Sp and SpLim registers so that
>>> > the x86/x86_64 target can use the CPU's PUSH and (maybe) POP
>>> > instructions for the stack manipulations, rather than MOV and
>>> > family. see http://ghc.haskell.org/trac/ghc/ticket/8272 (which
>>> > is just what i've said). That's one change that's pretty simple
>>> > but deep, and likely worth exploring.
>>> >
>>> >
>>> > i'm saying any ABI change for GHC 7.10 would likely entail
>>> > patching LLVM 3.4, because that's the only LLVM version likely
>>> > to come out between now and whenever we get 7.10 out (assuming
>>> > 7.10 lands within the next 8-12 months, which is reasonable
>>> > since we've got noticeably more (amazing) people helping out
>>> > lately). Thus, any change there entails either asking the llvm
>>> > folks to support >1 GHC convention per architecture, or
>>> > replacing the current one! I'd rather do the latter than the
>>> > former, when it comes to asking other people to maintain it :)
>>> > (and llvm engineers do in fact help out maintaining that code)
>>> >
>>> >
>>> > have you run nofib, or even benchmarks restricted to your
>>> > multivector code, for the current calling convention
>>> > (including spilling AVX vectors to the stack, which i gather is
>>> > the current plan) VS passing in registers with an LLVM
>>> > built using the patches i worked out ~ 2 months ago? it'd be
>>> > really easy to build that custom llvm, then run the
>>> > benchmarks! (i'm happy to help, and ultimately, benchmarks
>>> > will reveal if it's worthwhile or not! And if the main goal is
>>> > for your talk, it's still valid even if it's not in the merge
>>> > window over the next 4 days).
>>> >
>>> > I really think it's not obvious what the "best" abi
>>> > change would be! It really will require coming up with a list
>>> > of variants, implementing them, and running nofib with each
>>> > variant, which i lack the compute/human time resources to do
>>> > this week. Modern hardware is complex enough that for
>>> > something like an ABI change, the only healthy attitude can be
>>> > "let's benchmark it!".
>>> >
>>> > i'd really like any change in calling convention to also
>>> > improve perf on codes that aren't explicitly simd! (and a
>>> > conservative simd-only change blocks/conflicts with that
>>> > augmentation going forward, and not just for the stack pointer
>>> > example i mentioned earlier)
>>> >
>>> > Not just scalar floats in simd registers, but perhaps also
>>> > words/ints!
>>> >
>>> > (though that latter bit might be pretty ambitious and subtle,
>>> > i'll need to investigate that a bit to see how feasible it may
>>> > be).
>>> > SIMD has great support for ints/words, and any partial abi
>>> > change on the llvm backend now would make it hard to support
>>> > that well later (or at least, that's what it looks like to me).
>>> > actually effectively using simd for scalar ints and words
>>> > should be doable, but might force us to be a bit more
>>> > thoughtful on how GHC internally distinguishes ints used for
>>> > address arithmetic, vs ints used as data. (interestingly, i'm
>>> > not sure if any current extant x86 calling convention does
>>> > that!)
>>> >
>>> >
>>> > That single change would make 7.10 require a completely
>>> > different llvm and native code gen convention from our current
>>> > one, plus touch all of the code gen on x86 architectures.
>>> >
>>> >
>>> > basically: we're lucky that everyone builds haskell code from
>>> > source, so ABI compat across GHC versions is a non issue. BUT,
>>> > any ABI changes should be backed by benchmarks (at least when
>>> > the change is performance motivated). Likewise, because we use
>>> > LLVM as an external dep for the -fllvm backend, we really need
>>> > to keep in mind how their release cycle interacts with our release
>>> > cycle, because people use haskell and ghc! which as many like
>>> > to say, is both a boon and a pain ;).
>>> >
>>> > Having people hit ghc acting broken with an llvm that was
>>> > "supported before" is a risky support problem to deal with.
>>> > having an LLVM head variant support a modified ABI, and then
>>> > later needing to break it for 7.10 (for one of the possible
>>> > exploratory reasons above) would lead to a support headache I
>>> > don't wish on anyone.
>>> >
>>> > pardon the verbose answer, but that's my offhand take
>>> >
>>> > cheers
>>> > -Carter
>>> >
>>> >
>>> > On Wed, Sep 11, 2013 at 10:10 PM, Geoffrey Mainland
>>> > <mainland at apeiron.net> wrote:
>>> >
>>> > We support compiling some code with -fllvm and some not in the same
>>> > executable. Otherwise how could users of the Haskell Platform link
>>> > their -fllvm-compiled code with native-codegen-compiled libraries
>>> > like base, etc.?
>>> >
>>> > In other words, the LLVM and native back ends use the same calling
>>> > convention. With my SIMD work, they still use the same calling
>>> > convention, but the native codegen can never generate code that uses
>>> > SIMD instructions.
>>> >
>>> > Geoff
>>> >
>>> > On 09/11/2013 10:03 PM, Johan Tibell wrote:
>>> > > OK. But that doesn't create a problem for the code we output with
>>> > > the LLVM backend, no? Or do we support compiling some code with
>>> > > -fllvm and some not in the same executable?
>>> > >
>>> > >
>>> > > On Wed, Sep 11, 2013 at 6:56 PM, Geoffrey Mainland
>>> > > <mainland at apeiron.net> wrote:
>>> > >
>>> > > We definitely have interop between the native codegen and the LLVM
>>> > > back end now. Otherwise anyone who wanted to use the LLVM back end
>>> > > would have to build GHC themselves. Interop means that users can
>>> > > install the Haskell Platform and still use -fllvm when it makes a
>>> > > performance difference.
>>> > >
>>> > > Geoff
>>> > >
>>> > > On 09/11/2013 07:59 PM, Johan Tibell wrote:
>>> > > > Do nothing different than you're doing for 7.8, we can sort it out
>>> > > > later. Just put a comment on the primops saying they're LLVM-only.
>>> > > > See e.g.
>>> > > >
>>> > > > https://github.com/ghc/ghc/blob/master/compiler/prelude/primops.txt.pp#L181
>>> > > >
>>> > > > for an example of how to add docs to primops.
>>> > > >
>>> > > > I don't think we need interop between the native and the LLVM
>>> > > > backends. We don't have that now, do we (i.e. they use different
>>> > > > calling conventions)?
>>> > > >
>>> > > >
>>> > > >
>>> > > > On Wed, Sep 11, 2013 at 4:51 PM, Geoffrey Mainland
>>> > > > <mainland at apeiron.net> wrote:
>>> > > >
>>> > > > On 09/11/2013 07:44 PM, Johan Tibell wrote:
>>> > > > > On Wed, Sep 11, 2013 at 4:40 PM, Geoffrey Mainland
>>> > > > > <mainland at apeiron.net> wrote:
>>> > > > > > Do you mean we need a reasonable emulation of the SIMD primops
>>> > > > > > for the native codegen?
>>> > > > >
>>> > > > > Yes. Reasonable in the sense that it computes the right result.
>>> > > > > I can see that some code might still want to #ifdef (if the
>>> > > > > fallback isn't fast enough).
>>> > > >
>>> > > > Two implications of this requirement:
>>> > > >
>>> > > > 1) There will not be SIMD in 7.8. I just don't have the time. In
>>> > > > fact, what SIMD support is there already will have to be removed
>>> > > > if we cannot live with LLVM-only SIMD primops.
>>> > > >
>>> > > > 2) If we also require interop between the LLVM back-end and the
>>> > > > native codegen, then we cannot pass any SIMD vectors in
>>> > > > registers---they all must be passed on the stack.
>>> > > >
>>> > > > My plan, as discussed with Simon PJ, is to not support SIMD primops
>>> > > > at all with the native codegen. If there is a strong feeling that
>>> > > > this *is not* the way to go, then I need to know ASAP.
>>> > > >
>>> > > > Geoff
>>> > > >
>>> > > >
>>> > > >
>>> > >
>>> > >
>>> >
>>> >
>>> >
>>> >
>>>
>>>
>>
>