possible solution! Re: llvm calling convention matters

Fri Sep 13 04:02:16 UTC 2013

ok,

On Thu, Sep 12, 2013 at 10:55 PM, Geoffrey Mainland <mainland at apeiron.net> wrote:

> The plan is as I wrote below:
>
>   7.8 will only support passing 128-bit SIMD vectors in registers on
> x86-64.
>   Other vector sizes, and all vectors on x86-32, will be passed on the
> stack.
>
> There is not enough time for anything else at this point.
>
> Geoff
>
> On 09/12/2013 10:40 PM, Carter Schonwald wrote:
> > let me know before the weekend starts.... so i can make time to help
> > if need be (unless Austin gives breathing room on merge window for
> > such a thing)
> >
> >
> > On Thu, Sep 12, 2013 at 3:03 PM, Carter Schonwald
> > <carter.schonwald at gmail.com> wrote:
> >
> >     emphasis on "very very clear warning"
> >
> >
> >     On Thu, Sep 12, 2013 at 3:00 PM, Carter Schonwald
> >     <carter.schonwald at gmail.com> wrote:
> >
> >         after a bit more reflection: as long as we provide a clear
> >         warning that 7.8 may at some point no longer work with llvm
> >         3.4, i'm down for the change. We just need to make it very
> >         very clear, that it may stop working. (and have AVX support
> >         via passing on the stack with <= 3.3)
> >
> >         before i go and upstream that patch, could we benchmark how
> >         multivector perf fares with a patched llvm? i don't have the
> >         right hardware for doing the benchmarks you did in your paper...
> >
> >         sorry for being a bit over the top yesterday, i'm just
> >         juggling a lot right now :)
> >
> >         -Carter
> >
> >
> >         On Thu, Sep 12, 2013 at 2:47 PM, Carter Schonwald
> >         <carter.schonwald at gmail.com> wrote:
> >
> >             oh, i didn't realize you had already done the work! (bah,
> >             i'm sorry, i feel terrible)
> >
> >             I thought i had communicated ~ a month ago that I was
> >             worried about the release engineering interaction making
> >             it impossible to then make subsequent changes more
> >             thoughtfully, because of the LLVM release cycle. This
> >             concern of mine ballooned a bit after helping triage a huge
> >             number of problems people were hitting with the Clang
> >             transition on mac that's underway.
> >
> >             It's actually very easy to package up an llvm with that
> >             patch, much simpler than "build GHC from source". In fact,
> >             on OS X, the simplest way to install LLVM by default
> >             essentially does a build from source.
> >
> >             Geoff, it'd at least be worth running the benchmarks to
> >             measure the work! (and as I said, i'm happy to help)
> >
> >
> >             On Thu, Sep 12, 2013 at 2:30 PM, Geoffrey Mainland
> >             <mainland at apeiron.net> wrote:
> >
> >                 If users have to do a custom llvm build, we might as well
> >                 ask them to build ghc from source too.
> >
> >                 Unless I misunderstood ticket #8033, you were originally
> >                 quite gung-ho about changing the LLVM calling conventions
> >                 to support passing SIMD vectors of all widths in registers
> >                 on both x86-32 and -64, getting these patches into LLVM
> >                 3.4, and making sure that GHC 7.8 would support all this.
> >                 I spent several days making sure this could happen from
> >                 the GHC side. Now that the plan has changed, I will back
> >                 out that work, and 7.8 will only support passing 128-bit
> >                 SIMD vectors in registers on x86-64. Other vector sizes,
> >                 and all vectors on x86-32, will be passed on the stack.
> >
> >                 Geoff
> >
> >                 On 9/12/13 1:32 PM, Carter Schonwald wrote:
> >                 > to repeat:
> >                 >
> >                 > I think no one would have objected to having a clearly
> >                 > marked, experimental -fllvmExpermentalAVX flag that
> >                 > requires building LLVM with a specified patch, as a way
> >                 > to showcase your multivector work!
> >                 >
> >                 > that would evade all of my objections (provided avx is
> >                 > still exposed with normal -fllvm, but spilled to stack
> >                 > rather than registers), and i'd actually argue in favor
> >                 > of such.
> >                 >
> >                 > Especially since it would not impose any release cycle
> >                 > constraints on a subsequent, systematic exploration for
> >                 > using XMM / YMM / ZMM in the calling convention going
> >                 > forward.
> >                 >
> >                 > @Geoff, Simons, Johan, and others: does anyone object to
> >                 > that approach?
> >                 >
> >                 > applying such a calling convention patch to llvm is
> >                 > really quite straightforward, and the build process is
> >                 > pretty zippy after that too.
> >                 >
> >                 > cheers
> >                 > -Carter
> >                 >
> >                 >
> >                 > On Thu, Sep 12, 2013 at 2:34 AM, Carter Schonwald
> >                 > <carter.schonwald at gmail.com> wrote:
> >                 >
> >                 >     that said, it does occur to me that there is an
> >                 >     alternative solution that may be acceptable for
> >                 >     everyone!
> >                 >
> >                 >     what about providing a pseudo compatible way called
> >                 >     -fllvm-experimentalAVX (or something), and simply
> >                 >     require that for it to be used, the user has an llvm
> >                 >     patched with the YMM simd in register fun call
> >                 >     support? internally that could just be an llvm way
> >                 >     that trips the logic that puts the first few AVX
> >                 >     values in those YMM1-6 slots if they are the first
> >                 >     args, so only the stack spilling logic needs to be
> >                 >     changed?
> >                 >
> >                 >     (ie it wouldn't be tied to an llvm version, but
> >                 >     rather this pseudo way flag)
> >                 >
> >                 >     does that make sense?
> >                 >
> >                 >     either way, i'd really like having avx even if it's
> >                 >     always spilled to stack at funcalls with standard
> >                 >     LLVMs!
> >                 >
> >                 >     cheers
> >                 >     -carter
> >                 >
> >                 >
> >                 >
> >                 >
> >                 >     On Thu, Sep 12, 2013 at 2:28 AM, Carter Schonwald
> >                 >     <carter.schonwald at gmail.com> wrote:
> >                 >
> >                 >         Geoff,
> >                 >
> >                 >         a prosaic reason why there *might* be a
> >                 >         fundamentally breaking change would be the
> >                 >         following idea nathan howell suggested to me this
> >                 >         afternoon: change the Sp and SpLim registers so
> >                 >         that the x86/x86_64 target can use the CPU's Push
> >                 >         and (maybe) Pop instructions for the stack
> >                 >         manipulations, rather than MOV and fam. see
> >                 >         http://ghc.haskell.org/trac/ghc/ticket/8272
> >                 >         (which is just what i've said). That's one change
> >                 >         that's pretty simple but deep, and likely worth
> >                 >         exploring.
> >                 >
> >                 >
> >                 >         i'm saying any ABI change for GHC 7.10 would
> >                 >         likely entail patching LLVM 3.4, because that's
> >                 >         the only LLVM version likely to come out between
> >                 >         now and whenever we get 7.10 out (assuming 7.10
> >                 >         lands within the next 8-12 months, which is
> >                 >         reasonable since we've got noticeably more
> >                 >         (amazing) people helping out lately). Thus, any
> >                 >         change there entails either asking the llvm folks
> >                 >         to support >1 GHC convention per architecture, or
> >                 >         replacing the current one! I'd rather do the
> >                 >         latter than the former, when it comes to asking
> >                 >         other people to maintain it :) (and llvm
> >                 >         engineers do in fact help out maintaining that
> >                 >         code)
> >                 >
> >                 >
> >                 >         have you run nofib, or even benchmarks restricted
> >                 >         to your multivector code, for the current calling
> >                 >         convention (including spilling AVX vectors to the
> >                 >         stack, which is the current plan i gather) VS
> >                 >         passing in registers with an LLVM built using the
> >                 >         patches i worked out ~ 2 months ago? it'd be
> >                 >         really easy to build that custom llvm, then run
> >                 >         the benchmarks! (i'm happy to help, and
> >                 >         ultimately, benchmarks will reveal if it's
> >                 >         worthwhile or not! And if the main goal is for
> >                 >         your talk, it's still valid even if it's not in
> >                 >         the merge window over the next 4 days).
> >                 >
> >                 >         I really think it's not obvious what the "best"
> >                 >         abi change would be! It really will require
> >                 >         coming up with a list of variants, implementing
> >                 >         them, and running nofib with each variant, which
> >                 >         i lack the compute/human time resources to do
> >                 >         this week. Modern hardware is complex enough that
> >                 >         for something like an ABI change, the only
> >                 >         healthy attitude can be "let's benchmark it!".
> >                 >
> >                 >         i'd really like any change in calling convention
> >                 >         to also improve perf on codes that aren't
> >                 >         explicitly simd! (and a conservative simd-only
> >                 >         change blocks/conflicts with that augmentation
> >                 >         going forward, and not just for the stack pointer
> >                 >         example i mentioned earlier)
> >                 >
> >                 >         Not just scalar floats in simd registers, but
> >                 >         perhaps also words/ints!
> >                 >
> >                 >         (though that latter bit might be pretty ambitious
> >                 >         and subtle, i'll need to investigate that a bit
> >                 >         to see how feasible it may be).
> >                 >         SIMD has great support for ints/words, and any
> >                 >         partial abi change on the llvm backend now would
> >                 >         make it hard to support that well later (or at
> >                 >         least, that's what it looks like to me).
> >                 >         actually effectively using simd for scalar ints
> >                 >         and words should be doable, but might force us to
> >                 >         be a bit more thoughtful on how GHC internally
> >                 >         distinguishes ints used for address arithmetic,
> >                 >         vs ints used as data. (interestingly, i'm not
> >                 >         sure if any extant x86 calling convention does
> >                 >         that!)
> >                 >
> >                 >
> >                 >         That single change would make 7.10 require a
> >                 >         completely different llvm and native code gen
> >                 >         convention from our current one, plus touch all
> >                 >         of the code gen on x86 architectures.
> >                 >
> >                 >
> >                 >         basically: we're lucky that everyone builds
> >                 >         haskell code from source, so ABI compat across
> >                 >         GHC versions is a non issue. BUT, any ABI changes
> >                 >         should be backed by benchmarks (at least when the
> >                 >         change is performance motivated). Likewise,
> >                 >         because we use LLVM as an external dep for the
> >                 >         -fllvm backend, we really need to keep in mind
> >                 >         how their release cycle interacts with our
> >                 >         release cycle, because people use haskell and
> >                 >         ghc! which as many like to say, is both a boon
> >                 >         and a pain ;).
> >                 >
> >                 >         Having people hit ghc acting broken with an llvm
> >                 >         that was "supported before" is a risky support
> >                 >         problem to deal with. having an LLVM head variant
> >                 >         support a modified ABI, and then later needing to
> >                 >         break it for 7.10 (for one of the possible
> >                 >         exploratory reasons above) would lead to a
> >                 >         support headache I don't wish on anyone.
> >                 >
> >                 >         pardon the verbose answer, but that's my offhand
> >                 >         take
> >                 >
> >                 >         cheers
> >                 >         -Carter
> >                 >
> >                 >
> >                 >         On Wed, Sep 11, 2013 at 10:10 PM, Geoffrey Mainland
> >                 >         <mainland at apeiron.net> wrote:
> >                 >
> >                 >             We support compiling some code with -fllvm
> >                 >             and some not in the same executable.
> >                 >             Otherwise how could users of the Haskell
> >                 >             Platform link their -fllvm-compiled code with
> >                 >             native-codegen-compiled libraries like base,
> >                 >             etc.?
> >                 >
> >                 >             In other words, the LLVM and native back ends
> >                 >             use the same calling convention. With my SIMD
> >                 >             work, they still use the same calling
> >                 >             conventions, but the native codegen can never
> >                 >             generate code that uses SIMD instructions.
> >                 >
> >                 >             Geoff
> >                 >
> >                 >             On 09/11/2013 10:03 PM, Johan Tibell wrote:
> >                 >             > OK. But that doesn't create a problem for
> >                 >             > the code we output with the LLVM backend,
> >                 >             > no? Or do we support compiling some code
> >                 >             > with -fllvm and some not in the same
> >                 >             > executable?
> >                 >             >
> >                 >             >
> >                 >             > On Wed, Sep 11, 2013 at 6:56 PM, Geoffrey Mainland
> >                 >             > <mainland at apeiron.net> wrote:
> >                 >             >
> >                 >             >     We definitely have interop between the
> >                 >             >     native codegen and the LLVM back end
> >                 >             >     now. Otherwise anyone who wanted to use
> >                 >             >     the LLVM back end would have to build
> >                 >             >     GHC themselves. Interop means that
> >                 >             >     users can install the Haskell Platform
> >                 >             >     and still use -fllvm when it makes a
> >                 >             >     performance difference.
> >                 >             >
> >                 >             >     Geoff
> >                 >             >
> >                 >             >     On 09/11/2013 07:59 PM, Johan Tibell wrote:
> >                 >             >     > Do nothing different than you're
> >                 >             >     > doing for 7.8, we can sort it out
> >                 >             >     > later. Just put a comment on the
> >                 >             >     > primops saying they're LLVM-only.
> >                 >             >     > See e.g.
> >                 >             >     >
> >                 >             >     > https://github.com/ghc/ghc/blob/master/compiler/prelude/primops.txt.pp#L181
> >                 >             >     >
> >                 >             >     > for an example of how to add docs to
> >                 >             >     > primops.
> >                 >             >     >
> >                 >             >     > I don't think we need interop between
> >                 >             >     > the native and the LLVM backends. We
> >                 >             >     > don't have that now, do we (i.e. they
> >                 >             >     > use different calling conventions)?
> >                 >             >     >
> >                 >             >     >
> >                 >             >     >
> >                 >             >     > On Wed, Sep 11, 2013 at 4:51 PM, Geoffrey Mainland
> >                 >             >     > <mainland at apeiron.net> wrote:
> >                 >             >     >
> >                 >             >     >     On 09/11/2013 07:44 PM, Johan Tibell wrote:
> >                 >             >     >     > On Wed, Sep 11, 2013 at 4:40 PM, Geoffrey Mainland
> >                 >             >     >     > <mainland at apeiron.net> wrote:
> >                 >             >     >     > > Do you mean we need a reasonable
> >                 >             >     >     > > emulation of the SIMD primops
> >                 >             >     >     > > for the native codegen?
> >                 >             >     >     >
> >                 >             >     >     > Yes. Reasonable in the sense that
> >                 >             >     >     > it computes the right result. I
> >                 >             >     >     > can see that some code might
> >                 >             >     >     > still want to #ifdef (if the
> >                 >             >     >     > fallback isn't fast enough).
> >                 >             >     >
> >                 >             >     >     Two implications of this
> >                 >             >     >     requirement:
> >                 >             >     >
> >                 >             >     >     1) There will not be SIMD in 7.8.
> >                 >             >     >     I just don't have the time. In
> >                 >             >     >     fact, what SIMD support is there
> >                 >             >     >     already will have to be removed if
> >                 >             >     >     we cannot live with LLVM-only SIMD
> >                 >             >     >     primops.
> >                 >             >     >
> >                 >             >     >     2) If we also require interop
> >                 >             >     >     between the LLVM back-end and the
> >                 >             >     >     native codegen, then we cannot
> >                 >             >     >     pass any SIMD vectors in
> >                 >             >     >     registers---they all must be
> >                 >             >     >     passed on the stack.
> >                 >             >     >
> >                 >             >     >     My plan, as discussed with Simon
> >                 >             >     >     PJ, is to not support SIMD primops
> >                 >             >     >     at all with the native codegen. If
> >                 >             >     >     there is a strong feeling that
> >                 >             >     >     this *is not* the way to go, then
> >                 >             >     >     I need to know ASAP.
> >                 >             >     >
> >                 >             >     >     Geoff