possible solution! Re: llvm calling convention matters

Fri Sep 13 02:55:35 UTC 2013

The plan is as I wrote below:

  7.8 will only support passing 128-bit SIMD vectors in registers on x86-64.
  Other vector sizes, and all vectors on x86-32, will be passed on the
stack.
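
For concreteness, the rule above can be sketched as a small Haskell
function. This is only an illustration: the names Arch, ArgLoc and
vectorArgLoc are made up for this message and are not taken from GHC's
code generator.

```haskell
-- Sketch only: Arch, ArgLoc and vectorArgLoc are hypothetical names,
-- not taken from the GHC source.
data Arch = X86_32 | X86_64
  deriving (Eq, Show)

data ArgLoc = InRegister | OnStack
  deriving (Eq, Show)

-- | Where a SIMD vector argument of the given width (in bits) is passed
-- under the 7.8 plan described above.
vectorArgLoc :: Arch -> Int -> ArgLoc
vectorArgLoc X86_64 128 = InRegister  -- 128-bit vectors on x86-64 only
vectorArgLoc _      _   = OnStack     -- other widths, and everything on x86-32
```

Loaded into GHCi, `vectorArgLoc X86_64 256` and `vectorArgLoc X86_32 128`
both evaluate to `OnStack`.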

There is not enough time for anything else at this point.

Geoff

On 09/12/2013 10:40 PM, Carter Schonwald wrote:
> let me know before the weekend starts.... so i can make time to help
> if need be (unless Austin gives breathing room on merge window for
> such a thing)
>
>
> On Thu, Sep 12, 2013 at 3:03 PM, Carter Schonwald
> <carter.schonwald at gmail.com> wrote:
>
>     emphasis on "very very clear warning"
>
>
>     On Thu, Sep 12, 2013 at 3:00 PM, Carter Schonwald
>     <carter.schonwald at gmail.com>
>     wrote:
>
>         after a bit more reflection: as long as we provide a clear
>         warning that 7.8 may at some point no longer work with llvm
>         3.4, i'm down for the change. We just need to make it very
>         very clear, that it may stop working. (and have AVX support
>         via passing on the stack with <= 3.3)
>
>         before i go and upstream that patch, could we benchmark how
>         multivector perf fares with a patched llvm? i don't have the
>         right hardware for doing the benchmarks you did in your paper...
>
>         sorry for being a bit over the top yesterday, i'm just
>         juggling a lot right now :) 
>
>         -Carter
>
>
>         On Thu, Sep 12, 2013 at 2:47 PM, Carter Schonwald
>         <carter.schonwald at gmail.com> wrote:
>
>             oh, i didn't realize you had already done the work! (bah,
>             i'm sorry, i feel terrible)
>
>             I thought i had communicated ~ a month ago that I was
>             worried about the release engineering interaction: tying
>             this to the LLVM release cycle could make it impossible to
>             make subsequent changes more thoughtfully. This concern of
>             mine ballooned a bit after helping triage a huge number of
>             problems people were hitting with the Clang transition on
>             mac that's underway.
>
>             It's actually very easy to package up an llvm with that
>             patch, much simpler than "build GHC from source". In fact,
>             on OS X, the simplest way to install LLVM by default
>             essentially does a build from source.
>
>             Geoff, it'd at least be worth running the benchmarks to
>             measure the work! (and as I said, i'm happy to help)
>
>
>             On Thu, Sep 12, 2013 at 2:30 PM, Geoffrey Mainland
>             <mainland at apeiron.net> wrote:
>
>                 If users have to do a custom llvm build, we might as
>                 well ask them to build ghc from source too.
>
>                 Unless I misunderstood ticket #8033, you were
>                 originally quite gung-ho about changing the LLVM
>                 calling conventions to support passing SIMD vectors of
>                 all widths in registers on both x86-32 and -64, getting
>                 these patches into LLVM 3.4, and making sure that GHC
>                 7.8 would support all this. I spent several days making
>                 sure this could happen from the GHC side. Now that the
>                 plan has changed, I will back out that work, and 7.8
>                 will only support passing 128-bit SIMD vectors in
>                 registers on x86-64. Other vector sizes, and all
>                 vectors on x86-32, will be passed on the stack.
>
>                 Geoff
>
>                 On 9/12/13 1:32 PM, Carter Schonwald wrote:
>                 > to repeat:
>                 >
>                 > I think no one would have objected to having a
>                 > clearly marked, experimental -fllvmExpermentalAVX
>                 > flag that requires building LLVM with a specified
>                 > patch, as a way to showcase your multivector work!
>                 >
>                 > that would evade all of my objections (provided avx
>                 > is still exposed with normal -fllvm, but spilled to
>                 > stack rather than registers), and i'd actually argue
>                 > in favor of such.
>                 >
>                 > Especially since it would not impose any release
>                 > cycle constraints on a subsequent, systematic
>                 > exploration for using XMM / YMM / ZMM in the calling
>                 > convention going forward.
>                 >
>                 > @Geoff, Simons, Johan, and others: does anyone
>                 > object to that approach?
>                 >
>                 > applying such a calling convention patch to llvm is
>                 > really quite straightforward, and the build process
>                 > is pretty zippy after that too.
>                 >
>                 > cheers
>                 > -Carter
>                 >
>                 >
>                 > On Thu, Sep 12, 2013 at 2:34 AM, Carter Schonwald
>                 > <carter.schonwald at gmail.com> wrote:
>                 >
>                 >     that said it does occur to me that there is an
>                 >     alternative solution that may be acceptable for
>                 >     everyone!
>                 >
>                 >     what about providing a pseudo compatible way
>                 >     called -fllvm-experimentalAVX (or something), and
>                 >     simply require that for it to be used, the user
>                 >     has an llvm patched with the YMM simd in register
>                 >     fun call support? internally that could just be
>                 >     an llvm way that trips the logic that puts the
>                 >     first few AVX values in those YMM1-6 slots if
>                 >     they are the first args, so only the stack
>                 >     spilling logic needs be changed?
>                 >
>                 >     (ie it wouldn't be tied to an llvm version, but
>                 >     rather this pseudo way flag)
>                 >
>                 >     does that make sense?
>                 >
>                 >     either way, i'd really like having avx even if
>                 >     it's always spilled to stack at funcalls with
>                 >     standard LLVMs!
>                 >
>                 >     cheers
>                 >     -carter
>                 >
>                 >
>                 >
>                 >
>                 >     On Thu, Sep 12, 2013 at 2:28 AM, Carter Schonwald
>                 >     <carter.schonwald at gmail.com> wrote:
>                 >
>                 >         Geoff,
>                 >
>                 >         a prosaic reason why there *might* be a
>                 >         fundamentally breaking change would be the
>                 >         following idea nathan howell suggested to me
>                 >         this afternoon: change the Sp and SpLim
>                 >         registers so that the x86/x86_64 target can
>                 >         use the CPU's Push and (maybe) Pop
>                 >         instructions for the stack manipulations,
>                 >         rather than MOV and fam. see
>                 >         http://ghc.haskell.org/trac/ghc/ticket/8272
>                 >         (which is just what i've said). That's one
>                 >         change that's pretty simple but deep, but
>                 >         likely worth exploring.
>                 >
>                 >
>                 >         i'm saying any ABI change for GHC 7.10 would
>                 >         likely entail patching LLVM 3.4, because
>                 >         that's the only LLVM version likely to come
>                 >         out between now and whenever we get 7.10 out
>                 >         (assuming 7.10 lands within the next 8-12
>                 >         months, which is reasonable since we've got
>                 >         noticeably more (amazing) people helping out
>                 >         lately). Thus, any change there entails
>                 >         either asking the llvm folks to support >1
>                 >         GHC convention per architecture, or replacing
>                 >         the current one! I'd rather do the latter
>                 >         than the former, when it comes to asking
>                 >         other people to maintain it :) (and llvm
>                 >         engineers do in fact help out maintaining
>                 >         that code)
>                 >
>                 >
>                 >         have you run a Nofib, or even benchmarks
>                 >         restricted to your multivector code, for the
>                 >         current calling convention (including
>                 >         spilling AVX vectors to the stack, which is
>                 >         the current plan i gather) VS passing in
>                 >         registers with an LLVM built using the
>                 >         patches i worked out ~ 2 months ago? it'd be
>                 >         really easy to build that custom llvm, then
>                 >         run the benchmarks! (i'm happy to help, and
>                 >         ultimately, benchmarks will reveal if it's
>                 >         worthwhile or not! And if the main goal is
>                 >         for your talk, it's still valid even if it's
>                 >         not in the merge window over the next 4
>                 >         days).
>                 >
>                 >         I really think it's not obvious what the
>                 >         "best" abi change would be! It really will
>                 >         require coming up with a list of variants,
>                 >         implementing them, and running nofib with
>                 >         each variant, which i lack the compute/human
>                 >         time resources to do this week. Modern
>                 >         hardware is complex enough that for
>                 >         something like an ABI change, the only
>                 >         healthy attitude can be "let's benchmark
>                 >         it!".
>                 >
>                 >         i'd really like any change in calling
>                 >         convention to also improve perf on codes
>                 >         that aren't explicitly simd! (and a
>                 >         conservative simd-only change
>                 >         blocks/conflicts with that augmentation
>                 >         going forward, and not just for the stack
>                 >         pointer example i mentioned earlier)
>                 >
>                 >         Not just scalar floats in simd registers,
>                 >         but perhaps also words/ints!
>                 >
>                 >         (though that latter bit might be pretty
>                 >         ambitious and subtle, i'll need to
>                 >         investigate that a bit to see how feasible
>                 >         it may be).
>                 >         SIMD has great support for ints/words, and
>                 >         any partial abi change on the llvm backend
>                 >         now would make it hard to support that well
>                 >         later (or at least, that's what it looks
>                 >         like to me). actually effectively using simd
>                 >         for scalar ints and words should be doable,
>                 >         but might force us to be a bit more
>                 >         thoughtful on how GHC internally
>                 >         distinguishes ints used for address
>                 >         arithmetic, vs ints used as data.
>                 >         (interestingly, i'm not sure if any current
>                 >         extant x86 calling convention does that!)
>                 >
>                 >
>                 >         That single change would make 7.10 require a
>                 >         completely different llvm and native code
>                 >         gen convention from our current one, plus
>                 >         touch all of the code gen on x86
>                 >         architectures.
>                 >
>                 >
>                 >         basically: we're lucky that everyone builds
>                 >         haskell code from source, so ABI compat
>                 >         across GHC versions is a non issue. BUT, any
>                 >         ABI changes should be backed by benchmarks
>                 >         (at least when the change is performance
>                 >         motivated). Likewise, because we use LLVM as
>                 >         an external dep for the -fllvm backend, we
>                 >         really need to keep in mind how their
>                 >         release cycle interacts with our release
>                 >         cycle, because people use haskell and ghc!
>                 >         which as many like to say, is both a boon
>                 >         and a pain ;).
>                 >
>                 >         Having people hit ghc acting broken with an
>                 >         llvm that was "supported before" is a risky
>                 >         support problem to deal with. having an LLVM
>                 >         head variant support a modified ABI, and
>                 >         then later needing to break it for 7.10 (for
>                 >         one of the possible exploratory reasons
>                 >         above) would lead to a support headache I
>                 >         don't wish on anyone.
>                 >
>                 >         pardon the verbose answer, but that's my
>                 >         offhand take
>                 >
>                 >         cheers
>                 >         -Carter
>                 >
>                 >
>                 >         On Wed, Sep 11, 2013 at 10:10 PM, Geoffrey
>                 >         Mainland <mainland at apeiron.net> wrote:
>                 >
>                 >             We support compiling some code with
>                 >             -fllvm and some not in the same
>                 >             executable. Otherwise how could users of
>                 >             the Haskell Platform link their
>                 >             -fllvm-compiled code with
>                 >             native-codegen-compiled libraries like
>                 >             base, etc.?
>                 >
>                 >             In other words, the LLVM and native back
>                 >             ends use the same calling convention.
>                 >             With my SIMD work, they still use the
>                 >             same calling conventions, but the native
>                 >             codegen can never generate code that
>                 >             uses SIMD instructions.
>                 >
>                 >             Geoff
>                 >
>                 >             On 09/11/2013 10:03 PM, Johan Tibell wrote:
>                 >             > OK. But that doesn't create a problem
>                 >             > for the code we output with the LLVM
>                 >             > backend, no? Or do we support compiling
>                 >             > some code with -fllvm and some not in
>                 >             > the same executable?
>                 >             >
>                 >             >
>                 >             > On Wed, Sep 11, 2013 at 6:56 PM,
>                 >             > Geoffrey Mainland
>                 >             > <mainland at apeiron.net> wrote:
>                 >             >
>                 >             >     We definitely have interop between
>                 >             >     the native codegen and the LLVM
>                 >             >     back end now. Otherwise anyone who
>                 >             >     wanted to use the LLVM back end
>                 >             >     would have to build GHC themselves.
>                 >             >     Interop means that users can
>                 >             >     install the Haskell Platform and
>                 >             >     still use -fllvm when it makes a
>                 >             >     performance difference.
>                 >             >
>                 >             >     Geoff
>                 >             >
>                 >             >     On 09/11/2013 07:59 PM, Johan
>                 >             >     Tibell wrote:
>                 >             >     > Do nothing different than you're
>                 >             >     > doing for 7.8, we can sort it out
>                 >             >     > later. Just put a comment on the
>                 >             >     > primops saying they're LLVM-only.
>                 >             >     > See e.g.
>                 >             >     >
>                 >             >     > https://github.com/ghc/ghc/blob/master/compiler/prelude/primops.txt.pp#L181
>                 >             >     >
>                 >             >     > for an example how to add docs to
>                 >             >     > primops.
>                 >             >     >
>                 >             >     > I don't think we need interop
>                 >             >     > between the native and the LLVM
>                 >             >     > backends. We don't have that now,
>                 >             >     > do we (i.e. they use different
>                 >             >     > calling conventions)?
>                 >             >     >
>                 >             >     >
>                 >             >     >
>                 >             >     > On Wed, Sep 11, 2013 at 4:51 PM,
>                 >             >     > Geoffrey Mainland
>                 >             >     > <mainland at apeiron.net> wrote:
>                 >             >     >
>                 >             >     >     On 09/11/2013 07:44 PM,
>                 >             >     >     Johan Tibell wrote:
>                 >             >     >     > On Wed, Sep 11, 2013 at
>                 >             >     >     > 4:40 PM, Geoffrey Mainland
>                 >             >     >     > <mainland at apeiron.net>
>                 >             >     >     > wrote:
>                 >             >     >     > > Do you mean we need a
>                 >             >     >     > > reasonable emulation of
>                 >             >     >     > > the SIMD primops for the
>                 >             >     >     > > native codegen?
>                 >             >     >     >
>                 >             >     >     > Yes. Reasonable in the
>                 >             >     >     > sense that it computes the
>                 >             >     >     > right result. I can see
>                 >             >     >     > that some code might still
>                 >             >     >     > want to #ifdef (if the
>                 >             >     >     > fallback isn't fast
>                 >             >     >     > enough).
>                 >             >     >
>                 >             >     >     Two implications of this
>                 >             >     >     requirement:
>                 >             >     >
>                 >             >     >     1) There will not be SIMD in
>                 >             >     >     7.8. I just don't have the
>                 >             >     >     time. In fact, what SIMD
>                 >             >     >     support is there already
>                 >             >     >     will have to be removed if
>                 >             >     >     we cannot live with
>                 >             >     >     LLVM-only SIMD primops.
>                 >             >     >
>                 >             >     >     2) If we also require
>                 >             >     >     interop between the LLVM
>                 >             >     >     back-end and the native
>                 >             >     >     codegen, then we cannot pass
>                 >             >     >     any SIMD vectors in
>                 >             >     >     registers---they all must be
>                 >             >     >     passed on the stack.
>                 >             >     >
>                 >             >     >     My plan, as discussed with
>                 >             >     >     Simon PJ, is to not support
>                 >             >     >     SIMD primops at all with the
>                 >             >     >     native codegen. If there is
>                 >             >     >     a strong feeling that this
>                 >             >     >     *is not* the way to go, then
>                 >             >     >     I need to know ASAP.
>                 >             >     >
>                 >             >     >     Geoff
>                 >             >     >
>                 >             >     >
>                 >             >     >
>                 >             >
>                 >             >
>                 >
>                 >
>                 >
>                 >
>
>
>
>
>