LLVM calling convention for AVX2 and AVX512 registers

Wed Mar 15 00:49:06 UTC 2017

This thread is getting into a broader discussion about target-specific
intrinsics as user prims vs. compiler-generated code.

@ben - Ed is talking about stuff like a function call that's using a
specific AVX2 intrinsic, not the parameterized vector abstraction. LLVM
shouldn't be lowering those. ... or clang has issues :/
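
To make the distinction concrete, here is a minimal C sketch of the kind of
call in question. It assumes GCC or clang on x86; the names add8, add8_avx2
and add8_scalar are made up for illustration and come from nowhere in this
thread. Only the annotated function is compiled with AVX2 enabled, and the
caller checks the CPU at run time before using it:

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>

/* Hypothetical helper: only this function is compiled with AVX2 enabled;
   the rest of the translation unit keeps the baseline feature set. */
__attribute__((target("avx2")))
static void add8_avx2(const int32_t *a, const int32_t *b, int32_t *out) {
    __m256i va = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb = _mm256_loadu_si256((const __m256i *)b);
    /* _mm256_add_epi32 is an explicit AVX2 intrinsic, not a generic
       vector operation the backend is free to lower differently. */
    _mm256_storeu_si256((__m256i *)out, _mm256_add_epi32(va, vb));
}
#endif

/* Hypothetical portable fallback with the same result. */
static void add8_scalar(const int32_t *a, const int32_t *b, int32_t *out) {
    for (int i = 0; i < 8; i++) {
        out[i] = a[i] + b[i];
    }
}

/* Hypothetical dispatcher: take the AVX2 path only if the CPU has it. */
void add8(const int32_t *a, const int32_t *b, int32_t *out) {
#if defined(__x86_64__) || defined(__i386__)
    if (__builtin_cpu_supports("avx2")) {
        add8_avx2(a, b, out);
        return;
    }
#endif
    add8_scalar(a, b, out);
}
```

The point of the sketch is that the user, not the code generator, decides
where the AVX2 instruction may appear, which is why a codegen-wide feature
flag alone doesn't cover this case.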

On Tue, Mar 14, 2017 at 4:33 PM Geoffrey Mainland <mainland at apeiron.net>
wrote:

> On 03/14/2017 04:02 PM, Ben Gamari wrote:
> > Edward Kmett <ekmett at gmail.com> writes:
> >
> >> Hrmm. In C/C++ I can tell individual functions to turn on additional ISA
> >> feature sets with compiler-specific __attribute__((target("avx2"))) tricks.
> >> This avoids complaints from the compiler when I call builtins that aren't
> >> available at my current compilation feature level. Perhaps pragmas for the
> >> codegen along those lines are what we'd ultimately need? Alternately, if we
> >> simply distinguish between what the ghc codegen produces with one set of
> >> options and what we're allowed to ask for explicitly with another then
> >> user-land tricks like I employ would remain sound.
> >>
> > I'm actually not sure that simply distinguishing between the user- and
> > codegen-allowed ISA extensions is quite sufficient. After all, AFAIK LLVM
> > doesn't make such a distinction itself: AFAIK if you write a vector
> > primitive and compile for a target that doesn't have an appropriate
> > instruction the code-generator will lower it with software emulation.
>
> This would mean that Haskell libraries compiled with different flags
> would not be ABI compatible.
>
> Our original paper exposed a Multi type class that was meant to be the
> programmer interface to the primops. A Multi a would be the widest
> vector type supported on the current architecture, so code that used a
> Multi Double would always be guaranteed to work at the widest vector
> type available for Double.
>
> The Multi approach explicitly eschewed lowering, but I would argue that
> if performance is the goal, then automatic lowering is not what you
> want. I would rather have the system pick the correct vector width for
> me based on the current architecture.
>
> This does nothing to solve the problem of ABI compatibility, which is
> one reason I didn't push to get this upstreamed.
>
> Is the Multi approach desirable? I think it would be nice to be able to
> at least provide such a solution even if it isn't some sort of default.
> Do we really want lowering of wider vector types?
>
> Geoff
>
> > However, adding a pragma to allow per-function target annotations seems
> > quite reasonable and easily doable. Moreover, contrary to my previous
> > assertion, it shouldn't require any splitting of compilation units. I
> > ran a quick experiment, compiling this program,
> >
> >     __attribute__((target("sse2"))) int hello() {
> >       return 1;
> >     }
> >
> > with clang. It produced something like,
> >
> >     define i32 @hello() #0 {
> >       ret i32 1
> >     }
> >
> >     attributes #0 = { "target-cpu"="x86-64" "target-features"="+fxsr,+mmx,+sse,+sse2,+x87" ... }
> >
> > So it seems LLVM is perfectly capable of expressing this; in hindsight
> > I'm not sure why I ever doubted this.
> >
> > There are a number of details that would need to be worked out regarding
> > how such a pragma should behave. Does the general direction sound
> > reasonable? I've opened #13427 [1] to track this idea.
> >
> > Cheers,
> >
> > - Ben
> >
> >
> > [1] https://ghc.haskell.org/trac/ghc/ticket/13427