Vector primops sizes
alexander.kjeldaas at gmail.com
Thu Feb 14 12:44:22 CET 2013
I mentioned this in another thread, but Xeon Phi chips have 512-bit wide
vector instructions, and Intel has apparently implemented support for them in
LLVM for the ispc compiler. Apparently this hasn't been merged back yet, but
I guess it is only a matter of time.
The Intel MIC architecture isn't quite x86 though.
On Thu, Feb 14, 2013 at 12:29 AM, Geoffrey Mainland <mainland at apeiron.net> wrote:
> I haven't seen Michael's patches (where are they btw?), but there is
> some extra work to be done to ensure that 256-bit values are passed in
> registers. Otherwise adding support for wider vector types is fairly
> straightforward.
> The current plan is for 256-bit wide vector primops to always be
> available. The programmer can test for the __AVX__ CPP symbol, which
> indicates that these primops will be compiled to efficient code. I am
> not inclined to add wider vector primops, as there is no current
> platform where they can be compiled efficiently.
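For illustration only, here is a rough C analogue of the __AVX__ test described above. It is a sketch, not GHC code: GCC and Clang predefine __AVX__ when AVX code generation is enabled (for example with -mavx), and the code picks a vector width based on it. The names VECTOR_BITS, doubles_per_vector and add_doubles are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* __AVX__ is predefined by GCC and Clang when AVX is enabled (e.g. -mavx).
   VECTOR_BITS is a hypothetical name chosen for this sketch. */
#ifdef __AVX__
#define VECTOR_BITS 256
#else
#define VECTOR_BITS 128
#endif

/* Hypothetical helper: how many doubles fit in one vector of VECTOR_BITS. */
size_t doubles_per_vector(void)
{
    return VECTOR_BITS / (8 * sizeof(double));
}

/* Hypothetical helper: element-wise add, walking the arrays in
   vector-sized chunks and finishing the remainder one element at a time. */
void add_doubles(const double *a, const double *b, double *out, size_t n)
{
    size_t lanes = doubles_per_vector();
    size_t i = 0;

    assert(lanes > 0);
    for (; i + lanes <= n; i += lanes)
        for (size_t j = 0; j < lanes; j++)
            out[i + j] = a[i + j] + b[i + j];
    for (; i < n; i++)
        out[i] = a[i] + b[i];
}
```

The same source builds either way; only the chunk size changes, which is roughly the effect of guarding 256-bit code paths behind the symbol.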
> Most programmers should use the Multi type family instead of working
> with primops (or their boxed wrappers) directly. For example, by using
> Multi Double instead of DoubleX2, the programmer will get 256-bit wide
> vectors on platforms that support AVX, and 128-bit wide vectors
> otherwise. See https://github.com/mainland/primitive for details.
> On 02/13/2013 07:44 AM, Simon Peyton-Jones wrote:
> > I believe Geoff is working on adding AVX. I expect he’d be interested
> > in your patches.
> > Simon
> > From: ghc-devs-bounces at haskell.org On Behalf Of Carter Schonwald
> > Sent: 13 February 2013 05:59
> > To: Michael Baikov
> > Cc: ghc-devs at haskell.org
> > Subject: Re: Vector primops sizes
> > Yes please! Having these (for valid target arches/CPU targets) would
> > be really, really valuable for me.
> > On Feb 13, 2013 12:07 AM, "Michael Baikov" <manpacket at gmail.com> wrote:
> >> The recently merged vector primops support only 16-byte operands: Int32
> >> x 4, Double x 2 and so on. Current AVX instructions support 256-bit
> >> operands, and with simple cut'n'paste work it's possible to support at
> >> least Double x 4 operands. I made those changes, and GHC generates
> >> (using LLVM) proper AVX code using ymm registers. It might also make
> >> sense to support primops for vector types larger than any currently
> >> supported primitive type. I have those changes in my branch as well,
> >> and LLVM generates pretty good code for them too. Those changes might be
> >> useful for providing access to the LLVM shufflevector instruction, or for
> >> writing high-performance processing of large vectors with less potential
> >> overhead.
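As a point of comparison outside GHC, the Double x 4 case can be sketched in C with the GCC/Clang vector_size extension. This is an assumption-laden illustration, not the patch discussed here: the type name double4 and the function add_double_x4 are made up. With AVX enabled (for example -mavx) the compiler can typically use a single ymm-register add; without it, the operation is lowered to narrower instructions.

```c
#include <assert.h>
#include <string.h>

/* GCC/Clang vector extension: a vector of four doubles (32 bytes, 256 bits).
   The type name double4 is made up for this sketch. */
typedef double double4 __attribute__((vector_size(32)));

/* Hypothetical helper: out[i] = a[i] + b[i] for i in 0..3, using one
   vector add. memcpy avoids alignment assumptions about the pointers. */
void add_double_x4(const double *a, const double *b, double *out)
{
    double4 va, vb, vr;

    assert(a != NULL && b != NULL && out != NULL);
    memcpy(&va, a, sizeof va);
    memcpy(&vb, b, sizeof vb);
    vr = va + vb;               /* element-wise add on all four lanes */
    memcpy(out, &vr, sizeof vr);
}
```

Calling add_double_x4 on two arrays of four doubles writes the four sums to out, whichever instruction set the compiler targets.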
> >> Do we want to support larger vectors directly, or should GHC be made
> >> smart enough to fuse operations on vector primops performed in
> >> parallel into larger vectors/registers for LLVM? Do we want to provide
> >> access to the LLVM shufflevector instruction?
> >> _______________________________________________
> >> ghc-devs mailing list
> >> ghc-devs at haskell.org <mailto:ghc-devs at haskell.org>
> >> http://www.haskell.org/mailman/listinfo/ghc-devs