LLVM calling convention for AVX2 and AVX512 registers

Wed Mar 15 19:37:20 UTC 2017

To reiterate: any automated lowering / shimming scheme will hurt any
serious user of SIMD who isn't treating it as some black box abstraction.
And those are the very users who are equipped to write / design libraries /
GHC improvements that let still *other* users pretend to have a mostly
decent black box abstraction. Our compiler engineering bandwidth is not
enough to start with any automagic in this problem domain that isn't
validated with a model implementation in user space.

On Wed, Mar 15, 2017 at 3:31 PM, Carter Schonwald <
carter.schonwald at gmail.com> wrote:

> Agreed. And the generic vector size stuff in LLVM is both pretty naive
> AND not the sane/tractable way to add SIMD support to the NCG.
>
> I'm totally OK with the available vector sizes depending on the
> target CPU or whatever. Operating systems have very sane errors for that
> sort of mishap.
>
> On Wed, Mar 15, 2017 at 3:29 PM, Edward Kmett <ekmett at gmail.com> wrote:
>
>> Currently if you try to use a DoubleX4# and don't have AVX2 turned on, it
>> deliberately crashes out during code generation, no? So this is very
>> deliberately *not* a problem with the current setup as I understand it.
>> It only becomes one if we reverse the decision and decide to add terribly
>> inefficient shims for this functionality at the primop level rather than
>> have a higher level make the right call to just not use functionality that
>> isn't present on the target platform.
>>
>> -Edward
>>
>>
>> On Wed, Mar 15, 2017 at 10:27 AM, Ben Gamari <ben at smart-cactus.org>
>> wrote:
>>
>>> Siddhanathan Shanmugam <siddhanathan+eml at gmail.com> writes:
>>>
>>> >> I would be happy to advise if you would like to pick this up.
>>> >
>>> > Thanks Ben!
>>> >
>>> >> This would mean that Haskell libraries compiled with different flags
>>> >> would not be ABI compatible.
>>> >
>>> > Wait, can we not maintain ABI compatibility if we limit the target
>>> > features using a compiler flag? Sometimes (for performance reasons)
>>> > it's reasonable to request the compiler to only generate SSE
>>> > instructions, even if AVX2 is available on the target. On GCC we can
>>> > use the flag -msse to do just that.
>>> >
>>> I think the reasoning here is the following (please excuse the rather
>>> contrived example): Consider a function f with two variants,
>>>
>>>     {-# OPTIONS_GHC -mavx #-}
>>>     module AvxImpl where
>>>     f :: DoubleX4# -> DoubleX4# -> Double
>>>
>>>     {-# OPTIONS_GHC -msse #-}
>>>     module SseImpl where
>>>     f :: DoubleX4# -> DoubleX4# -> Double
>>>
>>> If we allow GHC to pass arguments with SIMD registers we now have a bit
>>> of a conundrum: The calling convention for AvxImpl.f will require that
>>> we pass the two arguments in YMM registers, whereas the arguments of
>>> SseImpl.f will be passed via some other means (perhaps two pairs of XMM
>>> registers).
>>>
>>> In the C world this isn't a problem AFAIK since intrinsic types map
>>> directly to register classes. Consequently, I can look at a C
>>> declaration type,
>>>
>>>     double f(__m256 x, __m256 y);
>>>
>>> and tell you precisely the calling convention that would be used. In
>>> GHC, however, we have an abstract vector model and therefore the calling
>>> convention is determined by which ISA the compiler is targeting.
>>>
>>> I really don't know how to fix this "correctly". Currently we assume
>>> that there is a static mapping between STG registers and machine
>>> registers. Giving this up sounds quite painful.
>>>
>>> Cheers,
>>>
>>> - Ben
>>>
>>> _______________________________________________
>>> ghc-devs mailing list
>>> ghc-devs at haskell.org
>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>>>
>>>
>>
>
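
For readers following the thread, Ben's AvxImpl / SseImpl conundrum quoted
above can be sketched as a toy model in plain Haskell. This is only an
illustration under stated assumptions: the names Isa, Reg, argRegs and
abiCompatible are hypothetical and exist nowhere in GHC, and the "pair of XMM
registers" layout for the SSE case is the speculation from Ben's message, not
a description of what GHC or LLVM actually does.

```haskell
-- Toy model only: none of these names exist in GHC.

-- Vector ISA a module was compiled for (a stand-in for -msse / -mavx).
data Isa = Sse | Avx
  deriving (Eq, Show)

-- Machine register classes that could carry (part of) a vector argument.
data Reg = Xmm Int | Ymm Int
  deriving (Eq, Show)

-- Registers assumed to hold each of n DoubleX4# arguments.
-- A DoubleX4# is 256 bits wide: one YMM register under AVX, or
-- (hypothetically) a pair of 128-bit XMM registers under SSE.
argRegs :: Isa -> Int -> [[Reg]]
argRegs Avx n = [ [Ymm i] | i <- [0 .. n - 1] ]
argRegs Sse n = [ [Xmm (2 * i), Xmm (2 * i + 1)] | i <- [0 .. n - 1] ]

-- In this model a call is ABI compatible only when caller and callee
-- agree on where every argument lives.
abiCompatible :: Isa -> Isa -> Int -> Bool
abiCompatible caller callee n = argRegs caller n == argRegs callee n

main :: IO ()
main = do
  print (argRegs Avx 2)             -- layout expected by AvxImpl.f
  print (argRegs Sse 2)             -- layout expected by SseImpl.f
  print (abiCompatible Avx Sse 2)   -- caller and callee built with different flags
```

In this model the mixed Avx/Sse pairing is reported as incompatible because
the two sides disagree about argument locations, which is the situation a
static STG-to-machine-register mapping cannot express.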