<div dir="ltr">zooming out: <div><br></div><div>what *should* the new ABI be?</div><div><br></div><div>Ed was suggesting we make all 16 xmm/ymm registers (equivalently, the lower 16 zmm registers, depending on how they're being used) caller save.</div><div><br></div><div>(what about all 32 zmm registers? would they be float only, or also for ints/words? SIMD has lots of nice int support!)</div><div><br></div><div>a) if this doesn't cause any perf regressions, I've no objections</div><div><br></div><div>b) currently we only support passing floats/doubles and SIMD vectors; do we want to support int/word data there too? (or are the GPRs / general-purpose registers enough for those?)</div><div><br></div><div>c) other stuff I'm probably overlooking</div><div><br></div><div>d) let's do this!</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Mar 9, 2017 at 3:31 PM, Carter Schonwald <span dir="ltr"><<a href="mailto:carter.schonwald@gmail.com" target="_blank">carter.schonwald@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">the patch is still on Trac,<div><br></div><div><a href="https://ghc.haskell.org/trac/ghc/ticket/8033" target="_blank">https://ghc.haskell.org/trac/<wbr>ghc/ticket/8033</a><br></div><div><br></div><div>we need to make changes to both the 32-bit and 64-bit ABIs, and I think that's where I got stalled, for lack of feedback</div><div><br></div><div>that aside:</div><div><br></div><div>here's the original email thread on the llvm-commits list </div><div><a href="http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20130708/180264.html" target="_blank">http://lists.llvm.org/<wbr>pipermail/llvm-commits/Week-<wbr>of-Mon-20130708/180264.html</a><br></div><div><br></div><div>and there are links from there to the iteration on the test suite, plus the original patch</div><div><br></div><div>I'm more than happy to take a weekend to do the legwork; it was pretty fun last 
time.</div><div><br></div><div>BUT, we need to agree on which ABI to adopt, and make sure that those ABI changes don't create a performance regression for some unexpected reason.</div></div><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Mar 9, 2017 at 3:11 PM, Geoffrey Mainland <span dir="ltr"><<a href="mailto:mainland@apeiron.net" target="_blank">mainland@apeiron.net</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">We would need to get a patch to LLVM accepted to change the GHC calling<br>
convention.<br>
<br>
Now that we commit to a particular version of LLVM, this might be less<br>
of an issue than it once was, since we wouldn't have to support versions<br>
of LLVM that didn't support the new calling convention.<br>
<br>
So...how do we get a patch into LLVM? I believe I once had such a patch<br>
ready to go...I will dig around for it, but the change is very small and<br>
easily recreated.<br>
<br>
It would be even better if we could *also* teach the native back end<br>
about SSE instructions. Is there anyone who might be willing to work on<br>
that?<br>
<br>
Geoff<br>
<div class="m_6176870787595274040HOEnZb"><div class="m_6176870787595274040h5"><br>
On 3/9/17 2:30 PM, Edward Kmett wrote:<br>
> Back around 2013, Geoff raised a discussion about fixing up the GHC<br>
> ABI so that the LLVM calling convention could pass 256-bit vector<br>
> types in YMM (and, I suppose, now 512-bit vector types in ZMM).<br>
><br>
> As I recall, this was blocked by some short-term concerns about which<br>
> LLVM release was imminent or what have you. Four years on, the exact<br>
> same sort of arguments could be dredged up, yet in the meantime<br>
> nobody is really using those types for anything.<br>
><br>
> This still creates a pain point around trying to use these wide types<br>
> today. Spilling rather than passing them in registers adds a LOT of<br>
> overhead to any attempt to use them that virtually erases any benefit<br>
> to having them in the first place.<br>
><br>
> I started experimenting with writing some custom primops directly in<br>
> LLVM, so I could do meaningful amounts of work with our SIMD vector<br>
> types by banging out, in LLVM assembly, the code that we can't write<br>
> in Haskell directly, hoping I could trick LLVM into inlining it via<br>
> link-time optimization. But I'm basically dead in the water before I<br>
> even start, it seems, because of the overhead of our current calling<br>
> convention: if we're spilling the vectors, there is no way that<br>
> inlining / LTO could see through the spill and erase that call<br>
> entirely.<br>
><br>
> It is rather frustrating that I can't even cheat. =/<br>
><br>
> What do we need to do to finally fix this?<br>
><br>
> -Edward<br>
<br>
</div></div></blockquote></div><br></div>
</div></div></blockquote></div><br></div>
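<div><br></div><div>To make the pain point above concrete, here's a minimal sketch. It is plain Haskell, runnable with any GHC and no -fllvm; the strict product type V4 is a hypothetical boxed stand-in for the DoubleX4# primop type, since the real unboxed type needs the LLVM backend. With DoubleX4#, each NOINLINE call boundary below is exactly where the current convention would spill the vector to the stack instead of passing it in a YMM register:</div><div><br></div>

```haskell
module Main where

-- Hypothetical stand-in for the DoubleX4# primop type: four doubles
-- that, under the proposed ABI, would travel in a single YMM register.
data V4 = V4 !Double !Double !Double !Double

-- Lane-wise addition, the kind of operation SIMD hardware does in one
-- instruction. NOINLINE forces a real call, so the arguments and result
-- must cross the calling-convention boundary being debated in this thread.
addV4 :: V4 -> V4 -> V4
addV4 (V4 a b c d) (V4 e f g h) = V4 (a + e) (b + f) (c + g) (d + h)
{-# NOINLINE addV4 #-}

-- Horizontal sum of the four lanes.
sumV4 :: V4 -> Double
sumV4 (V4 a b c d) = a + b + c + d

main :: IO ()
main = print (sumV4 (addV4 (V4 1 2 3 4) (V4 5 6 7 8)))  -- prints 36.0
```

<div><br></div><div>The point of the NOINLINE pragma is to model Edward's situation: once the optimizer cannot inline across the call, the only thing that saves you from a stack round-trip is the calling convention itself.</div>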