<div dir="ltr">Ugh. I apparently had a misunderstanding about how that was compiled.<div><br></div><div>-Edward</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Mar 15, 2017 at 5:14 PM, Ben Gamari <span dir="ltr"><<a href="mailto:ben@smart-cactus.org" target="_blank">ben@smart-cactus.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">Edward Kmett <<a href="mailto:ekmett@gmail.com">ekmett@gmail.com</a>> writes:<br>
<br>
> Currently if you try to use a DoubleX4# and don't have AVX2 turned on, it<br>
> deliberately crashes out during code generation, no?<br>
<br>
</span>I may very well be missing something, but I don't believe this is true. This<br>
program compiles just fine with merely -fllvm -msse:<br>
<br>
{-# LANGUAGE MagicHash #-}<br>
{-# LANGUAGE UnboxedTuples #-}<br>
module Hi where<br>
import GHC.Prim<br>
import GHC.Float<br>
<br>
addIt :: DoubleX4# -> DoubleX4# -> DoubleX4#<br>
addIt x y = plusDoubleX4# x y<br>
{-# NOINLINE addIt #-}<br>
<br>
It produces the following assembly:<br>
<br>
movupd 0x10(%rbp),%xmm0<br>
movupd 0x0(%rbp),%xmm1<br>
movupd 0x30(%rbp),%xmm2<br>
movupd 0x20(%rbp),%xmm3<br>
addpd %xmm1,%xmm3<br>
addpd %xmm0,%xmm2<br>
movupd %xmm2,0x30(%rbp)<br>
movupd %xmm3,0x20(%rbp)<br>
mov 0x40(%rbp),%rax<br>
lea 0x20(%rbp),%rbp<br>
jmpq *%rax<br>
<br>
The reason for this is that GHC's LLVM code generator just blindly<br>
translates DoubleX4# to LLVM's <4 x double> type. LLVM itself then<br>
does whatever it can to produce the code we ask of it, even if the<br>
target doesn't natively support vectors of that width.<br>
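<br>
As a rough illustration of what the assembly above appears to be doing (two addpd instructions on 128-bit xmm registers standing in for one 256-bit add), here is a plain-Haskell model of that splitting. It is only a sketch: V2, V4, splitHalves, joinHalves, addV2 and addV4 are hypothetical names made up for this example, they are ordinary boxed types and functions rather than GHC's SIMD primops, and nothing here claims to mirror how LLVM implements the lowering internally.<br>

```haskell
module Main where

-- Hypothetical stand-ins for a 128-bit and a 256-bit vector of doubles.
-- These are ordinary boxed data types, not DoubleX2# / DoubleX4#.
data V2 = V2 Double Double deriving (Eq, Show)
data V4 = V4 Double Double Double Double deriving (Eq, Show)

-- One 2-wide add, playing the role of a single addpd on an xmm register.
addV2 :: V2 -> V2 -> V2
addV2 (V2 a b) (V2 c d) = V2 (a + c) (b + d)

-- Split a 4-wide vector into its low and high 2-wide halves.
splitHalves :: V4 -> (V2, V2)
splitHalves (V4 a b c d) = (V2 a b, V2 c d)

-- Rejoin a low and a high half into a 4-wide vector.
joinHalves :: (V2, V2) -> V4
joinHalves (V2 a b, V2 c d) = V4 a b c d

-- A 4-wide add expressed as two independent 2-wide adds,
-- one per half, mirroring the pair of addpd instructions.
addV4 :: V4 -> V4 -> V4
addV4 x y = joinHalves (addV2 xl yl, addV2 xh yh)
  where
    (xl, xh) = splitHalves x
    (yl, yh) = splitHalves y

main :: IO ()
main = print (addV4 (V4 1 2 3 4) (V4 10 20 30 40))
```

The result is the same as a lane-wise 4-wide add; only the number of machine-level operations differs, which is why the program still compiles and behaves correctly without wider vector registers.<br>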
<br>
Cheers,<br>
<br>
- Ben<br>
</blockquote></div><br></div>