[GHC] #12736: Calling a complex Haskell function (obtained via FFI wrapper function) from MSVC 64-bit C code (passed in as FunPtr) can leave SSE2 registers in the XMM6-XMM15 range modified

Tue Oct 18 21:41:22 UTC 2016

#12736: Calling a complex Haskell function (obtained via FFI wrapper function) from
MSVC 64-bit C code (passed in as FunPtr) can leave SSE2 registers in the
XMM6-XMM15 range modified
-------------------------------------+-------------------------------------
           Reporter:  bavism         |             Owner:
               Type:  bug            |            Status:  new
           Priority:  normal         |         Milestone:
          Component:  Compiler       |           Version:  7.10.3
  (FFI)                              |
           Keywords:                 |  Operating System:  Windows
  ffi,registers,sse2,clobber,xmm     |
       Architecture:  x86_64         |   Type of failure:  Incorrect result
  (amd64)                            |  at runtime
          Test Case:                 |        Blocked By:
  https://github.com/bavis-m/raycast |
           Blocking:                 |   Related Tickets:
Differential Rev(s):                 |         Wiki Page:
-------------------------------------+-------------------------------------
 According to [https://msdn.microsoft.com/en-us/library/9z1stfyw.aspx
 MSDN], the Microsoft x64 calling convention requires a called function to
 preserve the SSE2 registers XMM6-XMM15. The Haskell FFI can produce a
 function pointer via dynamic wrapper that, when called from MSVC x64 C
 code, does not preserve these registers, causing subsequent floating-point
 operations in the C code to produce incorrect results.

 I can reproduce this error in [https://github.com/bavis-m/raycast this
 project], a DOOM-style raycasting engine written in Haskell that imports
 a C DLL with glue for rendering and window management. The Haskell
 executable generates a FunPtr to a frame update function using the dynamic
 wrapper mechanism, and passes it to a long-lived C function that runs the
 update loop. Any time this update function is called from the C loop,
 subsequent floating-point operations produce incorrect results (in this
 case, the next operations compute a view matrix for the OpenGL window).

 The output on every frame showing the view matrix should be:
 {{{
 viewM: 0.003125 0.000000 0.000000 -1.000000
        0.000000 0.004167 0.000000 -1.000000
        0.000000 0.000000 1.000000 0.000000
        0.000000 0.000000 0.000000 1.000000
 }}}
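
 A minimal sketch of the C side of this setup, not taken from the project:
 the names update_fn, build_view, count_corrupted_frames and noop_update are
 hypothetical, and the matrix formula is an assumption (an orthographic
 view for a 640x480 window) chosen only because it yields the values shown
 above. The loop keeps floating-point values live across the callback, so
 an optimizing compiler may hold them in XMM6-XMM15; a callback that
 clobbers those registers could then show up as a mismatching matrix.

```c
/* Hypothetical callback type: stands in for the FunPtr passed in from Haskell. */
typedef void (*update_fn)(void);

/* Assumed orthographic view matrix for a w-by-h window, row-major.
   This formula is an illustration, not the project's actual code. */
static void build_view(double w, double h, double m[16])
{
    for (int i = 0; i < 16; ++i)
        m[i] = 0.0;
    m[0]  = 2.0 / w;  m[3] = -1.0;   /* row 0 */
    m[5]  = 2.0 / h;  m[7] = -1.0;   /* row 1 */
    m[10] = 1.0;                     /* row 2 */
    m[15] = 1.0;                     /* row 3 */
}

/* Calls update once per frame and rebuilds the matrix afterwards.
   w and h are live across the call, so the compiler may keep them in
   callee-saved XMM registers. Returns the number of frames whose matrix
   differs from a reference computed before any callback ran. */
int count_corrupted_frames(update_fn update, int frames, double w, double h)
{
    double expected[16], actual[16];
    int corrupted = 0;

    build_view(w, h, expected);
    for (int f = 0; f < frames; ++f) {
        update();                    /* in the ticket: the Haskell wrapper */
        build_view(w, h, actual);
        for (int i = 0; i < 16; ++i) {
            if (actual[i] != expected[i]) {  /* also true if actual[i] is NaN */
                ++corrupted;
                break;
            }
        }
    }
    return corrupted;
}

/* Well-behaved stand-in callback that touches no registers of interest. */
void noop_update(void) {}
```

 With a well-behaved callback, count_corrupted_frames(noop_update, 3,
 640.0, 480.0) returns 0. Whether a misbehaving callback is detected
 depends on the compiler's register allocation, so a zero result does not
 prove the registers were preserved.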

 Running the raycaster with the Release version of the DLL causes the value
 of this matrix to be corrupted. A patch is provided (stub.patch in the
 root folder) that turns the Haskell update function into an empty stub;
 with it applied, the program works. When stepping through the assembly
 code with this patch applied, I can see in the function prologue where the
 XMM registers are saved. Without the patch, these registers are not saved.
 Running the Debug version does not show this error, presumably because
 the register allocation is different.

 I have been attempting to create a much simpler test case to reveal this
 code-generation issue, but it has been difficult. Even seemingly trivial
 changes can cause the bug to not show up; it is clearly dependent on the
 register allocation used internally to produce the assembly code.

 Instructions for building the project are in the readme. (You will need
 the Haskell Stack Tool and Visual Studio 15.)

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/12736>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler