The Curious Case of T6084 -or- Register Confusion with LLVM

Simon Peyton Jones simonpj at microsoft.com
Thu Sep 21 07:40:57 UTC 2017


Moritz

Talk to Kavon.  He was thinking about passing a struct instead of a huge list of registers, and only initialising the live fields of the struct.  I don't know whether he ultimately discarded the idea, but it sounded promising.

Simon

|  -----Original Message-----
|  From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of Moritz
|  Angermann
|  Sent: 20 September 2017 10:45
|  To: GHC developers <ghc-devs at haskell.org>
|  Subject: The Curious Case of T6084 -or- Register Confusion with LLVM
|  
|  Hi *,
|  
|  TLDR: The LLVM backend might confuse floating registers in GHC.
|  
|  # Demo (Ticket #14251)
|  
|  Let Demo.hs be the following short program (a minor modification from
|  T6084):
|  ```
|  {-# LANGUAGE MagicHash, BangPatterns #-} module Main where
|  
|  import GHC.Exts
|  
|  {-# NOINLINE f #-}
|  f :: (Int# -> Float# -> Double# -> Float# -> Double# -> String) -> String f
|  g = g 3# 4.0# 5.0## 6.0# 6.9## ++ " World!"
|  
|  {-# NOINLINE q #-}
|  q :: Int# -> Float# -> Double# -> Float# -> Double# -> String q i j k l m =
|  "Hello " ++ show (F# l) ++ " " ++ show (D# m)
|  
|  main = putStrLn (f $ q)
|  ```
|  
|  What happens if we compile them with the NCG and LLVM?
|  
|  $ ghc -fasm -fforce-recomp Demo.hs -O2 -o Demo-ncg && ./Demo-ncg Hello 6.0
|  6.9 World!
|  
|  $ ghc -fllvm -fforce-recomp Demo.hs -O2 -o Demo-llvm && ./Demo-llvm Hello
|  4.0 5.0 World!
|  
|  # Discussion
|  
|  What is happening here?  The LLVM backend passes the registers in arguments,
|  which are then mapped to registers via the GHC calling convention we added
|  to LLVM.
|  
|  As the LLVM backend takes off from Cmm, we produce function that always hold
|  the always live registers (on x86_64 these are: Base, Sp, Hp, R1, R2, R3,
|  R4, R5, R6, SpLim) and appends those registers that are live throughout the
|  function call: in the case of `q` this is one Float and one Double register.
|  Let’s assume these are
|  F3 and D4.  Thus the function signature we generate looks like:
|  
|  ghccc void @q(i64*, i64*, i64*, i64, i64, i64, i64, i64, i64, i64, float,
|  double)
|  
|  And expect the passed arguments to represent the following registers:
|  
|                base, sp, hp, r1, r2, r3, r4, r5, r6, spLim, f3, d4
|  
|  as we found that f1 and d1 are not live.
|  
|  Yet, when we call `q` in the form of `g` in the body of `f`. We will pass it
|  14 arguments instead of 12.  To make this “typecheck” in LLVM, we
|  
|  @q' = bitcast @q to (i64*, i64*, i64*, i64, i64, i64, i64, i64, i64, i64,
|  float, double, float, double)
|  
|  and call @q’(base, sp, hp, r1, r2, r3, r4, r5, r6, spLim, f1, d2, f3, d4).
|  
|  at this point, we now assign f3 <- f1 and d4 <- d2; while silently ignoring
|  the passed arguments f3 and d4.
|  
|  (This is where my llvmng backend fell over, as it does not bitcast function
|  signatures but tries to unify them.)
|  
|  # Solution?
|  
|  Initially, Ben and I though we could simply always pass all registers as
|  arguments in LLVM and call it a day with the downside of create more verbose
|  but correct code.  As I found out, that comes with a few complications.  For
|  some reason, all active stg registers for my machine give me
|  
|    Base, Sp, Hp, R1, R2, R3, R4, R5, R6, SpLim,
|    F1, D1, F2, D2, F3, D3,
|    XMM1,XMM2,XMM3,XMM4,XMM5,XMM6,
|    YMM1,YMM2,YMM3,YMM4,YMM5,YMM6,
|    ZMM1,ZMM2,ZMM3,ZMM4,ZMM5,ZMM6
|  
|  I should not have the YMM*, and ZMM* registers as I don’t have any AVX nor
|  AVX512; that looks like only a patch away.  However we try to optimize our
|  register, such that we can pass up to six doubles or six floats or any
|  combination of both if needed in registers, without having to allocate them
|  on the stack, by assuming overlapping registers (See Note [Overlapping
|  global registers]).
|  
|  And as such a full function signature in LLVM would as opposed to one that’s
|  based on the “live” registers as we have right now, would consist of 12
|  float/double registers, and LLVM only maps 6.  My current idea is to, pass
|  only the explicit F1,D1,…,F3,D3 and try to disable the register overlapping
|  for LLVM.  This would probably force more floating values to be stack
|  allocated rather than passed via registers, but would likely guarantee that
|  the registers match up.  The other option I can think of is to define some
|  viertual generic floating registers in the llvm code gen: V1,…,V6 and then
|  perform something like
|  
|    F1 <- V1 as float
|    D1 <- V1 as double
|  
|  in the body of the function, while trying to use the `live` information at
|  the call site to decide which of F1 or D1 to pass as V1.
|  
|  Ideas?
|  
|  Cheers,
|   Moritz
|  
|  _______________________________________________
|  ghc-devs mailing list
|  ghc-devs at haskell.org
|  https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmail.haskell
|  .org%2Fcgi-bin%2Fmailman%2Flistinfo%2Fghc-
|  devs&data=02%7C01%7Csimonpj%40microsoft.com%7C89f152a90a1b43caa39408d5000c4b
|  8c%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636414975169335863&sdata=Dda
|  hjmHVAKIaK3YVrmX7lS8s3OswoeLoP5sDRV060eE%3D&reserved=0


More information about the ghc-devs mailing list