[Git][ghc/ghc][wip/supersven/riscv64-ncg] 16 commits: Trim trailing whitespace

Sven Tennie (@supersven) gitlab at gitlab.haskell.org
Wed Mar 6 18:56:36 UTC 2024



Sven Tennie pushed to branch wip/supersven/riscv64-ncg at Glasgow Haskell Compiler / GHC


Commits:
625493a4 by Sven Tennie at 2024-03-06T07:02:17+01:00
Trim trailing whitespace

- - - - -
30dfe442 by Sven Tennie at 2024-03-06T08:09:57+01:00
No sign extension attribute in RISCV instructions

RISCV takes the meaning of reduced instruction set much more serious:
One cannot sign extend an operant in the same instruction. Delete this
unused code - We're handling sign extension differently.

- - - - -
811be0cb by Sven Tennie at 2024-03-06T08:45:23+01:00
There are no shifts in RISCV instructions

Remove this unused code. RISCV does everything in small steps...

- - - - -
c50c232f by Sven Tennie at 2024-03-06T08:56:52+01:00
pprReg: Remove unused parameter

- - - - -
ae94eadd by Sven Tennie at 2024-03-06T10:21:25+01:00
Cleanup Cond

- - - - -
50dc60b3 by Sven Tennie at 2024-03-06T12:29:44+01:00
More cleanup

Mostly haddock, formatting, minor refactorings

- - - - -
4317a296 by Sven Tennie at 2024-03-06T14:47:23+01:00
Implement takeRegRegMoveInstr

- - - - -
6f258fb4 by Sven Tennie at 2024-03-06T15:17:01+01:00
Cleanup mkStackAllocInstr & mkStackDeallocInstr

- - - - -
f641e4fa by Sven Tennie at 2024-03-06T16:59:18+01:00
Move Reg definitions to Regs module

- - - - -
4a3cd3e2 by Sven Tennie at 2024-03-06T17:00:12+01:00
Typo

- - - - -
73195089 by Sven Tennie at 2024-03-06T17:00:37+01:00
Circumvent "incomplete pattern match" warning

- - - - -
d604f115 by Sven Tennie at 2024-03-06T17:01:16+01:00
Delete commented out / dead code

- - - - -
970df83d by Sven Tennie at 2024-03-06T17:07:54+01:00
Add TODOs

- - - - -
f6d1a418 by Sven Tennie at 2024-03-06T18:31:52+01:00
Reduce duplication in stack alloc / free

- - - - -
8d077038 by Sven Tennie at 2024-03-06T18:52:04+01:00
Delete unused instructions

As they are unused, we don't even know if using them would work at all.

- - - - -
f6f310ee by Sven Tennie at 2024-03-06T18:54:41+01:00
Lint

- - - - -


5 changed files:

- compiler/GHC/CmmToAsm/RV64/CodeGen.hs
- compiler/GHC/CmmToAsm/RV64/Cond.hs
- compiler/GHC/CmmToAsm/RV64/Instr.hs
- compiler/GHC/CmmToAsm/RV64/Ppr.hs
- compiler/GHC/CmmToAsm/RV64/Regs.hs


Changes:

=====================================
compiler/GHC/CmmToAsm/RV64/CodeGen.hs
=====================================
@@ -911,10 +911,10 @@ getRegister' config plat expr =
         -- SLT is the same. ULE, and ULT will not return true for NaN.
         -- This is a bit counter-intuitive. Don't let yourself be fooled by
         -- the S/U prefix for floats, it's only meaningful for integers.
-        MO_F_Ge w    -> floatCond w (\d x y -> unitOL $ annExpr expr (CSET d x y OGE))
-        MO_F_Le w    -> floatCond w (\d x y -> unitOL $ annExpr expr (CSET d x y OLE)) -- x <= y <=> y > x
-        MO_F_Gt w    -> floatCond w (\d x y -> unitOL $ annExpr expr (CSET d x y OGT))
-        MO_F_Lt w    -> floatCond w (\d x y -> unitOL $ annExpr expr (CSET d x y OLT)) -- x < y <=> y >= x
+        MO_F_Ge w    -> floatCond w (\d x y -> unitOL $ annExpr expr (CSET d x y FGE))
+        MO_F_Le w    -> floatCond w (\d x y -> unitOL $ annExpr expr (CSET d x y FLE)) -- x <= y <=> y > x
+        MO_F_Gt w    -> floatCond w (\d x y -> unitOL $ annExpr expr (CSET d x y FGT))
+        MO_F_Lt w    -> floatCond w (\d x y -> unitOL $ annExpr expr (CSET d x y FLT)) -- x < y <=> y >= x
 
         -- Bitwise operations
         MO_And   w -> bitOp w (\d x y -> unitOL $ annExpr expr (AND d x y))
@@ -1278,10 +1278,10 @@ genCondJump bid expr = do
           MO_F_Eq w -> fbcond w EQ
           MO_F_Ne w -> fbcond w NE
 
-          MO_F_Gt w -> fbcond w OGT
-          MO_F_Ge w -> fbcond w OGE
-          MO_F_Lt w -> fbcond w OLT
-          MO_F_Le w -> fbcond w OLE
+          MO_F_Gt w -> fbcond w FGT
+          MO_F_Ge w -> fbcond w FGE
+          MO_F_Lt w -> fbcond w FLT
+          MO_F_Le w -> fbcond w FLE
 
           MO_Eq w   -> sbcond w EQ
           MO_Ne w   -> sbcond w NE


=====================================
compiler/GHC/CmmToAsm/RV64/Cond.hs
=====================================
@@ -1,93 +1,58 @@
-module GHC.CmmToAsm.RV64.Cond  where
+module GHC.CmmToAsm.RV64.Cond where
 
 import GHC.Prelude hiding (EQ)
 
--- FIXME: These conditions originate from the Aarch64 backend.  I'm not even sure
--- we use all of them there. For RISCV we need to synthesize some of them, as
--- RISCV has a much more reduced (ha!) set of condtionals.
-
--- TODO: This appears to go a bit overboard? Maybe we should stick with what LLVM
--- settled on for fcmp?
--- false: always yields false, regardless of operands.
--- oeq: yields true if both operands are not a QNAN and op1 is equal to op2.
--- ogt: yields true if both operands are not a QNAN and op1 is greater than op2.
--- oge: yields true if both operands are not a QNAN and op1 is greater than or equal to op2.
--- olt: yields true if both operands are not a QNAN and op1 is less than op2.
--- ole: yields true if both operands are not a QNAN and op1 is less than or equal to op2.
--- one: yields true if both operands are not a QNAN and op1 is not equal to op2.
--- ord: yields true if both operands are not a QNAN.
--- ueq: yields true if either operand is a QNAN or op1 is equal to op2.
--- ugt: yields true if either operand is a QNAN or op1 is greater than op2.
--- uge: yields true if either operand is a QNAN or op1 is greater than or equal to op2.
--- ult: yields true if either operand is a QNAN or op1 is less than op2.
--- ule: yields true if either operand is a QNAN or op1 is less than or equal to op2.
--- une: yields true if either operand is a QNAN or op1 is not equal to op2.
--- uno: yields true if either operand is a QNAN.
--- true: always yields true, regardless of operands.
+-- | Condition codes.
 --
--- LLVMs icmp knows about:
--- eq: yields true if the operands are equal, false otherwise. No sign interpretation is necessary or performed.
--- ne: yields true if the operands are unequal, false otherwise. No sign interpretation is necessary or performed.
--- ugt: interprets the operands as unsigned values and yields true if op1 is greater than op2.
--- uge: interprets the operands as unsigned values and yields true if op1 is greater than or equal to op2.
--- ult: interprets the operands as unsigned values and yields true if op1 is less than op2.
--- ule: interprets the operands as unsigned values and yields true if op1 is less than or equal to op2.
--- sgt: interprets the operands as signed values and yields true if op1 is greater than op2.
--- sge: interprets the operands as signed values and yields true if op1 is greater than or equal to op2.
--- slt: interprets the operands as signed values and yields true if op1 is less than op2.
--- sle: interprets the operands as signed values and yields true if op1 is less than or equal to op2.
-
+-- Used in conditional branches and bit setters. According to the available
+-- instruction set, some conditions are encoded as their negated opposites. I.e.
+-- these are logical things that don't necessarily map 1:1 to hardware/ISA.
 data Cond
-    = ALWAYS -- b.al
-    | EQ     -- b.eq
-    | NE     -- b.ne
-    -- signed
-    | SLT    -- b.lt
-    | SLE    -- b.le
-    | SGE    -- b.ge
-    | SGT    -- b.gt
-    -- unsigned
-    | ULT    -- b.lo
-    | ULE    -- b.ls
-    | UGE    -- b.hs
-    | UGT    -- b.hi
-    -- ordered
-    | OLT    -- b.mi
-    | OLE    -- b.ls
-    | OGE    -- b.ge
-    | OGT    -- b.gt
-    -- unordered
-    | UOLT   -- b.lt
-    | UOLE   -- b.le
-    | UOGE   -- b.pl
-    | UOGT   -- b.hi
-    -- others
-    | NEVER  -- b.nv
-    | VS     -- oVerflow set
-    | VC     -- oVerflow clear
-    deriving (Eq, Show)
+  = -- | int and float
+    EQ
+  | -- | int and float
+    NE
+  | -- | signed less than
+    SLT
+  | -- | signed less than or equal
+    SLE
+  | -- | signed greater than or equal
+    SGE
+  | -- | signed greater than
+    SGT
+  | -- | unsigned less than
+    ULT
+  | -- | unsigned less than or equal
+    ULE
+  | -- | unsigned greater than or equal
+    UGE
+  | -- | unsigned greater than
+    UGT
+  | -- | floating point instruction @flt@
+    FLT
+  | -- | floating point instruction @fle@
+    FLE
+  | -- | floating point instruction @fge@
+    FGE
+  | -- | floating point instruction @fgt@
+    FGT
+  deriving (Eq, Show)
 
 -- | Negate a condition.
+--
+-- This is useful to e.g. construct far branches from usual branches.
 negateCond :: Cond -> Cond
-negateCond ALWAYS = NEVER
-negateCond NEVER  = ALWAYS
-negateCond EQ     = NE
-negateCond NE     = EQ
-negateCond SLT    = SGE
-negateCond SLE    = SGT
-negateCond SGE    = SLT
-negateCond SGT    = SLE
-negateCond ULT    = UGE
-negateCond ULE    = UGT
-negateCond UGE    = ULT
-negateCond UGT    = ULE
-negateCond OLT    = OGE
-negateCond OLE    = OGT
-negateCond OGE    = OLT
-negateCond OGT    = OLE
-negateCond UOLT   = UOGE
-negateCond UOLE   = UOGT
-negateCond UOGE   = UOLT
-negateCond UOGT   = UOLE
-negateCond VS     = VC
-negateCond VC     = VS
+negateCond EQ = NE
+negateCond NE = EQ
+negateCond SLT = SGE
+negateCond SLE = SGT
+negateCond SGE = SLT
+negateCond SGT = SLE
+negateCond ULT = UGE
+negateCond ULE = UGT
+negateCond UGE = ULT
+negateCond UGT = ULE
+negateCond FLT = FGE
+negateCond FLE = FGT
+negateCond FGE = FLT
+negateCond FGT = FLE


=====================================
compiler/GHC/CmmToAsm/RV64/Instr.hs
=====================================
@@ -28,7 +28,7 @@ import GHC.Types.Unique.Supply
 
 import GHC.Utils.Panic
 
-import Data.Maybe (fromMaybe)
+import Data.Maybe
 
 import GHC.Stack
 import qualified Data.List.NonEmpty as NE
@@ -92,7 +92,6 @@ regUsageOfInstr platform instr = case instr of
   MUL dst src1 src2        -> usage (regOp src1 ++ regOp src2, regOp dst)
   NEG dst src              -> usage (regOp src, regOp dst)
   SMULH dst src1 src2      -> usage (regOp src1 ++ regOp src2, regOp dst)
-  SMULL dst src1 src2      -> usage (regOp src1 ++ regOp src2, regOp dst)
   DIV dst src1 src2        -> usage (regOp src1 ++ regOp src2, regOp dst)
   REM dst src1 src2        -> usage (regOp src1 ++ regOp src2, regOp dst)
   REMU dst src1 src2       -> usage (regOp src1 ++ regOp src2, regOp dst)
@@ -113,12 +112,9 @@ regUsageOfInstr platform instr = case instr of
   LSL dst src1 src2        -> usage (regOp src1 ++ regOp src2, regOp dst)
   LSR dst src1 src2        -> usage (regOp src1 ++ regOp src2, regOp dst)
   MOV dst src              -> usage (regOp src, regOp dst)
-  MOVK dst src             -> usage (regOp src, regOp dst)
   -- ORI's third operand is always an immediate
   ORI dst src1 _           -> usage (regOp src1, regOp dst)
   XORI dst src1 _          -> usage (regOp src1, regOp dst)
-  ROR dst src1 src2        -> usage (regOp src1 ++ regOp src2, regOp dst)
-  TST src1 src2            -> usage (regOp src1 ++ regOp src2, [])
   -- 4. Branch Instructions ----------------------------------------------------
   J t                      -> usage (regTarget t, [])
   B t                      -> usage (regTarget t, [])
@@ -130,8 +126,6 @@ regUsageOfInstr platform instr = case instr of
   -- 5. Atomic Instructions ----------------------------------------------------
   -- 6. Conditional Instructions -----------------------------------------------
   CSET dst l r _           -> usage (regOp l ++ regOp r, regOp dst)
-  CBZ src _                -> usage (regOp src, [])
-  CBNZ src _               -> usage (regOp src, [])
   -- 7. Load and Store Instructions --------------------------------------------
   STR _ src dst            -> usage (regOp src ++ regOp dst, [])
   -- STLR _ src dst      L     -> usage (regOp src ++ regOp dst, [])
@@ -165,11 +159,8 @@ regUsageOfInstr platform instr = case instr of
         regAddr (AddrReg r1)       = [r1]
         regOp :: Operand -> [Reg]
         regOp (OpReg _ r1) = [r1]
-        regOp (OpRegExt _ r1 _ _) = [r1]
-        regOp (OpRegShift _ r1 _ _) = [r1]
         regOp (OpAddr a) = regAddr a
         regOp (OpImm _) = []
-        regOp (OpImmShift _ _ _) = []
         regTarget :: Target -> [Reg]
         regTarget (TBlock _) = []
         regTarget (TLabel _) = []
@@ -229,7 +220,6 @@ patchRegsOfInstr instr env = case instr of
     MUL o1 o2 o3   -> MUL (patchOp o1) (patchOp o2) (patchOp o3)
     NEG o1 o2      -> NEG (patchOp o1) (patchOp o2)
     SMULH o1 o2 o3 -> SMULH (patchOp o1) (patchOp o2)  (patchOp o3)
-    SMULL o1 o2 o3 -> SMULL (patchOp o1) (patchOp o2)  (patchOp o3)
     DIV o1 o2 o3   -> DIV (patchOp o1) (patchOp o2) (patchOp o3)
     REM o1 o2 o3   -> REM (patchOp o1) (patchOp o2) (patchOp o3)
     REMU o1 o2 o3  -> REMU (patchOp o1) (patchOp o2) (patchOp o3)
@@ -252,12 +242,9 @@ patchRegsOfInstr instr env = case instr of
     LSL o1 o2 o3   -> LSL  (patchOp o1) (patchOp o2) (patchOp o3)
     LSR o1 o2 o3   -> LSR  (patchOp o1) (patchOp o2) (patchOp o3)
     MOV o1 o2      -> MOV  (patchOp o1) (patchOp o2)
-    MOVK o1 o2     -> MOVK (patchOp o1) (patchOp o2)
     -- o3 cannot be a register for ORI (always an immediate)
     ORI o1 o2 o3   -> ORI  (patchOp o1) (patchOp o2) (patchOp o3)
     XORI o1 o2 o3  -> XORI  (patchOp o1) (patchOp o2) (patchOp o3)
-    ROR o1 o2 o3   -> ROR  (patchOp o1) (patchOp o2) (patchOp o3)
-    TST o1 o2      -> TST  (patchOp o1) (patchOp o2)
 
     -- 4. Branch Instructions --------------------------------------------------
     J t            -> J (patchTarget t)
@@ -270,8 +257,6 @@ patchRegsOfInstr instr env = case instr of
     -- 5. Atomic Instructions --------------------------------------------------
     -- 6. Conditional Instructions ---------------------------------------------
     CSET o l r c   -> CSET (patchOp o) (patchOp l) (patchOp r) c
-    CBZ o l        -> CBZ (patchOp o) l
-    CBNZ o l       -> CBNZ (patchOp o) l
     -- 7. Load and Store Instructions ------------------------------------------
     STR f o1 o2    -> STR f (patchOp o1) (patchOp o2)
     -- STLR f o1 o2   -> STLR f (patchOp o1) (patchOp o2)
@@ -293,8 +278,6 @@ patchRegsOfInstr instr env = case instr of
     where
         patchOp :: Operand -> Operand
         patchOp (OpReg w r) = OpReg w (env r)
-        patchOp (OpRegExt w r x s) = OpRegExt w (env r) x s
-        patchOp (OpRegShift w r m s) = OpRegShift w (env r) m s
         patchOp (OpAddr a) = OpAddr (patchAddr a)
         patchOp op = op
         patchTarget :: Target -> Target
@@ -304,53 +287,48 @@ patchRegsOfInstr instr env = case instr of
         patchAddr (AddrRegImm r1 i)  = AddrRegImm (env r1) i
         patchAddr (AddrReg r) = AddrReg (env r)
 --------------------------------------------------------------------------------
+
 -- | Checks whether this instruction is a jump/branch instruction.
+--
 -- One that can change the flow of control in a way that the
 -- register allocator needs to worry about.
 isJumpishInstr :: Instr -> Bool
 isJumpishInstr instr = case instr of
-    ANN _ i -> isJumpishInstr i
-    CBZ{} -> True
-    CBNZ{} -> True
-    J{} -> True
-    B{} -> True
-    B_FAR{} -> True
-    BL{} -> True
-    BCOND{} -> True
-    BCOND_FAR{} -> True
-    _ -> False
-
--- | Checks whether this instruction is a jump/branch instruction.
--- One that can change the flow of control in a way that the
--- register allocator needs to worry about.
+  ANN _ i -> isJumpishInstr i
+  J {} -> True
+  B {} -> True
+  B_FAR {} -> True
+  BL {} -> True
+  BCOND {} -> True
+  BCOND_FAR {} -> True
+  _ -> False
+
+-- | Get the `BlockId`s of the jump destinations (if any)
 jumpDestsOfInstr :: Instr -> [BlockId]
 jumpDestsOfInstr (ANN _ i) = jumpDestsOfInstr i
-jumpDestsOfInstr (CBZ _ t) = [ id | TBlock id <- [t]]
-jumpDestsOfInstr (CBNZ _ t) = [ id | TBlock id <- [t]]
 jumpDestsOfInstr (J t) = [id | TBlock id <- [t]]
 jumpDestsOfInstr (B t) = [id | TBlock id <- [t]]
 jumpDestsOfInstr (B_FAR t) = [t]
-jumpDestsOfInstr (BL t _ _) = [ id | TBlock id <- [t]]
-jumpDestsOfInstr (BCOND _ _ _ t) = [ id | TBlock id <- [t]]
-jumpDestsOfInstr (BCOND_FAR _ _ _ _ t) = [ id | TBlock id <- [t]]
+jumpDestsOfInstr (BL t _ _) = [id | TBlock id <- [t]]
+jumpDestsOfInstr (BCOND _ _ _ t) = [id | TBlock id <- [t]]
+jumpDestsOfInstr (BCOND_FAR _ _ _ _ t) = [id | TBlock id <- [t]]
 jumpDestsOfInstr _ = []
 
--- | Change the destination of this jump instruction.
+-- | Change the destination of this (potential) jump instruction.
+--
 -- Used in the linear allocator when adding fixup blocks for join
 -- points.
 patchJumpInstr :: Instr -> (BlockId -> BlockId) -> Instr
-patchJumpInstr instr patchF
-    = case instr of
-        ANN d i -> ANN d (patchJumpInstr i patchF)
-        CBZ r (TBlock bid) -> CBZ r (TBlock (patchF bid))
-        CBNZ r (TBlock bid) -> CBNZ r (TBlock (patchF bid))
-        J (TBlock bid) -> J (TBlock (patchF bid))
-        B (TBlock bid) -> B (TBlock (patchF bid))
-        B_FAR bid -> B_FAR (patchF bid)
-        BL (TBlock bid) ps rs -> BL (TBlock (patchF bid)) ps rs
-        BCOND c o1 o2 (TBlock bid) -> BCOND c o1 o2 (TBlock (patchF bid))
-        BCOND_FAR c o1 o2 b (TBlock bid) -> BCOND_FAR c o1 o2 b (TBlock (patchF bid))
-        _ -> panic $ "patchJumpInstr: " ++ instrCon instr
+patchJumpInstr instr patchF =
+  case instr of
+    ANN d i -> ANN d (patchJumpInstr i patchF)
+    J (TBlock bid) -> J (TBlock (patchF bid))
+    B (TBlock bid) -> B (TBlock (patchF bid))
+    B_FAR bid -> B_FAR (patchF bid)
+    BL (TBlock bid) ps rs -> BL (TBlock (patchF bid)) ps rs
+    BCOND c o1 o2 (TBlock bid) -> BCOND c o1 o2 (TBlock (patchF bid))
+    BCOND_FAR c o1 o2 b (TBlock bid) -> BCOND_FAR c o1 o2 b (TBlock (patchF bid))
+    _ -> panic $ "patchJumpInstr: " ++ instrCon instr
 
 -- -----------------------------------------------------------------------------
 -- Note [Spills and Reloads]
@@ -372,11 +350,11 @@ patchJumpInstr instr patchF
 mkSpillInstr ::
   HasCallStack =>
   NCGConfig ->
-  Reg -> -- register to spill
-  Int -> -- current stack delta
-  Int -> -- spill slot to use
+  Reg -> -- ^ register to spill
+  Int -> -- ^ current stack delta
+  Int -> -- ^ spill slot to use
   [Instr]
-mkSpillInstr config reg delta slot =
+mkSpillInstr _config reg delta slot =
   case off - delta of
     imm | fitsIn12bitImm imm -> [mkStrSpImm imm]
     imm ->
@@ -397,9 +375,9 @@ mkSpillInstr config reg delta slot =
 
 mkLoadInstr
    :: NCGConfig
-   -> Reg       -- register to load
-   -> Int       -- current stack delta
-   -> Int       -- spill slot to use
+   -> Reg       -- ^ register to load
+   -> Int       -- ^ current stack delta
+   -> Int       -- ^ spill slot to use
    -> [Instr]
 
 mkLoadInstr _config reg delta slot =
@@ -421,58 +399,77 @@ mkLoadInstr _config reg delta slot =
 
     off = spillSlotToOffset slot
 
-  --------------------------------------------------------------------------------
 -- | See if this instruction is telling us the current C stack delta
 takeDeltaInstr :: Instr -> Maybe Int
 takeDeltaInstr (ANN _ i) = takeDeltaInstr i
 takeDeltaInstr (DELTA i) = Just i
 takeDeltaInstr _         = Nothing
 
--- Not real instructions.  Just meta data
+-- | Not real instructions.  Just meta data
 isMetaInstr :: Instr -> Bool
-isMetaInstr instr
- = case instr of
-    ANN _ i     -> isMetaInstr i
-    COMMENT{}   -> True
-    MULTILINE_COMMENT{} -> True
-    LOCATION{}  -> True
-    LDATA{}     -> True
-    NEWBLOCK{}  -> True
-    DELTA{}     -> True
+isMetaInstr instr =
+  case instr of
+    ANN _ i -> isMetaInstr i
+    COMMENT {} -> True
+    MULTILINE_COMMENT {} -> True
+    LOCATION {} -> True
+    LDATA {} -> True
+    NEWBLOCK {} -> True
+    DELTA {} -> True
     PUSH_STACK_FRAME -> True
     POP_STACK_FRAME -> True
-    _           -> False
+    _ -> False
 
 -- | Copy the value in a register to another one.
+--
 -- Must work for all register classes.
 mkRegRegMoveInstr :: Reg -> Reg -> Instr
-mkRegRegMoveInstr src dst = ANN (text "Reg->Reg Move: " <> ppr src <> text " -> " <> ppr dst) $ MOV (OpReg W64 dst) (OpReg W64 src)
+mkRegRegMoveInstr src dst = ANN desc instr
+  where
+    desc = text "Reg->Reg Move: " <> ppr src <> text " -> " <> ppr dst
+    instr = MOV (operandFromReg dst) (operandFromReg src)
 
--- | Take the source and destination from this reg -> reg move instruction
--- or Nothing if it's not one
+-- | Take the source and destination from this (potential) reg -> reg move instruction
+--
+-- We have to be a bit careful here: A `MOV` can also mean an implicit
+-- conversion. This case is filtered out.
 takeRegRegMoveInstr :: Instr -> Maybe (Reg,Reg)
---takeRegRegMoveInstr (MOV (OpReg fmt dst) (OpReg fmt' src)) | fmt == fmt' = Just (src, dst)
+takeRegRegMoveInstr (MOV (OpReg fmt dst) (OpReg fmt' src)) | fmt == fmt' = pure (src, dst)
 takeRegRegMoveInstr _ = Nothing
 
 -- | Make an unconditional jump instruction.
 mkJumpInstr :: BlockId -> [Instr]
-mkJumpInstr id = [B (TBlock id)]
+mkJumpInstr = pure . B . TBlock
 
+-- | Decrement @sp@ to allocate stack space.
+--
+-- The stack grows downwards, so we decrement the stack pointer by @n@ (bytes).
+-- This is dual to `mkStackDeallocInstr`. @sp@ is the RISCV stack pointer, not
+-- to be confused with the STG stack pointer.
 mkStackAllocInstr :: Platform -> Int -> [Instr]
-mkStackAllocInstr platform n
-    | n == 0 = []
-    | n > 0 && fitsIn12bitImm n = [ ANN (text "Alloc More Stack") $ SUB sp sp (OpImm (ImmInt n)) ]
-    -- TODO: This case may be optimized with the IP register for large n-s
-    | n > 0 =  ANN (text "Alloc More Stack") (SUB sp sp (OpImm (ImmInt intMax12bit))) : mkStackAllocInstr platform (n - intMax12bit)
-mkStackAllocInstr _platform n = pprPanic "mkStackAllocInstr" (int n)
+mkStackAllocInstr _platform = moveSp . negate
 
+-- | Increment SP to deallocate stack space.
+--
+-- The stack grows downwards, so we increment the stack pointer by @n@ (bytes).
+-- This is dual to `mkStackAllocInstr`. @sp@ is the RISCV stack pointer, not to
+-- be confused with the STG stack pointer.
 mkStackDeallocInstr :: Platform -> Int -> [Instr]
-mkStackDeallocInstr platform n
-    | n == 0 = []
-    | n > 0 && fitsIn12bitImm n = [ ANN (text "Dealloc More Stack") $ ADD sp sp (OpImm (ImmInt n)) ]
-    -- TODO: This case may be optimized with the IP register for large n-s
-    | n > 0 =  ANN (text "Dealloc More Stack") (ADD sp sp (OpImm (ImmInt intMax12bit))) : mkStackDeallocInstr platform (n - intMax12bit)
-mkStackDeallocInstr _platform n = pprPanic "mkStackDeallocInstr" (int n)
+mkStackDeallocInstr _platform = moveSp
+
+moveSp :: Int -> [Instr]
+moveSp n
+  | n == 0 = []
+  | n /= 0 && fitsIn12bitImm n = pure . ANN desc $ ADD sp sp (OpImm (ImmInt n))
+  | otherwise =
+      -- This ends up in three effective instructions. We could get away with
+      -- two for intMax12bit < n < 3 * intMax12bit by recursing once. However,
+      -- this way is likely less surprising.
+      [ ANN desc (MOV ip (OpImm (ImmInt n))),
+        ADD sp sp ip
+      ]
+  where
+    desc = text "Move SP:" <+> int n
 
 --
 -- See Note [extra spill slots] in X86/Instr.hs
@@ -496,7 +493,7 @@ allocMoreStack platform slots proc@(CmmProc info lbl live (ListGraph code)) = do
       alloc   = mkStackAllocInstr   platform delta
       dealloc = mkStackDeallocInstr platform delta
 
-      retargetList = (zip entries (map mkBlockId uniqs))
+      retargetList = zip entries (map mkBlockId uniqs)
 
       new_blockmap :: LabelMap BlockId
       new_blockmap = mapFromList retargetList
@@ -520,8 +517,8 @@ allocMoreStack platform slots proc@(CmmProc info lbl live (ListGraph code)) = do
         where retarget b = fromMaybe b (mapLookup b new_blockmap)
 
       new_code = concatMap insert_stack_insn code
-    -- in
     return (CmmProc info lbl live (ListGraph new_code), retargetList)
+
 -- -----------------------------------------------------------------------------
 -- Machine's assembly language
 
@@ -623,8 +620,6 @@ data Instr
 
     -- TODO: Rename: MULH
     | SMULH Operand Operand Operand
-    | SMULL Operand Operand Operand
-
     | DIVU Operand Operand Operand -- rd = rn ÷ rm
 
     -- 2. Bit Manipulation Instructions ----------------------------------------
@@ -637,20 +632,16 @@ data Instr
     -- | AND Operand Operand Operand -- rd = rn & op2
     -- | ANDS Operand Operand Operand -- rd = rn & op2
     -- | ASR Operand Operand Operand -- rd = rn ≫ rm  or  rd = rn ≫ #i, i is 6 bits
+    -- TODO: unused
     | BIC Operand Operand Operand -- rd = rn & ~op2
+    -- TODO: unused
     | BICS Operand Operand Operand -- rd = rn & ~op2
     | XOR Operand Operand Operand -- rd = rn ⊕ op2
     -- | LSL Operand Operand Operand -- rd = rn ≪ rm  or rd = rn ≪ #i, i is 6 bits
     -- | LSR Operand Operand Operand -- rd = rn ≫ rm  or rd = rn ≫ #i, i is 6 bits
     | MOV Operand Operand -- rd = rn  or  rd = #i
-    | MOVK Operand Operand
-    -- | MOVN Operand Operand
-    -- | MOVZ Operand Operand
-    | ORN Operand Operand Operand -- rd = rn | ~op2
     | ORI Operand Operand Operand -- rd = rn | op2
     | XORI Operand Operand Operand -- rd = rn `xor` imm
-    | ROR Operand Operand Operand -- rd = rn ≫ rm  or  rd = rn ≫ #i, i is 6 bits
-    | TST Operand Operand -- rn & op2
     -- Load and stores.
     -- TODO STR/LDR might want to change to STP/LDP with XZR for the second register.
     -- | STR Format Operand Operand -- str Xn, address-mode // Xn -> *addr
@@ -664,10 +655,6 @@ data Instr
     -- This is a synthetic operation.
     | CSET Operand Operand Operand Cond   -- if(o2 cond o3) op <- 1 else op <- 0
 
-    -- TODO: Unused
-    | CBZ Operand Target  -- if op == 0, then branch.
-    -- TODO: Unused
-    | CBNZ Operand Target -- if op /= 0, then branch.
     -- Branching.
     -- TODO: Unused
     | J Target            -- like B, but only generated from genJump. Used to distinguish genJumps from others.
@@ -707,22 +694,18 @@ instrCon i =
       POP_STACK_FRAME{} -> "POP_STACK_FRAME"
       ADD{} -> "ADD"
       OR{} -> "OR"
-      -- CMN{} -> "CMN"
-      -- CMP{} -> "CMP"
       MUL{} -> "MUL"
       NEG{} -> "NEG"
       DIV{} -> "DIV"
       REM{} -> "REM"
       REMU{} -> "REMU"
       SMULH{} -> "SMULH"
-      SMULL{} -> "SMULL"
       SUB{} -> "SUB"
       DIVU{} -> "DIVU"
       SBFM{} -> "SBFM"
       UBFM{} -> "UBFM"
       UBFX{} -> "UBFX"
       AND{} -> "AND"
-      -- ANDS{} -> "ANDS"
       ASR{} -> "ASR"
       BIC{} -> "BIC"
       BICS{} -> "BICS"
@@ -730,22 +713,12 @@ instrCon i =
       LSL{} -> "LSL"
       LSR{} -> "LSR"
       MOV{} -> "MOV"
-      MOVK{} -> "MOVK"
-      ORN{} -> "ORN"
       ORI{} -> "ORI"
       XORI{} -> "ORI"
-      ROR{} -> "ROR"
-      TST{} -> "TST"
       STR{} -> "STR"
-      -- STLR{} -> "STLR"
       LDR{} -> "LDR"
       LDRU{} -> "LDRU"
-      -- LDAR{} -> "LDAR"
-      -- STP{} -> "STP"
-      -- LDP{} -> "LDP"
       CSET{} -> "CSET"
-      CBZ{} -> "CBZ"
-      CBNZ{} -> "CBNZ"
       J{} -> "J"
       B{} -> "B"
       B_FAR{} -> "B_FAR"
@@ -764,60 +737,12 @@ data Target
     | TLabel CLabel
     | TReg   Reg
 
-
--- Extension
--- {Unsigned|Signed}XT{Byte|Half|Word|Doube}
-data ExtMode
-    = EUXTB | EUXTH | EUXTW | EUXTX
-    | ESXTB | ESXTH | ESXTW | ESXTX
-    deriving (Eq, Show)
-
-data ShiftMode
-    = SLSL | SLSR | SASR | SROR
-    deriving (Eq, Show)
-
-
--- We can also add ExtShift to Extension.
--- However at most 3bits.
-type ExtShift = Int
--- at most 6bits
-type RegShift = Int
-
 data Operand
         = OpReg Width Reg            -- register
-        | OpRegExt Width Reg ExtMode ExtShift -- rm, <ext>[, <shift left>]
-        | OpRegShift Width Reg ShiftMode RegShift     -- rm, <shift>, <0-64>
         | OpImm Imm            -- immediate value
-        -- TODO: Does OpImmShift exist in RV64?
-        | OpImmShift Imm ShiftMode RegShift
         | OpAddr AddrMode       -- memory reference
         deriving (Eq, Show)
 
--- Note [The made-up RISCV64 IP register]
--- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
---
--- RISCV64 has no inter-procedural register in its ABI. However, we need one to
--- make register spills/loads to/from high number slots. I.e. slot numbers that
--- do not fit in a 12bit integer which is used as immediate in the arithmetic
--- operations. Thus, we're marking one additional register (x31) as permanently
--- non-free and call it IP.
---
--- IP can be used as temporary register in all operations. Just be aware that it
--- may be clobbered as soon as you loose direct control over it (i.e. using IP
--- by-passes the register allocation/spilling mechanisms.) It should be fine to
--- use it as temporary register in a MachOp translation as long as you don't
--- rely on its value beyond this limited scope.
---
--- X31 is a caller-saved register. I.e. there are no guarantees about what the
--- callee does with it. That's exactly what we want here.
-
-zeroReg, raReg, spMachReg, ipReg :: Reg
-zeroReg = regSingle 0
-raReg = regSingle 1
--- | Not to be confused with the `CmmReg` `spReg`
-spMachReg = regSingle 2
-ipReg = regSingle 31
-
 operandFromReg :: Reg -> Operand
 operandFromReg = OpReg W64
 
@@ -907,13 +832,6 @@ d29 = operandFromRegNo 61
 d30 = operandFromRegNo 62
 d31 = operandFromRegNo d31RegNo
 
-opRegSExt :: Width -> Reg -> Operand
-opRegSExt W64 r = OpRegExt W64 r ESXTX 0
-opRegSExt W32 r = OpRegExt W32 r ESXTW 0
-opRegSExt W16 r = OpRegExt W16 r ESXTH 0
-opRegSExt W8  r = OpRegExt W8  r ESXTB 0
-opRegSExt w  _r = pprPanic "opRegSExt" (ppr w)
-
 fitsIn12bitImm :: (Num a, Ord a) => a -> Bool
 fitsIn12bitImm off = off >= intMin12bit && off <= intMax12bit
 
@@ -937,7 +855,7 @@ isEncodeableInWidth = isNbitEncodeable . widthInBits
 -- Conditional branch instructions can target labels in a range of +/- 4 KiB.
 -- The assembler can transform this into a J instruction targeting +/- 1MiB.
 -- There are rare cases where this is not enough (e.g. the Happy-generated
--- @Parser.hs at .) We need to manually transfer these into register based jumps
+-- @Parser.hs at .) We need to manually transform these into register based jumps
 -- using @ip@ (register reserved for calculations.) The trick is to invert the
 -- condition, do a far jump in the fall-through case or a short jump when the
 -- (inverted) condition is true.
@@ -964,7 +882,7 @@ makeFarBranches info_env blocks
       | otherwise =
           annotate addr targetAddr orig
       where
-        Just targetAddr = lookupUFM blockAddressMap tgtBid
+        targetAddr = fromJust $ lookupUFM blockAddressMap tgtBid
     makeFar _bid addr orig@(B (TBlock tgtBid))
       | abs (addr - targetAddr) >= nearLimit =
           annotate addr targetAddr $
@@ -972,16 +890,16 @@ makeFarBranches info_env blocks
       | otherwise =
           annotate addr targetAddr orig
       where
-        Just targetAddr = lookupUFM blockAddressMap tgtBid
+        targetAddr = fromJust $ lookupUFM blockAddressMap tgtBid
     makeFar bid addr (ANN desc other) = ANN desc $ makeFar bid addr other
     makeFar _bid _ other = other
 
     -- 262144 (2^20 / 4) instructions are allowed; let's keep some distance, as
     -- we have pseudo-insns that are pretty-printed as multiple instructions,
     -- and it's just not worth the effort to calculate things exactly as linker
-    -- relaxations are applied later (optimizing away our flaws.) The
-    -- conservative guess here is that every instruction does not emit more than
-    -- two in the mean.
+    -- relaxations are applied later (optimizing away our flaws.) The educated
+    -- guess here is that every instruction does not emit more than two in the
+    -- mean.
     nearLimit = 131072 - mapSize info_env * maxRetInfoTableSizeW
 
     blockAddressMap = listToUFM $ zip (map blockId blocks) blockAddressList


=====================================
compiler/GHC/CmmToAsm/RV64/Ppr.hs
=====================================
@@ -1,6 +1,3 @@
-{-# OPTIONS_GHC -fno-warn-orphans #-}
-{-# LANGUAGE CPP #-}
-
 module GHC.CmmToAsm.RV64.Ppr (pprNatCmmDecl, pprInstr) where
 
 import GHC.Prelude hiding (EQ)
@@ -304,116 +301,93 @@ negOp (OpImm (ImmInt i)) = OpImm (ImmInt (negate i))
 negOp (OpImm (ImmInteger i)) = OpImm (ImmInteger (negate i))
 negOp op = pprPanic "RV64.negOp" (text $ show op)
 
--- TODO: Is this used in RISCV?!
-pprExt :: IsLine doc => ExtMode -> doc
-pprExt EUXTB = text "uxtb"
-pprExt EUXTH = text "uxth"
-pprExt EUXTW = text "uxtw"
-pprExt EUXTX = text "uxtx"
-pprExt ESXTB = text "sxtb"
-pprExt ESXTH = text "sxth"
-pprExt ESXTW = text "sxtw"
-pprExt ESXTX = text "sxtx"
-
--- TODO: Is this used in RISCV?!
-pprShift :: IsLine doc => ShiftMode -> doc
-pprShift SLSL = text "lsl"
-pprShift SLSR = text "lsr"
-pprShift SASR = text "asr"
-pprShift SROR = text "ror"
-
 pprOp :: IsLine doc => Platform -> Operand -> doc
 pprOp plat op = case op of
   OpReg w r           -> pprReg w r
-  OpRegExt w r x 0 -> pprReg w r <> comma <+> pprExt x
-  OpRegExt w r x i -> pprReg w r <> comma <+> pprExt x <> comma <+> char '#' <> int i
-  OpRegShift w r s i -> pprReg w r <> comma <+> pprShift s <+> char '#' <> int i
   OpImm im          -> pprIm plat im
-  OpImmShift im s i -> pprIm plat im <> comma <+> pprShift s <+> char '#' <> int i
   OpAddr (AddrRegImm r1 im) -> pprImm plat im <> char '(' <> pprReg W64 r1 <> char ')'
   OpAddr (AddrReg r1)       -> text "0(" <+> pprReg W64 r1 <+> char ')'
 
 pprReg :: forall doc. IsLine doc => Width -> Reg -> doc
 pprReg w r = case r of
-  RegReal    (RealRegSingle i) -> ppr_reg_no w i
+  RegReal    (RealRegSingle i) -> ppr_reg_no i
   -- virtual regs should not show up, but this is helpful for debugging.
   RegVirtual (VirtualRegI u)   -> text "%vI_" <> pprUniqueAlways u
   RegVirtual (VirtualRegF u)   -> text "%vF_" <> pprUniqueAlways u
   RegVirtual (VirtualRegD u)   -> text "%vD_" <> pprUniqueAlways u
-  _                            -> pprPanic "RiscV64.pprReg" (text $ show r)
+  _                            -> pprPanic "RiscV64.pprReg" (text (show r) <+> ppr w)
 
   where
-    -- TODO: Width is only used in error messages, so we could just remove it.
-    ppr_reg_no :: Width -> Int -> doc
+    ppr_reg_no :: Int -> doc
     -- General Purpose Registers
-    ppr_reg_no _ 0 = text "zero"
-    ppr_reg_no _ 1 = text "ra"
-    ppr_reg_no _ 2 = text "sp"
-    ppr_reg_no _ 3 = text "gp"
-    ppr_reg_no _ 4 = text "tp"
-    ppr_reg_no _ 5 = text "t0"
-    ppr_reg_no _ 6 = text "t1"
-    ppr_reg_no _ 7 = text "t2"
-    ppr_reg_no _ 8 = text "s0"
-    ppr_reg_no _ 9 = text "s1"
-    ppr_reg_no _ 10 = text "a0"
-    ppr_reg_no _ 11 = text "a1"
-    ppr_reg_no _ 12 = text "a2"
-    ppr_reg_no _ 13 = text "a3"
-    ppr_reg_no _ 14 = text "a4"
-    ppr_reg_no _ 15 = text "a5"
-    ppr_reg_no _ 16 = text "a6"
-    ppr_reg_no _ 17 = text "a7"
-    ppr_reg_no _ 18 = text "s2"
-    ppr_reg_no _ 19 = text "s3"
-    ppr_reg_no _ 20 = text "s4"
-    ppr_reg_no _ 21 = text "s5"
-    ppr_reg_no _ 22 = text "s6"
-    ppr_reg_no _ 23 = text "s7"
-    ppr_reg_no _ 24 = text "s8"
-    ppr_reg_no _ 25 = text "s9"
-    ppr_reg_no _ 26 = text "s10"
-    ppr_reg_no _ 27 = text "s11"
-    ppr_reg_no _ 28 = text "t3"
-    ppr_reg_no _ 29 = text "t4"
-    ppr_reg_no _ 30 = text "t5"
-    ppr_reg_no _ 31 = text "t6"
+    ppr_reg_no 0 = text "zero"
+    ppr_reg_no 1 = text "ra"
+    ppr_reg_no 2 = text "sp"
+    ppr_reg_no 3 = text "gp"
+    ppr_reg_no 4 = text "tp"
+    ppr_reg_no 5 = text "t0"
+    ppr_reg_no 6 = text "t1"
+    ppr_reg_no 7 = text "t2"
+    ppr_reg_no 8 = text "s0"
+    ppr_reg_no 9 = text "s1"
+    ppr_reg_no 10 = text "a0"
+    ppr_reg_no 11 = text "a1"
+    ppr_reg_no 12 = text "a2"
+    ppr_reg_no 13 = text "a3"
+    ppr_reg_no 14 = text "a4"
+    ppr_reg_no 15 = text "a5"
+    ppr_reg_no 16 = text "a6"
+    ppr_reg_no 17 = text "a7"
+    ppr_reg_no 18 = text "s2"
+    ppr_reg_no 19 = text "s3"
+    ppr_reg_no 20 = text "s4"
+    ppr_reg_no 21 = text "s5"
+    ppr_reg_no 22 = text "s6"
+    ppr_reg_no 23 = text "s7"
+    ppr_reg_no 24 = text "s8"
+    ppr_reg_no 25 = text "s9"
+    ppr_reg_no 26 = text "s10"
+    ppr_reg_no 27 = text "s11"
+    ppr_reg_no 28 = text "t3"
+    ppr_reg_no 29 = text "t4"
+    ppr_reg_no 30 = text "t5"
+    ppr_reg_no 31 = text "t6"
 
     -- Floating Point Registers
-    ppr_reg_no _ 32 = text "ft0"
-    ppr_reg_no _ 33 = text "ft1"
-    ppr_reg_no _ 34 = text "ft2"
-    ppr_reg_no _ 35 = text "ft3"
-    ppr_reg_no _ 36 = text "ft4"
-    ppr_reg_no _ 37 = text "ft5"
-    ppr_reg_no _ 38 = text "ft6"
-    ppr_reg_no _ 39 = text "ft7"
-    ppr_reg_no _ 40 = text "fs0"
-    ppr_reg_no _ 41 = text "fs1"
-    ppr_reg_no _ 42 = text "fa0"
-    ppr_reg_no _ 43 = text "fa1"
-    ppr_reg_no _ 44 = text "fa2"
-    ppr_reg_no _ 45 = text "fa3"
-    ppr_reg_no _ 46 = text "fa4"
-    ppr_reg_no _ 47 = text "fa5"
-    ppr_reg_no _ 48 = text "fa6"
-    ppr_reg_no _ 49 = text "fa7"
-    ppr_reg_no _ 50 = text "fs2"
-    ppr_reg_no _ 51 = text "fs3"
-    ppr_reg_no _ 52 = text "fs4"
-    ppr_reg_no _ 53 = text "fs5"
-    ppr_reg_no _ 54 = text "fs6"
-    ppr_reg_no _ 55 = text "fs7"
-    ppr_reg_no _ 56 = text "fs8"
-    ppr_reg_no _ 57 = text "fs9"
-    ppr_reg_no _ 58 = text "fs10"
-    ppr_reg_no _ 59 = text "fs11"
-    ppr_reg_no _ 60 = text "ft8"
-    ppr_reg_no _ 61 = text "ft9"
-    ppr_reg_no _ 62 = text "ft10"
-    ppr_reg_no _ 63 = text "ft11"
-
-    ppr_reg_no w i
+    ppr_reg_no 32 = text "ft0"
+    ppr_reg_no 33 = text "ft1"
+    ppr_reg_no 34 = text "ft2"
+    ppr_reg_no 35 = text "ft3"
+    ppr_reg_no 36 = text "ft4"
+    ppr_reg_no 37 = text "ft5"
+    ppr_reg_no 38 = text "ft6"
+    ppr_reg_no 39 = text "ft7"
+    ppr_reg_no 40 = text "fs0"
+    ppr_reg_no 41 = text "fs1"
+    ppr_reg_no 42 = text "fa0"
+    ppr_reg_no 43 = text "fa1"
+    ppr_reg_no 44 = text "fa2"
+    ppr_reg_no 45 = text "fa3"
+    ppr_reg_no 46 = text "fa4"
+    ppr_reg_no 47 = text "fa5"
+    ppr_reg_no 48 = text "fa6"
+    ppr_reg_no 49 = text "fa7"
+    ppr_reg_no 50 = text "fs2"
+    ppr_reg_no 51 = text "fs3"
+    ppr_reg_no 52 = text "fs4"
+    ppr_reg_no 53 = text "fs5"
+    ppr_reg_no 54 = text "fs6"
+    ppr_reg_no 55 = text "fs7"
+    ppr_reg_no 56 = text "fs8"
+    ppr_reg_no 57 = text "fs9"
+    ppr_reg_no 58 = text "fs10"
+    ppr_reg_no 59 = text "fs11"
+    ppr_reg_no 60 = text "ft8"
+    ppr_reg_no 61 = text "ft9"
+    ppr_reg_no 62 = text "ft10"
+    ppr_reg_no 63 = text "ft11"
+
+    ppr_reg_no i
          | i < 0 = pprPanic "Unexpected register number (min is 0)" (ppr w <+> int i)
          | i > 63 = pprPanic "Unexpected register number (max is 63)" (ppr w <+> int i)
          -- no support for widths > W64.
@@ -446,7 +420,6 @@ isImmZero (OpImm (ImmDouble 0)) = True
 isImmZero (OpImm (ImmInt 0)) = True
 isImmZero _ = False
 
-
 isLabel :: Target -> Bool
 isLabel (TBlock _) = True
 isLabel (TLabel _) = True
@@ -500,7 +473,6 @@ pprInstr platform instr = case instr of
     | isFloatOp o1 && isFloatOp o2 && isFloatOp o3 -> op3 (text "\tfmul." <> if isSingleOp o1 then text "s" else text "d") o1 o2 o3
     | otherwise -> op3 (text "\tmul") o1 o2 o3
   SMULH o1 o2 o3 -> op3 (text "\tmulh") o1 o2 o3
-  SMULL o1 o2 o3 -> op3 (text "\tsmull") o1 o2 o3
   NEG o1 o2 | isFloatOp o1 && isFloatOp o2 && isSingleOp o2 -> op2 (text "\tfneg.s") o1 o2
   NEG o1 o2 | isFloatOp o1 && isFloatOp o2 && isDoubleOp o2 -> op2 (text "\tfneg.d") o1 o2
   NEG o1 o2 -> op2 (text "\tneg") o1 o2
@@ -509,7 +481,7 @@ pprInstr platform instr = case instr of
     -> op3 (text "\tfdiv." <> if isSingleOp o1 then text "s" else text "d") o1 o2 o3
   DIV o1 o2 o3 -> op3 (text "\tdiv") o1 o2 o3
   REM o1 o2 o3 | isFloatOp o1 && isFloatOp o2 && isFloatOp o3
-    -> panic $ "pprInstr - REM not implemented for floats (yet)"
+    -> panic "pprInstr - REM not implemented for floats (yet)"
   REM o1 o2 o3 -> op3 (text "\trem") o1 o2 o3
   REMU o1 o2 o3 -> op3 (text "\tremu") o1 o2 o3
 
@@ -563,12 +535,8 @@ pprInstr platform instr = case instr of
         -- Surrender! Let the assembler figure out the right expressions with pseudo-op LI.
         -> lines_ [ text "\tli" <+> pprOp platform o1 <> comma <+>  pprOp platform o2 ]
     | otherwise                    -> op3 (text "\taddi") o1 o2 (OpImm (ImmInt 0))
-  MOVK o1 o2    -> op2 (text "\tmovk") o1 o2
-  ORN o1 o2 o3  -> op3 (text "\torn") o1 o2 o3
   ORI o1 o2 o3  -> op3 (text "\tori") o1 o2 o3
   XORI o1 o2 o3 -> op3 (text "\txori") o1 o2 o3
-  ROR o1 o2 o3  -> op3 (text "\tror") o1 o2 o3
-  TST o1 o2     -> op2 (text "\ttst") o1 o2
 
   -- 4. Branch Instructions ----------------------------------------------------
   J t             -> pprInstr platform (B t)
@@ -593,7 +561,7 @@ pprInstr platform instr = case instr of
   BCOND_FAR c l r b t | isLabel t ->
     lines_ [ text "\t" <> pprBcond (negateCond c) <+> pprOp platform l <> comma <+> pprOp platform r <> comma <+> getLabel platform b <> text "_end"
            , text "\tla" <+> pprOp platform ip <> comma <+> getLabel platform t
-           , text "\tjalr" <+> text "x0" <> comma <+> pprOp platform ip <> comma <+> text "0" 
+           , text "\tjalr" <+> text "x0" <> comma <+> pprOp platform ip <> comma <+> text "0"
            ]
 
   BCOND_FAR _ _ _ _ (TReg _)     -> panic "RV64.ppr: No conditional branching to registers!"
@@ -623,10 +591,10 @@ pprInstr platform instr = case instr of
     UGE -> lines_ [ sltuFor l r <+> pprOp platform o <> comma <+> pprOp platform l <> comma <+> pprOp platform r
                   , text "\txori" <+>  pprOp platform o <> comma <+> pprOp platform o <> comma <+> text "1" ]
     UGT -> lines_ [ sltuFor l r <+> pprOp platform o <> comma <+> pprOp platform r <> comma <+> pprOp platform l ]
-    OLT | isFloatOp l && isFloatOp r -> line $ binOp ("\tflt." ++ floatOpPrecision platform l r)
-    OLE | isFloatOp l && isFloatOp r -> line $ binOp ("\tfle." ++ floatOpPrecision platform l r)
-    OGT | isFloatOp l && isFloatOp r -> line $ binOp ("\tfgt." ++ floatOpPrecision platform l r)
-    OGE | isFloatOp l && isFloatOp r -> line $ binOp ("\tfge." ++ floatOpPrecision platform l r)
+    FLT | isFloatOp l && isFloatOp r -> line $ binOp ("\tflt." ++ floatOpPrecision platform l r)
+    FLE | isFloatOp l && isFloatOp r -> line $ binOp ("\tfle." ++ floatOpPrecision platform l r)
+    FGT | isFloatOp l && isFloatOp r -> line $ binOp ("\tfgt." ++ floatOpPrecision platform l r)
+    FGE | isFloatOp l && isFloatOp r -> line $ binOp ("\tfge." ++ floatOpPrecision platform l r)
     x  -> pprPanic "RV64.ppr: unhandled CSET conditional" (text (show x) <+> pprOp platform o <> comma <+> pprOp platform r <> comma <+> pprOp platform l)
     where
       subFor l r | (OpImm _) <- r = text "\taddi" <+> pprOp platform o <> comma <+> pprOp platform l <> comma <+> pprOp platform (negOp r)
@@ -641,14 +609,6 @@ pprInstr platform instr = case instr of
       binOp :: (IsLine doc) => String -> doc
       binOp op = text op <+> pprOp platform o <> comma <+> pprOp platform l <> comma <+> pprOp platform r
 
-  CBZ o (TBlock bid) -> line $ text "\tbeq x0, " <+> pprOp platform o <> comma <+> pprAsmLabel platform (mkLocalBlockLabel (getUnique bid))
-  CBZ o (TLabel lbl) -> line $ text "\tbeq x0, " <+> pprOp platform o <> comma <+> pprAsmLabel platform lbl
-  CBZ _ (TReg _)     -> panic "AArch64.ppr: No conditional (cbz) branching to registers!"
-
-  CBNZ o (TBlock bid) -> line $ text "\tbne x0, " <+> pprOp platform o <> comma <+> pprAsmLabel platform (mkLocalBlockLabel (getUnique bid))
-  CBNZ o (TLabel lbl) -> line $ text "\tbne x0, " <+> pprOp platform o <> comma <+> pprAsmLabel platform lbl
-  CBNZ _ (TReg _)     -> panic "AArch64.ppr: No conditional (cbnz) branching to registers!"
-
   -- 7. Load and Store Instructions --------------------------------------------
   -- NOTE: GHC may do whacky things where it only load the lower part of an
   --       address. Not observing the correct size when loading will lead
@@ -662,7 +622,7 @@ pprInstr platform instr = case instr of
 
   LDR _f o1 (OpImm (ImmIndex lbl off)) ->
     lines_ [ text "\tla" <+> pprOp platform o1 <> comma <+> pprAsmLabel platform lbl
-           , text "\taddi" <+> pprOp platform o1 <> comma <+> pprOp platform o1 <> comma <+> (int off)
+           , text "\taddi" <+> pprOp platform o1 <> comma <+> pprOp platform o1 <> comma <+> int off
            ]
 
   LDR _f o1 (OpImm (ImmCLbl lbl)) ->
@@ -739,19 +699,20 @@ floatOpPrecision _p l r | isFloatOp l && isFloatOp r && isSingleOp l && isSingle
 floatOpPrecision _p l r | isFloatOp l && isFloatOp r && isDoubleOp l && isDoubleOp r = "d" -- double precision
 floatOpPrecision p l r = pprPanic "Cannot determine floating point precission" (text "op1" <+> pprOp p l <+> text "op2" <+> pprOp p r)
 
-pprBcond :: IsLine doc => Cond -> doc
+pprBcond :: (IsLine doc) => Cond -> doc
 pprBcond c = text "b" <> pprCond c
-
-pprCond :: IsLine doc => Cond -> doc
-pprCond c = case c of
-    EQ  -> text "eq"
-    NE  -> text "ne"
-    SLT -> text "lt"
-    SLE -> text "le"
-    SGE -> text "ge"
-    SGT -> text "gt"
-    ULT -> text "ltu"
-    ULE -> text "leu"
-    UGE -> text "geu"
-    UGT -> text "gtu"
-    _   -> panic $ "RV64.ppr: unhandled BCOND conditional: " ++ show c
+  where
+    pprCond :: (IsLine doc) => Cond -> doc
+    pprCond c = case c of
+      EQ -> text "eq"
+      NE -> text "ne"
+      SLT -> text "lt"
+      SLE -> text "le"
+      SGE -> text "ge"
+      SGT -> text "gt"
+      ULT -> text "ltu"
+      ULE -> text "leu"
+      UGE -> text "geu"
+      UGT -> text "gtu"
+      -- BCOND cannot handle floating point comparisons / registers
+      _ -> panic $ "RV64.ppr: unhandled BCOND conditional: " ++ show c


=====================================
compiler/GHC/CmmToAsm/RV64/Regs.hs
=====================================
@@ -16,6 +16,8 @@ import GHC.Utils.Outputable
 import GHC.Utils.Panic
 import GHC.Platform
 
+-- * Registers
+
 -- | First integer register number. @zero@ register.
 x0RegNo :: RegNo
 x0RegNo = 0
@@ -67,6 +69,31 @@ fa7RegNo, d17RegNo :: RegNo
 d17RegNo = 49
 fa7RegNo = d17RegNo
 
+-- Note [The made-up RISCV64 IP register]
+-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+--
+-- RISCV64 has no inter-procedural register in its ABI. However, we need one to
+-- make register spills/loads to/from high number slots. I.e. slot numbers that
+-- do not fit in a 12bit integer which is used as immediate in the arithmetic
+-- operations. Thus, we're marking one additional register (x31) as permanently
+-- non-free and call it IP.
+--
+-- IP can be used as temporary register in all operations. Just be aware that it
+-- may be clobbered as soon as you loose direct control over it (i.e. using IP
+-- by-passes the register allocation/spilling mechanisms.) It should be fine to
+-- use it as temporary register in a MachOp translation as long as you don't
+-- rely on its value beyond this limited scope.
+--
+-- X31 is a caller-saved register. I.e. there are no guarantees about what the
+-- callee does with it. That's exactly what we want here.
+
+zeroReg, raReg, spMachReg, ipReg :: Reg
+zeroReg = regSingle x0RegNo
+raReg = regSingle 1
+-- | Not to be confused with the `CmmReg` `spReg`
+spMachReg = regSingle 2
+ipReg = regSingle ipRegNo
+
 -- | All machine register numbers.
 allMachRegNos :: [RegNo]
 allMachRegNos = intRegs ++ fpRegs
@@ -92,14 +119,18 @@ allGpArgRegs = map regSingle [a0RegNo .. a7RegNo]
 allFpArgRegs :: [Reg]
 allFpArgRegs = map regSingle [fa0RegNo .. fa7RegNo]
 
+-- * Addressing modes
+
 -- | Addressing modes
 data AddrMode
-  = AddrRegImm Reg Imm
-  | AddrReg Reg
+  = -- | A register plus some integer, e.g. @8(sp)@ or @-16(sp)@. The offset
+    -- needs to fit into 12bits.
+    AddrRegImm Reg Imm
+  | -- | A register
+    AddrReg Reg
   deriving (Eq, Show)
 
--- -----------------------------------------------------------------------------
--- Immediates
+-- * Immediates
 
 data Imm
   = ImmInt      Int



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/da053cea8d36f62887083575202a01da261dd492...f6f310eeebbd8b51ec51bf058517953daa2593e3

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/da053cea8d36f62887083575202a01da261dd492...f6f310eeebbd8b51ec51bf058517953daa2593e3
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20240306/cd01c4aa/attachment-0001.html>


More information about the ghc-commits mailing list