[Git][ghc/ghc][wip/angerman/riscv64-ncg] 21 commits: Pretty-print registers by their alias names
Moritz Angermann (@angerman)
gitlab at gitlab.haskell.org
Wed Jun 7 08:18:15 UTC 2023
Moritz Angermann pushed to branch wip/angerman/riscv64-ncg at Glasgow Haskell Compiler / GHC
Commits:
5d7d4217 by Sven Tennie at 2023-05-17T17:33:25+02:00
Pretty-print registers by their alias names
The alias name is easier to memorize and simplifies reasoning
about what's going on.
- - - - -
0484fa82 by Sven Tennie at 2023-05-18T12:10:35+02:00
Fix getAmode: Only signed 12bit immediates
The symptom to find this was a too big immediate in a LW instruction in
test arr020:
Error: illegal operands `lw t0,4016(t0)'
- - - - -
5545140f by Ben Gamari at 2023-05-18T12:56:34+02:00
hadrian: Pass CROSS_EMULATOR to runtests.py
- - - - -
ce78097c by Ben Gamari at 2023-05-18T12:56:34+02:00
testsuite/driver: Add basic support for testing cross-compilers
- - - - -
4f22557e by Sven Tennie at 2023-05-18T18:07:30+02:00
Add OR and ORI instructions
ORR doesn't exist on RISCV. OR with register load is used when the
immediate is too big for ORI (i.e. >12bits.)
- - - - -
b877aa85 by Sven Tennie at 2023-05-18T18:09:52+02:00
Refine TODO comment: Stack frame header size is 2 * 8 byte
The stack frame header should contain two registers: ra and previous fp
- - - - -
c8c7bce6 by Sven Tennie at 2023-05-18T19:36:56+02:00
Fix MOV with immediate
There are three cases:
- Fits in a 12bit immediate slot -> ADDI
- Fits in 32bit -> %hi / %lo piecewise loading
- Else: Let the assembler solve this issue for now, LI
- - - - -
4e60ab12 by Sven Tennie at 2023-05-19T10:24:07+02:00
Add DIV and REM
REM calculates the remainder and replaces the more complex logic copied
from AARCH64.
- - - - -
c3508989 by Sven Tennie at 2023-05-19T11:41:17+02:00
Fix: LDRB -> LB, LDRH -> LH
A simple translation of these instructions from ARM to RISCV.
Add panic-ing pattern matches to fetch the outstanding STR and LDR
cases.
- - - - -
c1413de1 by Sven Tennie at 2023-05-19T18:19:28+02:00
Implement MO_S_Shr and truncateReg
These store and load on the stack to move values in changed widths into
registers.
- - - - -
a08a160d by Sven Tennie at 2023-05-20T11:57:23+02:00
CmmInt 0 should refer to zero register
A constant 0 can always be taken from the zero register.
- - - - -
6c908960 by Sven Tennie at 2023-05-21T17:52:50+02:00
Fix signed shift right
This includes overhauling the sign extension and width truncation logic.
- - - - -
e1bce7ce by Sven Tennie at 2023-05-26T21:14:48+02:00
Replace SXTH & SXTB
Both do not exist on RISCV64. While touching the sign extension code,
also fix the integer calling convention in this sense and update the
sign extension note.
- - - - -
9c83e459 by Sven Tennie at 2023-05-26T22:18:42+02:00
Allow truncation to from smaller to larger Width
This is used as inverse of sign extension to 64bit at many places.
- - - - -
6418dd82 by Sven Tennie at 2023-05-27T09:21:41+02:00
Implement MO_NOT: Replace MVN
MVN does not exist in RV64. Replace it by pseudo-instr not's effective
assembly.
- - - - -
63358eb4 by Sven Tennie at 2023-05-27T10:29:00+02:00
Replace UXTB & UXTH, Fix UDIV
Replace UXTB and UXTB with truncateReg as these instructions do not
exist in RISCV64. UDIV is named DIVU in RISCV64.
- - - - -
b1489bbd by Sven Tennie at 2023-05-27T11:02:58+02:00
Implement XOR
Delete EOR which does not exist on RISCV64.
- - - - -
3ba71edc by Sven Tennie at 2023-05-27T11:14:05+02:00
Rename UDIV -> DIVU
That's how unsigned div is called on RISCV64. This should avoid confusion.
- - - - -
1f737e0a by Sven Tennie at 2023-05-27T11:24:04+02:00
Delete unused EON
It does not exist on RISCV64.
- - - - -
a9c3b295 by Sven Tennie at 2023-05-30T19:55:40+02:00
WIP: MO_S_MulMayOflo
- - - - -
51010f35 by Moritz Angermann at 2023-06-07T08:17:23+00:00
float: first stab at supporting float ins
- - - - -
13 changed files:
- compiler/GHC/CmmToAsm/RV64/CodeGen.hs
- compiler/GHC/CmmToAsm/RV64/Instr.hs
- compiler/GHC/CmmToAsm/RV64/Ppr.hs
- compiler/GHC/CmmToAsm/RV64/Regs.hs
- hadrian/src/Settings/Builders/RunTest.hs
- + tests/compiler/cmm/shift_right.cmm
- + tests/compiler/cmm/zero.cmm
- testsuite/driver/runtests.py
- testsuite/driver/testglobals.py
- testsuite/driver/testlib.py
- + testsuite/tests/codeGen/should_run/MulMayOflo.hs
- + testsuite/tests/codeGen/should_run/MulMayOflo_cmm.cmm
- testsuite/tests/codeGen/should_run/all.T
Changes:
=====================================
compiler/GHC/CmmToAsm/RV64/CodeGen.hs
=====================================
@@ -201,7 +201,7 @@ ann doc instr {- debugIsOn -} = ANN doc instr
-- forced until we actually force them, and without -dppr-debug they should
-- never end up being forced.
annExpr :: CmmExpr -> Instr -> Instr
-annExpr e instr {- debugIsOn -} = ANN (text . show $ e) instr
+annExpr e {- debugIsOn -} = ANN (text . show $ e)
-- annExpr e instr {- debugIsOn -} = ANN (pprExpr genericPlatform e) instr
-- annExpr _ instr = instr
{-# INLINE annExpr #-}
@@ -413,11 +413,11 @@ opRegWidth W16 = W32 -- w
opRegWidth W8 = W32 -- w
opRegWidth w = pprPanic "opRegWidth" (text "Unsupported width" <+> ppr w)
--- Note [Signed arithmetic on AArch64]
+-- Note [Signed arithmetic on RISCV64]
-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
--- Handling signed arithmetic on sub-word-size values on AArch64 is a bit
+-- Handling signed arithmetic on sub-word-size values on RISCV64 is a bit
-- tricky as Cmm's type system does not capture signedness. While 32-bit values
--- are fairly easy to handle due to AArch64's 32-bit instruction variants
+-- are fairly easy to handle due to RISCV64's 32-bit instruction variants
-- (denoted by use of %wN registers), 16- and 8-bit values require quite some
-- care.
--
@@ -447,6 +447,10 @@ opRegWidth w = pprPanic "opRegWidth" (text "Unsupported width" <+> ppr w)
-- requires no extension and no truncate since we can assume that
-- `c` is zero-extended.
--
+-- The "RISC-V Sign Extension Optimizations" LLVM tech talk presentation by
+-- Craig Topper covers possible future improvements
+-- (https://llvm.org/devmtg/2022-11/slides/TechTalk21-RISC-VSignExtensionOptimizations.pdf)
+--
-- TODO:
-- Don't use Width in Operands
-- Instructions should rather carry a RegWidth
@@ -499,7 +503,7 @@ getRegister' config plat expr
CmmLit lit
-> case lit of
- -- TODO handle CmmInt 0 specially, use wzr or xzr.
+ CmmInt 0 w -> pure $ Fixed (intFormat w) zero_reg nilOL
CmmInt i W8 | i >= 0 -> do
return (Any (intFormat W8) (\dst -> unitOL $ annExpr expr (MOV (OpReg W8 dst) (OpImm (ImmInteger (narrowU W8 i))))))
@@ -531,26 +535,16 @@ getRegister' config plat expr
CmmFloat _f W16 -> pprPanic "getRegister' (CmmLit:CmmFloat), no support for halfs" (pdoc plat expr)
CmmFloat f W32 -> do
let word = castFloatToWord32 (fromRational f) :: Word32
- half0 = fromIntegral (fromIntegral word :: Word16)
- half1 = fromIntegral (fromIntegral (word `shiftR` 16) :: Word16)
tmp <- getNewRegNat (intFormat W32)
return (Any (floatFormat W32) (\dst -> toOL [ annExpr expr
- $ MOV (OpReg W32 tmp) (OpImm (ImmInt half0))
- , MOVK (OpReg W32 tmp) (OpImmShift (ImmInt half1) SLSL 16)
+ $ MOV (OpReg W32 tmp) (OpImm (ImmInteger (fromIntegral word)))
, MOV (OpReg W32 dst) (OpReg W32 tmp)
]))
CmmFloat f W64 -> do
let word = castDoubleToWord64 (fromRational f) :: Word64
- half0 = fromIntegral (fromIntegral word :: Word16)
- half1 = fromIntegral (fromIntegral (word `shiftR` 16) :: Word16)
- half2 = fromIntegral (fromIntegral (word `shiftR` 32) :: Word16)
- half3 = fromIntegral (fromIntegral (word `shiftR` 48) :: Word16)
tmp <- getNewRegNat (intFormat W64)
return (Any (floatFormat W64) (\dst -> toOL [ annExpr expr
- $ MOV (OpReg W64 tmp) (OpImm (ImmInt half0))
- , MOVK (OpReg W64 tmp) (OpImmShift (ImmInt half1) SLSL 16)
- , MOVK (OpReg W64 tmp) (OpImmShift (ImmInt half2) SLSL 32)
- , MOVK (OpReg W64 tmp) (OpImmShift (ImmInt half3) SLSL 48)
+ $ MOV (OpReg W64 tmp) (OpImm (ImmInteger (fromIntegral word)))
, MOV (OpReg W64 dst) (OpReg W64 tmp)
]))
CmmFloat _f _w -> pprPanic "getRegister' (CmmLit:CmmFloat), unsupported float lit" (pdoc plat expr)
@@ -610,8 +604,9 @@ getRegister' config plat expr
MO_Not w -> return $ Any (intFormat w) $ \dst ->
let w' = opRegWidth w
in code `snocOL`
- MVN (OpReg w' dst) (OpReg w' reg) `appOL`
- truncateReg w' w dst -- See Note [Signed arithmetic on AArch64]
+ -- pseudo instruction `not` is `xori rd, rs, -1`
+ ann (text "not") (XORI (OpReg w' dst) (OpReg w' reg) (OpImm (ImmInt (-1)))) `appOL`
+ truncateReg w' w dst -- See Note [Signed arithmetic on RISCV64]
MO_S_Neg w -> negate code w reg
MO_F_Neg w -> return $ Any (floatFormat w) (\dst -> code `snocOL` NEG (OpReg w dst) (OpReg w reg))
@@ -655,14 +650,28 @@ getRegister' config plat expr
NEG (OpReg w' dst) (OpReg w' reg') `appOL`
truncateReg w' w dst
- ss_conv from to reg code =
- let w' = opRegWidth (max from to)
- in return $ Any (intFormat to) $ \dst ->
- code `snocOL`
- SBFM (OpReg w' dst) (OpReg w' reg) (OpImm (ImmInt 0)) (toImm (min from to)) `appOL`
- -- At this point an 8- or 16-bit value would be sign-extended
- -- to 32-bits. Truncate back down the final width.
- truncateReg w' to dst
+ ss_conv from to reg code | from == to =
+ pure $ Any (intFormat from) $ \dst ->
+ code `snocOL` (MOV (OpReg from dst) (OpReg from reg))
+ ss_conv from to reg code | from < to = do
+ pure $ Any (intFormat to) $ \dst ->
+ code
+ `appOL` signExtend from to reg dst
+ `appOL` truncateReg from to dst
+ ss_conv from to reg code | from > to =
+ pure $ Any (intFormat to) $ \dst ->
+ code
+ `appOL` toOL
+ [ ann
+ (text "narrow register signed" <+> ppr reg <+> ppr from <> text "->" <> ppr to)
+ (LSL (OpReg to dst) (OpReg from reg) (OpImm (ImmInt shift))),
+ -- signed right shift
+ ASR (OpReg to dst) (OpReg to dst) (OpImm (ImmInt shift))
+ ]
+ `appOL` truncateReg from to dst
+ where
+ -- Why -1? We need to shift out one more bit for the sign.
+ shift = 64 - (widthInBits from - widthInBits to - 1)
-- Dyadic machops:
--
@@ -690,14 +699,14 @@ getRegister' config plat expr
where w' = formatToWidth (cmmTypeFormat (cmmRegType plat reg))
r' = getRegisterReg plat reg
- CmmMachOp (MO_U_Quot w) [x, y] | w == W8 -> do
- (reg_x, _format_x, code_x) <- getSomeReg x
- (reg_y, _format_y, code_y) <- getSomeReg y
- return $ Any (intFormat w) (\dst -> code_x `appOL` code_y `snocOL` annExpr expr (UXTB (OpReg w reg_x) (OpReg w reg_x)) `snocOL` (UXTB (OpReg w reg_y) (OpReg w reg_y)) `snocOL` (UDIV (OpReg w dst) (OpReg w reg_x) (OpReg w reg_y)))
- CmmMachOp (MO_U_Quot w) [x, y] | w == W16 -> do
- (reg_x, _format_x, code_x) <- getSomeReg x
- (reg_y, _format_y, code_y) <- getSomeReg y
- return $ Any (intFormat w) (\dst -> code_x `appOL` code_y `snocOL` annExpr expr (UXTH (OpReg w reg_x) (OpReg w reg_x)) `snocOL` (UXTH (OpReg w reg_y) (OpReg w reg_y)) `snocOL` (UDIV (OpReg w dst) (OpReg w reg_x) (OpReg w reg_y)))
+ CmmMachOp (MO_U_Quot w) [x, y] | w == W8 || w == W16 -> do
+ (reg_x, format_x, code_x) <- getSomeReg x
+ (reg_y, format_y, code_y) <- getSomeReg y
+ return $ Any (intFormat w) (\dst -> code_x `appOL`
+ truncateReg (formatToWidth format_x) w reg_x `appOL`
+ code_y `appOL`
+ truncateReg (formatToWidth format_y) w reg_y `snocOL`
+ annExpr expr (DIVU (OpReg w dst) (OpReg w reg_x) (OpReg w reg_y)))
-- 2. Shifts. x << n, x >> n.
CmmMachOp (MO_Shl w) [x, (CmmLit (CmmInt n _))] | w == W32, 0 <= n, n < 32 -> do
@@ -707,46 +716,33 @@ getRegister' config plat expr
(reg_x, _format_x, code_x) <- getSomeReg x
return $ Any (intFormat w) (\dst -> code_x `snocOL` annExpr expr (LSL (OpReg w dst) (OpReg w reg_x) (OpImm (ImmInteger n))))
- CmmMachOp (MO_S_Shr w) [x, (CmmLit (CmmInt n _))] | w == W8, 0 <= n, n < 8 -> do
- (reg_x, _format_x, code_x) <- getSomeReg x
- return $ Any (intFormat w) (\dst -> code_x `snocOL` annExpr expr (SBFX (OpReg w dst) (OpReg w reg_x) (OpImm (ImmInteger n)) (OpImm (ImmInteger (8-n)))))
- CmmMachOp (MO_S_Shr w) [x, y] | w == W8 -> do
- (reg_x, _format_x, code_x) <- getSomeReg x
- (reg_y, _format_y, code_y) <- getSomeReg y
- return $ Any (intFormat w) (\dst -> code_x `appOL` code_y `snocOL` annExpr expr (SXTB (OpReg w reg_x) (OpReg w reg_x)) `snocOL` (ASR (OpReg w dst) (OpReg w reg_x) (OpReg w reg_y)))
-
- CmmMachOp (MO_S_Shr w) [x, (CmmLit (CmmInt n _))] | w == W16, 0 <= n, n < 16 -> do
- (reg_x, _format_x, code_x) <- getSomeReg x
- return $ Any (intFormat w) (\dst -> code_x `snocOL` annExpr expr (SBFX (OpReg w dst) (OpReg w reg_x) (OpImm (ImmInteger n)) (OpImm (ImmInteger (16-n)))))
- CmmMachOp (MO_S_Shr w) [x, y] | w == W16 -> do
- (reg_x, _format_x, code_x) <- getSomeReg x
+ CmmMachOp (MO_S_Shr w) [x, (CmmLit (CmmInt n _))] | fitsIn12bitImm n -> do
+ (reg_x, format_x, code_x) <- getSomeReg x
+ (reg_x', code_x') <- signExtendReg (formatToWidth format_x) w reg_x
+ return $ Any (intFormat w) (
+ \dst ->
+ code_x `appOL` code_x' `snocOL` annExpr expr (ASR (OpReg w dst) (OpReg w reg_x') (OpImm (ImmInteger n)))
+ )
+ CmmMachOp (MO_S_Shr w) [x, y] -> do
+ (reg_x, format_x, code_x) <- getSomeReg x
(reg_y, _format_y, code_y) <- getSomeReg y
- return $ Any (intFormat w) (\dst -> code_x `appOL` code_y `snocOL` annExpr expr (SXTH (OpReg w reg_x) (OpReg w reg_x)) `snocOL` (ASR (OpReg w dst) (OpReg w reg_x) (OpReg w reg_y)))
-
- CmmMachOp (MO_S_Shr w) [x, (CmmLit (CmmInt n _))] | w == W32, 0 <= n, n < 32 -> do
- (reg_x, _format_x, code_x) <- getSomeReg x
- return $ Any (intFormat w) (\dst -> code_x `snocOL` annExpr expr (ASR (OpReg w dst) (OpReg w reg_x) (OpImm (ImmInteger n))))
-
- CmmMachOp (MO_S_Shr w) [x, (CmmLit (CmmInt n _))] | w == W64, 0 <= n, n < 64 -> do
- (reg_x, _format_x, code_x) <- getSomeReg x
- return $ Any (intFormat w) (\dst -> code_x `snocOL` annExpr expr (ASR (OpReg w dst) (OpReg w reg_x) (OpImm (ImmInteger n))))
-
+ (reg_x', code_x') <- signExtendReg (formatToWidth format_x) w reg_x
+ return $ Any (intFormat w) (
+ \dst ->
+ code_x `appOL` code_x' `appOL` code_y `snocOL` annExpr expr (ASR (OpReg w dst) (OpReg w reg_x') (OpReg w reg_y))
+ )
CmmMachOp (MO_U_Shr w) [x, (CmmLit (CmmInt n _))] | w == W8, 0 <= n, n < 8 -> do
(reg_x, _format_x, code_x) <- getSomeReg x
return $ Any (intFormat w) (\dst -> code_x `snocOL` annExpr expr (UBFX (OpReg w dst) (OpReg w reg_x) (OpImm (ImmInteger n)) (OpImm (ImmInteger (8-n)))))
- CmmMachOp (MO_U_Shr w) [x, y] | w == W8 -> do
- (reg_x, _format_x, code_x) <- getSomeReg x
+ CmmMachOp (MO_U_Shr w) [x, y] | w == W8 || w == W16 -> do
+ (reg_x, format_x, code_x) <- getSomeReg x
(reg_y, _format_y, code_y) <- getSomeReg y
- return $ Any (intFormat w) (\dst -> code_x `appOL` code_y `snocOL` annExpr expr (UXTB (OpReg w reg_x) (OpReg w reg_x)) `snocOL` (ASR (OpReg w dst) (OpReg w reg_x) (OpReg w reg_y)))
+ return $ Any (intFormat w) (\dst -> code_x `appOL` code_y `appOL` truncateReg (formatToWidth format_x) w reg_x `snocOL` annExpr expr (ASR (OpReg w dst) (OpReg w reg_x) (OpReg w reg_y)))
CmmMachOp (MO_U_Shr w) [x, (CmmLit (CmmInt n _))] | w == W16, 0 <= n, n < 16 -> do
(reg_x, _format_x, code_x) <- getSomeReg x
return $ Any (intFormat w) (\dst -> code_x `snocOL` annExpr expr (UBFX (OpReg w dst) (OpReg w reg_x) (OpImm (ImmInteger n)) (OpImm (ImmInteger (16-n)))))
- CmmMachOp (MO_U_Shr w) [x, y] | w == W16 -> do
- (reg_x, _format_x, code_x) <- getSomeReg x
- (reg_y, _format_y, code_y) <- getSomeReg y
- return $ Any (intFormat w) (\dst -> code_x `appOL` code_y `snocOL` annExpr expr (UXTH (OpReg w reg_x) (OpReg w reg_x)) `snocOL` (ASR (OpReg w dst) (OpReg w reg_x) (OpReg w reg_y)))
CmmMachOp (MO_U_Shr w) [x, (CmmLit (CmmInt n _))] | w == W32, 0 <= n, n < 32 -> do
(reg_x, _format_x, code_x) <- getSomeReg x
@@ -757,13 +753,13 @@ getRegister' config plat expr
return $ Any (intFormat w) (\dst -> code_x `snocOL` annExpr expr (LSR (OpReg w dst) (OpReg w reg_x) (OpImm (ImmInteger n))))
-- 3. Logic &&, ||
- CmmMachOp (MO_And w) [(CmmReg reg), CmmLit (CmmInt n _)] | isBitMaskImmediate (fromIntegral n) ->
+ CmmMachOp (MO_And w) [(CmmReg reg), CmmLit (CmmInt n _)] | fitsIn12bitImm n ->
return $ Any (intFormat w) (\d -> unitOL $ annExpr expr (AND (OpReg w d) (OpReg w' r') (OpImm (ImmInteger n))))
where w' = formatToWidth (cmmTypeFormat (cmmRegType plat reg))
r' = getRegisterReg plat reg
- CmmMachOp (MO_Or w) [(CmmReg reg), CmmLit (CmmInt n _)] | isBitMaskImmediate (fromIntegral n) ->
- return $ Any (intFormat w) (\d -> unitOL $ annExpr expr (ORR (OpReg w d) (OpReg w' r') (OpImm (ImmInteger n))))
+ CmmMachOp (MO_Or w) [(CmmReg reg), CmmLit (CmmInt n _)] | fitsIn12bitImm n ->
+ return $ Any (intFormat w) (\d -> unitOL $ annExpr expr (ORI (OpReg w d) (OpReg w' r') (OpImm (ImmInteger n))))
where w' = formatToWidth (cmmTypeFormat (cmmRegType plat reg))
r' = getRegisterReg plat reg
@@ -881,22 +877,14 @@ getRegister' config plat expr
-- Signed multiply/divide
MO_Mul w -> intOp True w (\d x y -> unitOL $ MUL d x y)
MO_S_MulMayOflo w -> do_mul_may_oflo w x y
- MO_S_Quot w -> intOp True w (\d x y -> unitOL $ SDIV d x y)
-
- -- No native rem instruction. So we'll compute the following
- -- Rd <- Rx / Ry | 2 <- 7 / 3 -- SDIV Rd Rx Ry
- -- Rd' <- Rx - Rd * Ry | 1 <- 7 - 2 * 3 -- MSUB Rd' Rd Ry Rx
- -- | '---|----------------|---' |
- -- | '----------------|-------'
- -- '--------------------------'
+ MO_S_Quot w -> intOp True w (\d x y -> unitOL $ DIV d x y)
+
-- Note the swap in Rx and Ry.
- MO_S_Rem w -> withTempIntReg w $ \t ->
- intOp True w (\d x y -> toOL [ SDIV t x y, MSUB d t y x ])
+ MO_S_Rem w -> intOp True w (\d x y -> unitOL $ REM d x y)
-- Unsigned multiply/divide
- MO_U_Quot w -> intOp False w (\d x y -> unitOL $ UDIV d x y)
- MO_U_Rem w -> withTempIntReg w $ \t ->
- intOp False w (\d x y -> toOL [ UDIV t x y, MSUB d t y x ])
+ MO_U_Quot w -> intOp False w (\d x y -> unitOL $ DIVU d x y)
+ MO_U_Rem w -> intOp False w (\d x y -> unitOL $ REM d x y)
-- Signed comparisons -- see Note [CSET]
MO_S_Ge w -> intOp True w (\d x y -> toOL [ CSET d x y SGE ])
@@ -914,7 +902,7 @@ getRegister' config plat expr
MO_F_Add w -> floatOp w (\d x y -> unitOL $ ADD d x y)
MO_F_Sub w -> floatOp w (\d x y -> unitOL $ SUB d x y)
MO_F_Mul w -> floatOp w (\d x y -> unitOL $ MUL d x y)
- MO_F_Quot w -> floatOp w (\d x y -> unitOL $ SDIV d x y)
+ MO_F_Quot w -> floatOp w (\d x y -> unitOL $ DIV d x y)
-- Floating point comparison
MO_F_Eq w -> floatCond w (\d x y -> toOL [ CSET d x y EQ ])
@@ -932,8 +920,8 @@ getRegister' config plat expr
-- Bitwise operations
MO_And w -> bitOp w (\d x y -> unitOL $ AND d x y)
- MO_Or w -> bitOp w (\d x y -> unitOL $ ORR d x y)
- MO_Xor w -> bitOp w (\d x y -> unitOL $ EOR d x y)
+ MO_Or w -> bitOp w (\d x y -> unitOL $ OR d x y)
+ MO_Xor w -> bitOp w (\d x y -> unitOL $ XOR d x y)
MO_Shl w -> intOp False w (\d x y -> unitOL $ LSL d x y)
MO_U_Shr w -> intOp False w (\d x y -> unitOL $ LSR d x y)
MO_S_Shr w -> intOp True w (\d x y -> unitOL $ ASR d x y)
@@ -947,113 +935,187 @@ getRegister' config plat expr
where
isNbitEncodeable :: Int -> Integer -> Bool
isNbitEncodeable n i = let shift = n - 1 in (-1 `shiftL` shift) <= i && i < (1 `shiftL` shift)
- -- FIXME: These are wrong, they are for AArch64, not RISCV! I'm not even sure we need them for RISCV
- isBitMaskImmediate :: Integer -> Bool
- isBitMaskImmediate i = i `elem` [0b0000_0001, 0b0000_0010, 0b0000_0100, 0b0000_1000, 0b0001_0000, 0b0010_0000, 0b0100_0000, 0b1000_0000
- ,0b0000_0011, 0b0000_0110, 0b0000_1100, 0b0001_1000, 0b0011_0000, 0b0110_0000, 0b1100_0000
- ,0b0000_0111, 0b0000_1110, 0b0001_1100, 0b0011_1000, 0b0111_0000, 0b1110_0000
- ,0b0000_1111, 0b0001_1110, 0b0011_1100, 0b0111_1000, 0b1111_0000
- ,0b0001_1111, 0b0011_1110, 0b0111_1100, 0b1111_1000
- ,0b0011_1111, 0b0111_1110, 0b1111_1100
- ,0b0111_1111, 0b1111_1110
- ,0b1111_1111]
-
-- N.B. MUL does not set the overflow flag.
+ -- Return 0 when the operation cannot overflow, /= 0 otherwise
do_mul_may_oflo :: Width -> CmmExpr -> CmmExpr -> NatM Register
+ do_mul_may_oflo w _x _y | w > W64 = pprPanic "Cannot multiply larger than 64bit" (ppr w)
do_mul_may_oflo w at W64 x y = do
- (reg_x, _format_x, code_x) <- getSomeReg x
- (reg_y, _format_y, code_y) <- getSomeReg y
- lo <- getNewRegNat II64
- hi <- getNewRegNat II64
- return $ Any (intFormat w) (\dst ->
- code_x `appOL`
- code_y `snocOL`
- MUL (OpReg w lo) (OpReg w reg_x) (OpReg w reg_y) `snocOL`
- SMULH (OpReg w hi) (OpReg w reg_x) (OpReg w reg_y) `snocOL`
- CSET (OpReg w dst) (OpReg w hi) (OpRegShift w lo SASR 63) NE)
+ (reg_x, format_x, code_x) <- getSomeReg x
+ (reg_y, format_y, code_y) <- getSomeReg y
+ -- TODO: Can't we clobber reg_x and reg_y to save registers?
+ lo <- getNewRegNat II64
+ hi <- getNewRegNat II64
+ -- TODO: Overhaul CSET: 3rd operand isn't needed for SNEZ
+ let nonSense = OpImm (ImmInt 0)
+ pure $
+ Any
+ (intFormat w)
+ ( \dst ->
+ code_x
+ `appOL` signExtend (formatToWidth format_x) W64 reg_x reg_x
+ `appOL` code_y
+ `appOL` signExtend (formatToWidth format_y) W64 reg_x reg_y
+ `appOL` toOL
+ [ annExpr expr (SMULH (OpReg w hi) (OpReg w reg_x) (OpReg w reg_y)),
+ MUL (OpReg w lo) (OpReg w reg_x) (OpReg w reg_y),
+ ASR (OpReg w lo) (OpReg w reg_x) (OpImm (ImmInt (widthInBits W64 - 1))),
+ ann
+ (text "Set flag if result of MULH contains more than sign bits.")
+ (SUB (OpReg w hi) (OpReg w hi) (OpReg w lo)),
+ CSET (OpReg w dst) (OpReg w hi) nonSense NE
+ ]
+ )
do_mul_may_oflo w x y = do
- (reg_x, _format_x, code_x) <- getSomeReg x
- (reg_y, _format_y, code_y) <- getSomeReg y
- let tmp_w = case w of
- W32 -> W64
- W16 -> W32
- W8 -> W32
- _ -> panic "do_mul_may_oflo: impossible"
- -- This will hold the product
- tmp <- getNewRegNat (intFormat tmp_w)
- let ext_mode = case w of
- W32 -> ESXTW
- W16 -> ESXTH
- W8 -> ESXTB
- _ -> panic "do_mul_may_oflo: impossible"
- mul = case w of
- W32 -> SMULL
- W16 -> MUL
- W8 -> MUL
- _ -> panic "do_mul_may_oflo: impossible"
- return $ Any (intFormat w) (\dst ->
- code_x `appOL`
- code_y `snocOL`
- mul (OpReg tmp_w tmp) (OpReg w reg_x) (OpReg w reg_y) `snocOL`
- CSET (OpReg w dst) (OpReg tmp_w tmp) (OpRegExt tmp_w tmp ext_mode 0) NE)
+ (reg_x, format_x, code_x) <- getSomeReg x
+ (reg_y, format_y, code_y) <- getSomeReg y
+ let width_x = formatToWidth format_x
+ width_y = formatToWidth format_y
+ if w > width_x && w > width_y
+ then
+ pure $
+ Any
+ (intFormat w)
+ ( \dst ->
+ -- 8bit * 8bit cannot overflow 16bit
+ -- 16bit * 16bit cannot overflow 32bit
+ -- 32bit * 32bit cannot overflow 64bit
+ unitOL $ annExpr expr (ADD (OpReg w dst) zero (OpImm (ImmInt 0)))
+ )
+ else do
+ let use32BitMul = width_x <= W32 && width_y <= W32
+ nonSense = OpImm (ImmInt 0)
+ if use32BitMul
+ then do
+ narrowedReg <- getNewRegNat II64
+ pure $
+ Any
+ (intFormat w)
+ ( \dst ->
+ code_x
+ `appOL` signExtend (formatToWidth format_x) W32 reg_x reg_x
+ `appOL` code_y
+ `appOL` signExtend (formatToWidth format_y) W32 reg_y reg_y
+ `snocOL` annExpr expr (MUL (OpReg W32 dst) (OpReg W32 reg_x) (OpReg W32 reg_y))
+ `appOL` signExtend W32 w dst narrowedReg
+ `appOL` toOL
+ [ ann
+ (text "Check if the multiplied value fits in the narrowed register")
+ (SUB (OpReg w dst) (OpReg w dst) (OpReg w narrowedReg)),
+ CSET (OpReg w dst) (OpReg w dst) nonSense NE
+ ]
+ )
+ else do
+ -- TODO: Can't we clobber reg_x and reg_y to save registers?
+ lo <- getNewRegNat II64
+ hi <- getNewRegNat II64
+ narrowedLo <- getNewRegNat II64
+
+ -- TODO: Overhaul CSET: 3rd operand isn't needed for SNEZ
+ let nonSense = OpImm (ImmInt 0)
+ pure $
+ Any
+ (intFormat w)
+ ( \dst ->
+ code_x
+ `appOL` signExtend (formatToWidth format_x) W64 reg_x reg_x
+ `appOL` code_y
+ `appOL` signExtend (formatToWidth format_y) W64 reg_x reg_y
+ `appOL` toOL
+ [ annExpr expr (SMULH (OpReg w hi) (OpReg w reg_x) (OpReg w reg_y)),
+ MUL (OpReg w lo) (OpReg w reg_x) (OpReg w reg_y),
+ ASR (OpReg w lo) (OpReg w reg_x) (OpImm (ImmInt (widthInBits W64 - 1))),
+ ann
+ (text "Set flag if result of MULH contains more than sign bits.")
+ (SUB (OpReg w hi) (OpReg w hi) (OpReg w lo)),
+ CSET (OpReg w hi) (OpReg w hi) nonSense NE
+ ]
+ `appOL` signExtend W64 w lo narrowedLo
+ `appOL` toOL
+ [ ann
+ (text "Check if the multiplied value fits in the narrowed register")
+ (SUB (OpReg w narrowedLo) (OpReg w lo) (OpReg w narrowedLo)),
+ CSET (OpReg w narrowedLo) (OpReg w narrowedLo) nonSense NE,
+ ann
+ (text "Combine both overflow flags")
+ (OR (OpReg w dst) (OpReg w narrowedLo) (OpReg w hi))
+ ]
+ )
-- | Instructions to sign-extend the value in the given register from width @w@
-- up to width @w'@.
signExtendReg :: Width -> Width -> Reg -> NatM (Reg, OrdList Instr)
-signExtendReg w w' r =
- case w of
- W64 -> noop
- W32
- | w' == W32 -> noop
- | otherwise -> extend SXTH
- W16 -> extend SXTH
- W8 -> extend SXTB
- _ -> panic "intOp"
+signExtendReg w _w' r | w == W64 = pure (r, nilOL)
+signExtendReg w w' r = do
+ r' <- getNewRegNat (intFormat w')
+ let instrs = signExtend w w' r r'
+ pure (r', instrs)
+
+-- | Sign extends to 64bit, if needed
+--
+-- Source `Reg` @r@ stays untouched, while the conversion happens on destination
+-- `Reg` @r'@.
+signExtend :: Width -> Width -> Reg -> Reg -> OrdList Instr
+signExtend w w' _r _r' | w > w' = pprPanic "This is not a sign extension, but a truncation." $ ppr w <> text "->" <+> ppr w'
+signExtend w w' _r _r' | w > W64 || w' > W64 = pprPanic "Unexpected width (max is 64bit):" $ ppr w <> text "->" <+> ppr w'
+signExtend w w' r r' | w == W64 && w' == W64 && r == r' = nilOL
+signExtend w w' r r' | w == W64 && w' == W64 = unitOL $ MOV (OpReg w' r') (OpReg w r)
+signExtend w w' r r'
+ | w == W32 && w' == W64 =
+ unitOL $
+ ann
+ (text "sign-extend register (SEXT.W)" <+> ppr r <+> ppr w <> text "->" <> ppr w')
+ -- `ADDIW r r 0` is the pseudo-op SEXT.W
+ (ADD (OpReg w' r') (OpReg w r) (OpImm (ImmInt 0)))
+signExtend w w' r r' =
+ toOL
+ [ ann
+ (text "narrow register signed" <+> ppr r <> char ':' <> ppr w <> text "->" <> ppr r <> char ':' <> ppr w')
+ (LSL (OpReg w' r') (OpReg w r) (OpImm (ImmInt shift))),
+ -- signed (arithmetic) right shift
+ ASR (OpReg w' r') (OpReg w' r') (OpImm (ImmInt shift))
+ ]
where
- noop = return (r, nilOL)
- extend instr = do
- r' <- getNewRegNat II64
- return (r', unitOL $ instr (OpReg w' r') (OpReg w' r))
+ shift = 64 - widthInBits w
-- | Instructions to truncate the value in the given register from width @w@
--- down to width @w'@.
+-- to width @w'@.
+--
+-- In other words, it just cuts the width out of the register. N.B.: This
+-- ignores signedness (no sign extension takes place)!
truncateReg :: Width -> Width -> Reg -> OrdList Instr
+truncateReg _w w' _r | w' == W64 = nilOL
+truncateReg _w w' r | w' > W64 = pprPanic "Cannot truncate to width bigger than register size (max is 64bit):" $ text (show r) <> char ':' <+> ppr w'
+truncateReg w _w' r | w > W64 = pprPanic "Unexpected register size (max is 64bit):" $ text (show r) <> char ':' <+> ppr w
truncateReg w w' r =
- case w of
- W64 -> nilOL
- W32
- | w' == W32 -> nilOL
- _ -> unitOL $ UBFM (OpReg w r)
- (OpReg w r)
- (OpImm (ImmInt 0))
- (OpImm $ ImmInt $ widthInBits w' - 1)
+ toOL
+ [ ann
+ (text "truncate register" <+> ppr r <+> ppr w <> text "->" <> ppr w')
+ (LSL (OpReg w' r) (OpReg w r) (OpImm (ImmInt shift))),
+ -- SHL ignores signedness!
+ LSR (OpReg w' r) (OpReg w r) (OpImm (ImmInt shift))
+ ]
+ where
+ shift = 64 - widthInBits w'
-- -----------------------------------------------------------------------------
-- The 'Amode' type: Memory addressing modes passed up the tree.
data Amode = Amode AddrMode InstrBlock
+-- | Provide the value of a `CmmExpr` with an `Amode`
+--
+-- N.B. this function should be used to provide operands to load and store
+-- instructions with signed 12bit wide immediates (S & I types). For other
+-- immediate sizes and formats (e.g. B type uses multiples of 2) this function
+-- would need to be adjusted.
getAmode :: Platform
-> Width -- ^ width of loaded value
-> CmmExpr
-> NatM Amode
-- TODO: Specialize stuff we can destructure here.
--- OPTIMIZATION WARNING: Addressing modes.
--- Addressing options:
--- LDUR/STUR: imm9: -256 - 255
-getAmode platform _ (CmmRegOff reg off) | -256 <= off, off <= 255
- = return $ Amode (AddrRegImm reg' off') nilOL
- where reg' = getRegisterReg platform reg
- off' = ImmInt off
--- LDR/STR: imm12: if reg is 32bit: 0 -- 16380 in multiples of 4
-getAmode platform W32 (CmmRegOff reg off)
- | 0 <= off, off <= 16380, off `mod` 4 == 0
- = return $ Amode (AddrRegImm reg' off') nilOL
- where reg' = getRegisterReg platform reg
- off' = ImmInt off
--- LDR/STR: imm12: if reg is 64bit: 0 -- 32760 in multiples of 8
-getAmode platform W64 (CmmRegOff reg off)
- | 0 <= off, off <= 32760, off `mod` 8 == 0
+-- LDR/STR: Immediate can be represented with 12bits
+getAmode platform w (CmmRegOff reg off)
+ | w <= W64, fitsIn12bitImm off
= return $ Amode (AddrRegImm reg' off') nilOL
where reg' = getRegisterReg platform reg
off' = ImmInt off
@@ -1063,12 +1125,12 @@ getAmode platform W64 (CmmRegOff reg off)
-- E.g. a CmmStoreOff really. This can be translated to `str $expr2, [$expr, #n ]
-- for `n` in range.
getAmode _platform _ (CmmMachOp (MO_Add _w) [expr, CmmLit (CmmInt off _w')])
- | -256 <= off, off <= 255
+ | fitsIn12bitImm off
= do (reg, _format, code) <- getSomeReg expr
return $ Amode (AddrRegImm reg (ImmInteger off)) code
getAmode _platform _ (CmmMachOp (MO_Sub _w) [expr, CmmLit (CmmInt off _w')])
- | -256 <= -off, -off <= 255
+ | fitsIn12bitImm (-off)
= do (reg, _format, code) <- getSomeReg expr
return $ Amode (AddrRegImm reg (ImmInteger (-off))) code
@@ -1167,25 +1229,39 @@ genCondJump bid expr = do
let ubcond w cmp = do
-- compute both sides.
- (reg_x, _format_x, code_x) <- getSomeReg x
- (reg_y, _format_y, code_y) <- getSomeReg y
+ (reg_x, format_x, code_x) <- getSomeReg x
+ (reg_y, format_y, code_y) <- getSomeReg y
let x' = OpReg w reg_x
y' = OpReg w reg_y
return $ case w of
- W8 -> code_x `appOL` code_y `appOL` toOL [ UXTB x' x', UXTB y' y', (annExpr expr (BCOND cmp x' y' (TBlock bid))) ]
- W16 -> code_x `appOL` code_y `appOL` toOL [ UXTH x' x', UXTH y' y', (annExpr expr (BCOND cmp x' y' (TBlock bid))) ]
- _ -> code_x `appOL` code_y `appOL` toOL [ (annExpr expr (BCOND cmp x' y' (TBlock bid))) ]
+ w | w == W8 || w == W16 -> code_x `appOL`
+ truncateReg (formatToWidth format_x) w reg_x `appOL`
+ code_y `appOL`
+ truncateReg (formatToWidth format_y) w reg_y `appOL`
+ code_y `snocOL`
+ annExpr expr (BCOND cmp x' y' (TBlock bid))
+ _ -> code_x `appOL` code_y `snocOL` annExpr expr (BCOND cmp x' y' (TBlock bid))
sbcond w cmp = do
- -- compute both sides.
- (reg_x, _format_x, code_x) <- getSomeReg x
- (reg_y, _format_y, code_y) <- getSomeReg y
- let x' = OpReg w reg_x
- y' = OpReg w reg_y
- return $ case w of
- W8 -> code_x `appOL` code_y `appOL` toOL [ SXTB x' x', SXTB y' y', (annExpr expr (BCOND cmp x' y' (TBlock bid))) ]
- W16 -> code_x `appOL` code_y `appOL` toOL [ SXTH x' x', SXTH y' y', (annExpr expr (BCOND cmp x' y' (TBlock bid))) ]
- _ -> code_x `appOL` code_y `appOL` toOL [ (annExpr expr (BCOND cmp x' y' (TBlock bid))) ]
+ -- compute both sides.
+ (reg_x, format_x, code_x) <- getSomeReg x
+ (reg_y, format_y, code_y) <- getSomeReg y
+ let x' = OpReg w reg_x
+ y' = OpReg w reg_y
+ return $ case w of
+ W8 ->
+ code_x
+ `appOL` signExtend (formatToWidth format_x) W64 reg_x reg_x
+ `appOL` code_y
+ `appOL` signExtend (formatToWidth format_y) W64 reg_y reg_y
+ `appOL` unitOL (annExpr expr (BCOND cmp x' y' (TBlock bid)))
+ W16 ->
+ code_x
+ `appOL` signExtend (formatToWidth format_x) W64 reg_x reg_x
+ `appOL` code_y
+ `appOL` signExtend (formatToWidth format_y) W64 reg_y reg_y
+ `appOL` unitOL (annExpr expr (BCOND cmp x' y' (TBlock bid)))
+ _ -> code_x `appOL` code_y `appOL` unitOL (annExpr expr (BCOND cmp x' y' (TBlock bid)))
fbcond w cmp = do
-- ensure we get float regs
@@ -1617,25 +1693,15 @@ genCCall target dest_regs arg_regs bid = do
--
-- Still have GP regs, and we want to pass an GP argument.
-
passArguments pack (gpReg:gpRegs) fpRegs ((r, format, hint, code_r):args) stackSpace accumRegs accumCode | isIntFormat format = do
platform <- getPlatform
+ -- RISCV64 Integer Calling Convention: "When passed in registers or on the
+ -- stack, integer scalars narrower than XLEN bits are widened according to
+ -- the sign of their type up to 32 bits, then sign-extended to XLEN bits."
let w = formatToWidth format
- mov
- -- Specifically, Darwin/AArch64's ABI requires that the caller
- -- sign-extend arguments which are smaller than 32-bits.
- | w < W32
- , platformCConvNeedsExtension platform
- , SignedHint <- hint
- = case w of
- W8 -> SXTB (OpReg W64 gpReg) (OpReg w r)
- W16 -> SXTH (OpReg W64 gpReg) (OpReg w r)
- _ -> panic "impossible"
- | otherwise
- = MOV (OpReg w gpReg) (OpReg w r)
accumCode' = accumCode `appOL`
- code_r `snocOL`
- ann (text "Pass gp argument: " <> ppr r) mov
+ code_r `appOL`
+ signExtend w W64 r gpReg
passArguments pack gpRegs fpRegs args stackSpace (gpReg:accumRegs) accumCode'
-- Still have FP regs, and we want to pass an FP argument.
=====================================
compiler/GHC/CmmToAsm/RV64/Instr.hs
=====================================
@@ -32,7 +32,7 @@ import Data.Maybe (fromMaybe)
import GHC.Stack
--- | TODO: verify this!
+-- | TODO: Should be `2 * spillSlotSize = 16`
stackFrameHeaderSize :: Platform -> Int
stackFrameHeaderSize _ = 64
@@ -82,37 +82,33 @@ regUsageOfInstr platform instr = case instr of
ADD dst src1 src2 -> usage (regOp src1 ++ regOp src2, regOp dst)
-- CMN l r -> usage (regOp l ++ regOp r, [])
-- CMP l r -> usage (regOp l ++ regOp r, [])
- MSUB dst src1 src2 src3 -> usage (regOp src1 ++ regOp src2 ++ regOp src3, regOp dst)
MUL dst src1 src2 -> usage (regOp src1 ++ regOp src2, regOp dst)
NEG dst src -> usage (regOp src, regOp dst)
SMULH dst src1 src2 -> usage (regOp src1 ++ regOp src2, regOp dst)
SMULL dst src1 src2 -> usage (regOp src1 ++ regOp src2, regOp dst)
- SDIV dst src1 src2 -> usage (regOp src1 ++ regOp src2, regOp dst)
+ DIV dst src1 src2 -> usage (regOp src1 ++ regOp src2, regOp dst)
+ REM dst src1 src2 -> usage (regOp src1 ++ regOp src2, regOp dst)
SUB dst src1 src2 -> usage (regOp src1 ++ regOp src2, regOp dst)
- UDIV dst src1 src2 -> usage (regOp src1 ++ regOp src2, regOp dst)
+ DIVU dst src1 src2 -> usage (regOp src1 ++ regOp src2, regOp dst)
-- 2. Bit Manipulation Instructions ------------------------------------------
SBFM dst src _ _ -> usage (regOp src, regOp dst)
UBFM dst src _ _ -> usage (regOp src, regOp dst)
- SBFX dst src _ _ -> usage (regOp src, regOp dst)
UBFX dst src _ _ -> usage (regOp src, regOp dst)
- SXTB dst src -> usage (regOp src, regOp dst)
- UXTB dst src -> usage (regOp src, regOp dst)
- SXTH dst src -> usage (regOp src, regOp dst)
- UXTH dst src -> usage (regOp src, regOp dst)
-- 3. Logical and Move Instructions ------------------------------------------
AND dst src1 src2 -> usage (regOp src1 ++ regOp src2, regOp dst)
+ OR dst src1 src2 -> usage (regOp src1 ++ regOp src2, regOp dst)
ASR dst src1 src2 -> usage (regOp src1 ++ regOp src2, regOp dst)
BIC dst src1 src2 -> usage (regOp src1 ++ regOp src2, regOp dst)
BICS dst src1 src2 -> usage (regOp src1 ++ regOp src2, regOp dst)
- EON dst src1 src2 -> usage (regOp src1 ++ regOp src2, regOp dst)
- EOR dst src1 src2 -> usage (regOp src1 ++ regOp src2, regOp dst)
+ XOR dst src1 src2 -> usage (regOp src1 ++ regOp src2, regOp dst)
LSL dst src1 src2 -> usage (regOp src1 ++ regOp src2, regOp dst)
LSR dst src1 src2 -> usage (regOp src1 ++ regOp src2, regOp dst)
MOV dst src -> usage (regOp src, regOp dst)
MOVK dst src -> usage (regOp src, regOp dst)
- MVN dst src -> usage (regOp src, regOp dst)
- ORR dst src1 src2 -> usage (regOp src1 ++ regOp src2, regOp dst)
+ -- ORI's third operand is always an immediate
+ ORI dst src1 _ -> usage (regOp src1, regOp dst)
+ XORI dst src1 _ -> usage (regOp src1, regOp dst)
ROR dst src1 src2 -> usage (regOp src1 ++ regOp src2, regOp dst)
TST src1 src2 -> usage (regOp src1 ++ regOp src2, [])
-- 4. Branch Instructions ----------------------------------------------------
@@ -220,39 +216,35 @@ patchRegsOfInstr instr env = case instr of
ADD o1 o2 o3 -> ADD (patchOp o1) (patchOp o2) (patchOp o3)
-- CMN o1 o2 -> CMN (patchOp o1) (patchOp o2)
-- CMP o1 o2 -> CMP (patchOp o1) (patchOp o2)
- MSUB o1 o2 o3 o4 -> MSUB (patchOp o1) (patchOp o2) (patchOp o3) (patchOp o4)
MUL o1 o2 o3 -> MUL (patchOp o1) (patchOp o2) (patchOp o3)
NEG o1 o2 -> NEG (patchOp o1) (patchOp o2)
SMULH o1 o2 o3 -> SMULH (patchOp o1) (patchOp o2) (patchOp o3)
SMULL o1 o2 o3 -> SMULL (patchOp o1) (patchOp o2) (patchOp o3)
- SDIV o1 o2 o3 -> SDIV (patchOp o1) (patchOp o2) (patchOp o3)
+ DIV o1 o2 o3 -> DIV (patchOp o1) (patchOp o2) (patchOp o3)
+ REM o1 o2 o3 -> REM (patchOp o1) (patchOp o2) (patchOp o3)
SUB o1 o2 o3 -> SUB (patchOp o1) (patchOp o2) (patchOp o3)
- UDIV o1 o2 o3 -> UDIV (patchOp o1) (patchOp o2) (patchOp o3)
+ DIVU o1 o2 o3 -> DIVU (patchOp o1) (patchOp o2) (patchOp o3)
-- 2. Bit Manipulation Instructions ----------------------------------------
SBFM o1 o2 o3 o4 -> SBFM (patchOp o1) (patchOp o2) (patchOp o3) (patchOp o4)
UBFM o1 o2 o3 o4 -> UBFM (patchOp o1) (patchOp o2) (patchOp o3) (patchOp o4)
- SBFX o1 o2 o3 o4 -> SBFX (patchOp o1) (patchOp o2) (patchOp o3) (patchOp o4)
UBFX o1 o2 o3 o4 -> UBFX (patchOp o1) (patchOp o2) (patchOp o3) (patchOp o4)
- SXTB o1 o2 -> SXTB (patchOp o1) (patchOp o2)
- UXTB o1 o2 -> UXTB (patchOp o1) (patchOp o2)
- SXTH o1 o2 -> SXTH (patchOp o1) (patchOp o2)
- UXTH o1 o2 -> UXTH (patchOp o1) (patchOp o2)
-- 3. Logical and Move Instructions ----------------------------------------
AND o1 o2 o3 -> AND (patchOp o1) (patchOp o2) (patchOp o3)
+ OR o1 o2 o3 -> OR (patchOp o1) (patchOp o2) (patchOp o3)
-- ANDS o1 o2 o3 -> ANDS (patchOp o1) (patchOp o2) (patchOp o3)
ASR o1 o2 o3 -> ASR (patchOp o1) (patchOp o2) (patchOp o3)
BIC o1 o2 o3 -> BIC (patchOp o1) (patchOp o2) (patchOp o3)
BICS o1 o2 o3 -> BICS (patchOp o1) (patchOp o2) (patchOp o3)
- EON o1 o2 o3 -> EON (patchOp o1) (patchOp o2) (patchOp o3)
- EOR o1 o2 o3 -> EOR (patchOp o1) (patchOp o2) (patchOp o3)
+ XOR o1 o2 o3 -> XOR (patchOp o1) (patchOp o2) (patchOp o3)
LSL o1 o2 o3 -> LSL (patchOp o1) (patchOp o2) (patchOp o3)
LSR o1 o2 o3 -> LSR (patchOp o1) (patchOp o2) (patchOp o3)
MOV o1 o2 -> MOV (patchOp o1) (patchOp o2)
MOVK o1 o2 -> MOVK (patchOp o1) (patchOp o2)
- MVN o1 o2 -> MVN (patchOp o1) (patchOp o2)
- ORR o1 o2 o3 -> ORR (patchOp o1) (patchOp o2) (patchOp o3)
+ -- o3 cannot be a register for ORI (always an immediate)
+ ORI o1 o2 o3 -> ORI (patchOp o1) (patchOp o2) (patchOp o3)
+ XORI o1 o2 o3 -> XORI (patchOp o1) (patchOp o2) (patchOp o3)
ROR o1 o2 o3 -> ROR (patchOp o1) (patchOp o2) (patchOp o3)
TST o1 o2 -> TST (patchOp o1) (patchOp o2)
@@ -560,10 +552,6 @@ data Instr
| DELTA Int
-- 0. Pseudo Instructions --------------------------------------------------
- | SXTB Operand Operand
- | UXTB Operand Operand
- | SXTH Operand Operand
- | UXTH Operand Operand
-- | SXTW Operand Operand
-- | SXTX Operand Operand
| PUSH_STACK_FRAME
@@ -588,7 +576,7 @@ data Instr
-- 2. Memory Load/Store Instructions ---------------------------------------
-- Unlike arm, we don't have register shorthands for size.
- -- We do hover have {L,S}{B,H,W,D}[U] instructions for Load/Store, Byte, Half, Word, Double, (Unsigned).
+ -- We do however have {L,S}{B,H,W,D}[U] instructions for Load/Store, Byte, Half, Word, Double, (Unsigned).
-- Reusing the arm logic with the _format_ specifier will hopefully work.
| STR Format Operand Operand -- str Xn, address-mode // Xn -> *addr
| LDR Format Operand Operand -- ldr Xn, address-mode // Xn <- *addr
@@ -603,7 +591,6 @@ data Instr
-- | CMN Operand Operand -- rd + op2
-- | CMP Operand Operand -- rd - op2
- | MSUB Operand Operand Operand Operand -- rd = ra - rn × rm
| MUL Operand Operand Operand -- rd = rn × rm
@@ -612,23 +599,19 @@ data Instr
-- NOT = XOR -1, x
| NEG Operand Operand -- rd = -op2
- | SDIV Operand Operand Operand -- rd = rn ÷ rm
+ | DIV Operand Operand Operand -- rd = rn ÷ rm
+ | REM Operand Operand Operand -- rd = rn % rm
+ -- TODO: Rename: MULH
| SMULH Operand Operand Operand
| SMULL Operand Operand Operand
- | UDIV Operand Operand Operand -- rd = rn ÷ rm
+ | DIVU Operand Operand Operand -- rd = rn ÷ rm
-- 2. Bit Manipulation Instructions ----------------------------------------
| SBFM Operand Operand Operand Operand -- rd = rn[i,j]
- -- SXTB = SBFM <Wd>, <Wn>, #0, #7
- -- SXTH = SBFM <Wd>, <Wn>, #0, #15
- -- SXTW = SBFM <Wd>, <Wn>, #0, #31
| UBFM Operand Operand Operand Operand -- rd = rn[i,j]
- -- UXTB = UBFM <Wd>, <Wn>, #0, #7
- -- UXTH = UBFM <Wd>, <Wn>, #0, #15
-- Signed/Unsigned bitfield extract
- | SBFX Operand Operand Operand Operand -- rd = rn[i,j]
| UBFX Operand Operand Operand Operand -- rd = rn[i,j]
-- 3. Logical and Move Instructions ----------------------------------------
@@ -637,17 +620,16 @@ data Instr
-- | ASR Operand Operand Operand -- rd = rn ≫ rm or rd = rn ≫ #i, i is 6 bits
| BIC Operand Operand Operand -- rd = rn & ~op2
| BICS Operand Operand Operand -- rd = rn & ~op2
- | EON Operand Operand Operand -- rd = rn ⊕ ~op2
- | EOR Operand Operand Operand -- rd = rn ⊕ op2
+ | XOR Operand Operand Operand -- rd = rn ⊕ op2
-- | LSL Operand Operand Operand -- rd = rn ≪ rm or rd = rn ≪ #i, i is 6 bits
-- | LSR Operand Operand Operand -- rd = rn ≫ rm or rd = rn ≫ #i, i is 6 bits
| MOV Operand Operand -- rd = rn or rd = #i
| MOVK Operand Operand
-- | MOVN Operand Operand
-- | MOVZ Operand Operand
- | MVN Operand Operand -- rd = ~rn
| ORN Operand Operand Operand -- rd = rn | ~op2
- | ORR Operand Operand Operand -- rd = rn | op2
+ | ORI Operand Operand Operand -- rd = rn | op2
+ | XORI Operand Operand Operand -- rd = rn `xor` imm
| ROR Operand Operand Operand -- rd = rn ≫ rm or rd = rn ≫ #i, i is 6 bits
| TST Operand Operand -- rn & op2
-- Load and stores.
@@ -693,41 +675,36 @@ instrCon i =
LDATA{} -> "LDATA"
NEWBLOCK{} -> "NEWBLOCK"
DELTA{} -> "DELTA"
- SXTB{} -> "SXTB"
- UXTB{} -> "UXTB"
- SXTH{} -> "SXTH"
- UXTH{} -> "UXTH"
PUSH_STACK_FRAME{} -> "PUSH_STACK_FRAME"
POP_STACK_FRAME{} -> "POP_STACK_FRAME"
ADD{} -> "ADD"
+ OR{} -> "OR"
-- CMN{} -> "CMN"
-- CMP{} -> "CMP"
- MSUB{} -> "MSUB"
MUL{} -> "MUL"
NEG{} -> "NEG"
- SDIV{} -> "SDIV"
+ DIV{} -> "DIV"
+ REM{} -> "REM"
SMULH{} -> "SMULH"
SMULL{} -> "SMULL"
SUB{} -> "SUB"
- UDIV{} -> "UDIV"
+ DIVU{} -> "DIVU"
SBFM{} -> "SBFM"
UBFM{} -> "UBFM"
- SBFX{} -> "SBFX"
UBFX{} -> "UBFX"
AND{} -> "AND"
-- ANDS{} -> "ANDS"
ASR{} -> "ASR"
BIC{} -> "BIC"
BICS{} -> "BICS"
- EON{} -> "EON"
- EOR{} -> "EOR"
+ XOR{} -> "XOR"
LSL{} -> "LSL"
LSR{} -> "LSR"
MOV{} -> "MOV"
MOVK{} -> "MOVK"
- MVN{} -> "MVN"
ORN{} -> "ORN"
- ORR{} -> "ORR"
+ ORI{} -> "ORI"
+ XORI{} -> "ORI"
ROR{} -> "ROR"
TST{} -> "TST"
STR{} -> "STR"
@@ -778,6 +755,7 @@ data Operand
| OpRegExt Width Reg ExtMode ExtShift -- rm, <ext>[, <shift left>]
| OpRegShift Width Reg ShiftMode RegShift -- rm, <shift>, <0-64>
| OpImm Imm -- immediate value
+ -- TODO: Does OpImmShift exist in RV64?
| OpImmShift Imm ShiftMode RegShift
| OpAddr AddrMode -- memory reference
deriving (Eq, Show)
@@ -787,14 +765,14 @@ opReg :: Width -> Reg -> Operand
opReg = OpReg
ra_reg, sp_reg :: Reg
+zero_reg = RegReal (RealRegSingle 0)
ra_reg = RegReal (RealRegSingle 1)
sp_reg = RegReal (RealRegSingle 2)
-xzr, wzr, sp, ip0 :: Operand
-xzr = OpReg W64 (RegReal (RealRegSingle 0))
-wzr = OpReg W32 (RegReal (RealRegSingle 0))
-ra = OpReg W64 (RegReal (RealRegSingle 1))
-sp = OpReg W64 (RegReal (RealRegSingle 2))
+zero, sp, ip0 :: Operand
+zero = OpReg W64 zero_reg
+ra = OpReg W64 ra_reg
+sp = OpReg W64 sp_reg
gp = OpReg W64 (RegReal (RealRegSingle 3))
tp = OpReg W64 (RegReal (RealRegSingle 4))
fp = OpReg W64 (RegReal (RealRegSingle 8))
@@ -879,16 +857,19 @@ d29 = OpReg W64 (RegReal (RealRegSingle 61))
d30 = OpReg W64 (RegReal (RealRegSingle 62))
d31 = OpReg W64 (RegReal (RealRegSingle 63))
-opRegUExt :: Width -> Reg -> Operand
-opRegUExt W64 r = OpRegExt W64 r EUXTX 0
-opRegUExt W32 r = OpRegExt W32 r EUXTW 0
-opRegUExt W16 r = OpRegExt W16 r EUXTH 0
-opRegUExt W8 r = OpRegExt W8 r EUXTB 0
-opRegUExt w _r = pprPanic "opRegUExt" (ppr w)
-
opRegSExt :: Width -> Reg -> Operand
opRegSExt W64 r = OpRegExt W64 r ESXTX 0
opRegSExt W32 r = OpRegExt W32 r ESXTW 0
opRegSExt W16 r = OpRegExt W16 r ESXTH 0
opRegSExt W8 r = OpRegExt W8 r ESXTB 0
opRegSExt w _r = pprPanic "opRegSExt" (ppr w)
+
+fitsIn12bitImm :: (Num a, Ord a) => a -> Bool
+fitsIn12bitImm off = off >= intMin12bit && off <= intMax12bit
+ where
+ intMin12bit = -2048
+ intMax12bit = 2047
+
+fitsIn32bits :: (Num a, Ord a, Bits a) => a -> Bool
+fitsIn32bits i = (-1 `shiftL` 31) <= i && i <= (1 `shiftL` 31 -1)
+
=====================================
compiler/GHC/CmmToAsm/RV64/Ppr.hs
=====================================
@@ -238,12 +238,8 @@ pprImm p (ImmIndex l i) = pprAsmLabel p l <> char '+' <> int i
pprImm _ (ImmLit s) = ftext s
-- TODO: See pprIm below for why this is a bad idea!
-pprImm _ (ImmFloat f)
- | f == 0 = text "wzr"
- | otherwise = float (fromRational f)
-pprImm _ (ImmDouble d)
- | d == 0 = text "xzr"
- | otherwise = double (fromRational d)
+pprImm _ (ImmFloat f) = float (fromRational f)
+pprImm _ (ImmDouble d) = double (fromRational d)
pprImm p (ImmConstantSum a b) = pprImm p a <> char '+' <> pprImm p b
pprImm p (ImmConstantDiff a b) = pprImm p a <> char '-'
@@ -278,9 +274,9 @@ pprIm platform im = case im of
--
-- We could also just turn them into statics :-/ Which is what the
-- PowerPC backend does.
- ImmFloat f | f == 0 -> text "wzr"
+ ImmFloat f | f == 0 -> text "zero"
ImmFloat f -> char '#' <> float (fromRational f)
- ImmDouble d | d == 0 -> text "xzr"
+ ImmDouble d | d == 0 -> text "zero"
ImmDouble d -> char '#' <> double (fromRational d)
-- =<lbl> pseudo instruction!
ImmCLbl l -> char '=' <> pprAsmLabel platform l
@@ -332,19 +328,79 @@ pprReg w r = case r of
where
ppr_reg_no :: Width -> Int -> doc
+ -- General Purpose Registers
ppr_reg_no _ 0 = text "zero"
+ ppr_reg_no _ 1 = text "ra"
ppr_reg_no _ 2 = text "sp"
+ ppr_reg_no _ 3 = text "gp"
+ ppr_reg_no _ 4 = text "tp"
+ ppr_reg_no _ 5 = text "t0"
+ ppr_reg_no _ 6 = text "t1"
+ ppr_reg_no _ 7 = text "t2"
+ ppr_reg_no _ 8 = text "s0"
+ ppr_reg_no _ 9 = text "s1"
+ ppr_reg_no _ 10 = text "a0"
+ ppr_reg_no _ 11 = text "a1"
+ ppr_reg_no _ 12 = text "a2"
+ ppr_reg_no _ 13 = text "a3"
+ ppr_reg_no _ 14 = text "a4"
+ ppr_reg_no _ 15 = text "a5"
+ ppr_reg_no _ 16 = text "a6"
+ ppr_reg_no _ 17 = text "a7"
+ ppr_reg_no _ 18 = text "s2"
+ ppr_reg_no _ 19 = text "s3"
+ ppr_reg_no _ 20 = text "s4"
+ ppr_reg_no _ 21 = text "s5"
+ ppr_reg_no _ 22 = text "s6"
+ ppr_reg_no _ 23 = text "s7"
+ ppr_reg_no _ 24 = text "s8"
+ ppr_reg_no _ 25 = text "s9"
+ ppr_reg_no _ 26 = text "s10"
+ ppr_reg_no _ 27 = text "s11"
+ ppr_reg_no _ 28 = text "t3"
+ ppr_reg_no _ 29 = text "t4"
+ ppr_reg_no _ 30 = text "t5"
+ ppr_reg_no _ 31 = text "t6"
+
+ -- Floating Point Registers
+ ppr_reg_no _ 32 = text "ft0"
+ ppr_reg_no _ 33 = text "ft1"
+ ppr_reg_no _ 34 = text "ft2"
+ ppr_reg_no _ 35 = text "ft3"
+ ppr_reg_no _ 36 = text "ft4"
+ ppr_reg_no _ 37 = text "ft5"
+ ppr_reg_no _ 38 = text "ft6"
+ ppr_reg_no _ 39 = text "ft7"
+ ppr_reg_no _ 40 = text "fs0"
+ ppr_reg_no _ 41 = text "fs1"
+ ppr_reg_no _ 42 = text "fa0"
+ ppr_reg_no _ 43 = text "fa1"
+ ppr_reg_no _ 44 = text "fa2"
+ ppr_reg_no _ 45 = text "fa3"
+ ppr_reg_no _ 46 = text "fa4"
+ ppr_reg_no _ 47 = text "fa5"
+ ppr_reg_no _ 48 = text "fa6"
+ ppr_reg_no _ 49 = text "fa7"
+ ppr_reg_no _ 50 = text "fs2"
+ ppr_reg_no _ 51 = text "fs3"
+ ppr_reg_no _ 52 = text "fs4"
+ ppr_reg_no _ 53 = text "fs5"
+ ppr_reg_no _ 54 = text "fs6"
+ ppr_reg_no _ 55 = text "fs7"
+ ppr_reg_no _ 56 = text "fs8"
+ ppr_reg_no _ 57 = text "fs9"
+ ppr_reg_no _ 58 = text "fs10"
+ ppr_reg_no _ 59 = text "fs11"
+ ppr_reg_no _ 60 = text "ft8"
+ ppr_reg_no _ 61 = text "ft9"
+ ppr_reg_no _ 62 = text "ft10"
+ ppr_reg_no _ 63 = text "ft11"
ppr_reg_no w i
- | i < 0, w == W32 = text "wzr"
- | i < 0, w == W64 = text "xzr"
- | i < 0 = pprPanic "Invalid Zero Reg" (ppr w <+> int i)
- -- General Purpose Registers
- | i <= 31 = text "x" <> int i
- -- Floating Point Registers
- | i <= 63 = text "f" <> int (i-32)
- -- no support for 'q'uad in GHC's NCG yet.
- | otherwise = text "very naughty powerpc register"
+ | i < 0 = pprPanic "Unexpected register number (min is 0)" (ppr w <+> int i)
+ | i > 63 = pprPanic "Unexpected register number (max is 63)" (ppr w <+> int i)
+ -- no support for widths > W64.
+ | otherwise = pprPanic "Unsupported width in register (max is 64)" (ppr w <+> int i)
isFloatOp :: Operand -> Bool
isFloatOp (OpReg _ (RegReal (RealRegSingle i))) | i > 31 = True
@@ -352,10 +408,25 @@ isFloatOp (OpReg _ (RegVirtual (VirtualRegF _))) = True
isFloatOp (OpReg _ (RegVirtual (VirtualRegD _))) = True
isFloatOp _ = False
+isSingleOp :: Operand -> Bool
+isSingleOp (OpReg W32 _) = True
+isSingleOp _ = False
+
+isDoubleOp :: Operand -> Bool
+isDoubleOp (OpReg W64 _) = True
+isDoubleOp _ = False
+
isImmOp :: Operand -> Bool
isImmOp (OpImm _) = True
isImmOp _ = False
+isImmZero :: Operand -> Bool
+isImmZero (OpImm (ImmFloat 0)) = True
+isImmZero (OpImm (ImmDouble 0)) = True
+isImmZero (OpImm (ImmInt 0)) = True
+isImmZero _ = False
+
+
isLabel :: Target -> Bool
isLabel (TBlock _) = True
isLabel (TLabel _) = True
@@ -395,68 +466,79 @@ pprInstr platform instr = case instr of
-- AArch64 Instruction Set
-- 1. Arithmetic Instructions ------------------------------------------------
ADD o1 o2 o3
- | isFloatOp o1 && isFloatOp o2 && isFloatOp o3 -> op3 (text "\tfadd") o1 o2 o3
+ | isFloatOp o1 && isFloatOp o2 && isFloatOp o3 -> op3 (text "\tfadd." <> if isSingleOp o1 then text "s" else text "d") o1 o2 o3
+ -- This case is used for sign extension: SEXT.W op
+ | OpReg W64 _ <- o1 , OpReg W32 _ <- o2, isImmOp o3 -> op3 (text "\taddiw") o1 o2 o3
| otherwise -> op3 (text "\tadd") o1 o2 o3
-- CMN o1 o2 -> op2 (text "\tcmn") o1 o2
-- CMP o1 o2
-- | isFloatOp o1 && isFloatOp o2 -> op2 (text "\tfcmp") o1 o2
-- | otherwise -> op2 (text "\tcmp") o1 o2
- MSUB o1 o2 o3 o4 -> op4 (text "\tmsub") o1 o2 o3 o4
MUL o1 o2 o3
- | isFloatOp o1 && isFloatOp o2 && isFloatOp o3 -> op3 (text "\tfmul") o1 o2 o3
+ | isFloatOp o1 && isFloatOp o2 && isFloatOp o3 -> op3 (text "\tfmul." <> if isSingleOp o1 then text "s" else text "d") o1 o2 o3
| otherwise -> op3 (text "\tmul") o1 o2 o3
- SMULH o1 o2 o3 -> op3 (text "\tsmulh") o1 o2 o3
+ SMULH o1 o2 o3 -> op3 (text "\tmulh") o1 o2 o3
SMULL o1 o2 o3 -> op3 (text "\tsmull") o1 o2 o3
NEG o1 o2
| isFloatOp o1 && isFloatOp o2 -> op2 (text "\tfneg") o1 o2
| otherwise -> op2 (text "\tneg") o1 o2
- SDIV o1 o2 o3 | isFloatOp o1 && isFloatOp o2 && isFloatOp o3
- -> op3 (text "\tfdiv") o1 o2 o3
- SDIV o1 o2 o3 -> op3 (text "\tsdiv") o1 o2 o3
+ DIV o1 o2 o3 | isFloatOp o1 && isFloatOp o2 && isFloatOp o3
+ -- TODO: This must (likely) be refined regarding width
+ -> op3 (text "\tfdiv." <> if isSingleOp o1 then text "s" else text "d") o1 o2 o3
+ DIV o1 o2 o3 -> op3 (text "\tdiv") o1 o2 o3
+ REM o1 o2 o3 | isFloatOp o1 && isFloatOp o2 && isFloatOp o3
+ -> panic $ "pprInstr - REM not implemented for floats (yet)"
+ REM o1 o2 o3 -> op3 (text "\trem") o1 o2 o3
SUB o1 o2 o3
- | isFloatOp o1 && isFloatOp o2 && isFloatOp o3 -> op3 (text "\tfsub") o1 o2 o3
+ | isFloatOp o1 && isFloatOp o2 && isFloatOp o3 -> op3 (text "\tfsub." <> if isSingleOp o1 then text "s" else text "d") o1 o2 o3
| isImmOp o3 -> op3 (text "\taddi") o1 o2 (negOp o3)
| otherwise -> op3 (text "\tsub") o1 o2 o3
- UDIV o1 o2 o3 -> op3 (text "\tudiv") o1 o2 o3
+ DIVU o1 o2 o3 -> op3 (text "\tdivu") o1 o2 o3
-- 2. Bit Manipulation Instructions ------------------------------------------
SBFM o1 o2 o3 o4 -> op4 (text "\tsbfm") o1 o2 o3 o4
UBFM o1 o2 o3 o4 -> op4 (text "\tubfm") o1 o2 o3 o4
-- signed and unsigned bitfield extract
- SBFX o1 o2 o3 o4 -> op4 (text "\tsbfx") o1 o2 o3 o4
UBFX o1 o2 o3 o4 -> op4 (text "\tubfx") o1 o2 o3 o4
- SXTB o1 o2 -> op2 (text "\tsxtb") o1 o2
- UXTB o1 o2 -> op2 (text "\tuxtb") o1 o2
- SXTH o1 o2 -> op2 (text "\tsxth") o1 o2
- UXTH o1 o2 -> op2 (text "\tuxth") o1 o2
-- 3. Logical and Move Instructions ------------------------------------------
AND o1 o2 o3 -> op3 (text "\tand") o1 o2 o3
+ OR o1 o2 o3 -> op3 (text "\tor") o1 o2 o3
-- ANDS o1 o2 o3 -> op3 (text "\tands") o1 o2 o3
+ ASR o1 o2 o3 | isImmOp o3 -> op3 (text "\tsrai") o1 o2 o3
ASR o1 o2 o3 -> op3 (text "\tsra") o1 o2 o3
BIC o1 o2 o3 -> op3 (text "\tbic") o1 o2 o3
BICS o1 o2 o3 -> op3 (text "\tbics") o1 o2 o3
- EON o1 o2 o3 -> op3 (text "\teon") o1 o2 o3
- EOR o1 o2 o3 -> op3 (text "\teor") o1 o2 o3
+ XOR o1 o2 o3 -> op3 (text "\txor") o1 o2 o3
LSL o1 o2 o3 -> op3 (text "\tsll") o1 o2 o3
LSR o1 o2 o3 -> op3 (text "\tsrl") o1 o2 o3
MOV o1 o2
- | isFloatOp o1 || isFloatOp o2 -> op2 (text "\tfmov") o1 o2
+ | isFloatOp o1 && isFloatOp o2 && isDoubleOp o2 -> op2 (text "\tfmv.d") o1 o2 -- fmv.d rd, rs is pseudo op fsgnj.d rd, rs, rs
+ | isFloatOp o1 && isFloatOp o2 && isSingleOp o2 -> op2 (text "\tfmv.s") o1 o2 -- fmv.s rd, rs is pseudo op fsgnj.s rd, rs, rs
+ | isFloatOp o1 && isImmZero o2 && isDoubleOp o1 -> op2 (text "\tfcvt.d.w") o1 zero
+ | isFloatOp o1 && isImmZero o2 && isSingleOp o1 -> op2 (text "\tfcvt.s.w") o1 zero
+ | isFloatOp o1 && not (isFloatOp o2) && isSingleOp o1 -> op2 (text "\tfmv.w.x") o1 o2
+ | isFloatOp o1 && not (isFloatOp o2) && isDoubleOp o1 -> op2 (text "\tfmv.d.x") o1 o2
+ | not (isFloatOp o1) && isFloatOp o2 && isSingleOp o2 -> op2 (text "\tfmv.x.w") o1 o2
+ | not (isFloatOp o1) && isFloatOp o2 && isDoubleOp o2 -> op2 (text "\tfmv.x.d") o1 o2
| isImmOp o2
, (OpImm (ImmInteger i)) <- o2
- , (-1 `shiftL` 11) <= i
- , i <= (1 `shiftL` 11 - 1) -> lines_ [ text "\taddi" <+> pprOp platform o1 <> comma <+> pprOp platform x0 <> comma <+> pprOp platform o2 ]
+ , fitsIn12bitImm i
+ -> lines_ [ text "\taddi" <+> pprOp platform o1 <> comma <+> pprOp platform x0 <> comma <+> pprOp platform o2 ]
| isImmOp o2
, (OpImm (ImmInteger i)) <- o2
- , (-1 `shiftL` 31) <= i
- , i <= (1 `shiftL` 31 -1) -> lines_ [ text "\tlui" <+> pprOp platform o1 <> comma <+> text "%hi(" <> pprOp platform o2 <> text ")"
+ , fitsIn32bits i
+ -> lines_ [ text "\tlui" <+> pprOp platform o1 <> comma <+> text "%hi(" <> pprOp platform o2 <> text ")"
, text "\taddi" <+> pprOp platform o1 <> comma <+> pprOp platform o1 <> comma <+> text "%lo(" <> pprOp platform o2 <> text ")" ]
+ | isImmOp o2
+ -- Surrender! Let the assembler figure out the right expressions with pseudo-op LI.
+ -> lines_ [ text "\tli" <+> pprOp platform o1 <> comma <+> pprOp platform o2 ]
| otherwise -> op3 (text "\taddi") o1 o2 (OpImm (ImmInt 0))
MOVK o1 o2 -> op2 (text "\tmovk") o1 o2
- MVN o1 o2 -> op2 (text "\tmvn") o1 o2
ORN o1 o2 o3 -> op3 (text "\torn") o1 o2 o3
- ORR o1 o2 o3 -> op3 (text "\torr") o1 o2 o3
+ ORI o1 o2 o3 -> op3 (text "\tori") o1 o2 o3
+ XORI o1 o2 o3 -> op3 (text "\txori") o1 o2 o3
ROR o1 o2 o3 -> op3 (text "\tror") o1 o2 o3
TST o1 o2 -> op2 (text "\ttst") o1 o2
@@ -529,6 +611,10 @@ pprInstr platform instr = case instr of
STR II16 o1 o2 -> op2 (text "\tsh") o1 o2
STR II32 o1 o2 -> op2 (text "\tsw") o1 o2
STR II64 o1 o2 -> op2 (text "\tsd") o1 o2
+ STR FF32 o1 o2 -> op2 (text "\tfsw") o1 o2
+ STR FF64 o1 o2 -> op2 (text "\tfsd") o1 o2
+ STR f o1 o2 -> pprPanic "RV64.pprInstr - STR not implemented for ... "
+ (text "STR" <+> (text.show) f <+> pprOp platform o1 <+> pprOp platform o2)
LDR _f o1 (OpImm (ImmIndex lbl off)) ->
lines_ [ text "\tla" <+> pprOp platform o1 <> comma <+> pprAsmLabel platform lbl
@@ -546,15 +632,17 @@ pprInstr platform instr = case instr of
-- op_add o1 (text "%pcrel_lo(" <> pprAsmLabel platform lbl <> text ")")
line $ text "\tla" <+> pprOp platform o1 <> comma <+> pprAsmLabel platform lbl
- LDR _f o1@(OpReg W8 (RegReal (RealRegSingle i))) o2 | i < 32 ->
- op2 (text "\tldrb") o1 o2
- LDR _f o1@(OpReg W16 (RegReal (RealRegSingle i))) o2 | i < 32 ->
- op2 (text "\tldrh") o1 o2
+ LDR _f o1@(OpReg W8 reg) o2 | isIntRealReg reg ->
+ op2 (text "\tlb") o1 o2
+ LDR _f o1@(OpReg W16 reg) o2 | isIntRealReg reg ->
+ op2 (text "\tlh") o1 o2
LDR II8 o1 o2 -> op2 (text "\tlb") o1 o2
LDR II16 o1 o2 -> op2 (text "\tlh") o1 o2
LDR II32 o1 o2 -> op2 (text "\tlw") o1 o2
LDR II64 o1 o2 -> op2 (text "\tld") o1 o2
+ LDR FF32 o1 o2 -> op2 (text "\tflw") o1 o2
+ LDR FF64 o1 o2 -> op2 (text "\tfld") o1 o2
-- LDAR _f o1 o2 -> op2 (text "\tldar") o1 o2
-- STP _f o1 o2 o3 -> op3 (text "\tstp") o1 o2 o3
@@ -567,6 +655,7 @@ pprInstr platform instr = case instr of
SCVTF o1 o2 -> op2 (text "\tscvtf") o1 o2
FCVTZS o1 o2 -> op2 (text "\tfcvtzs") o1 o2
FABS o1 o2 -> op2 (text "\tfabs") o1 o2
+ instr -> panic $ "RV64.pprInstr - Unknown instruction: " ++ (instrCon instr)
where op2 op o1 o2 = line $ op <+> pprOp platform o1 <> comma <+> pprOp platform o2
op3 op o1 o2 o3 = line $ op <+> pprOp platform o1 <> comma <+> pprOp platform o2 <> comma <+> pprOp platform o3
op4 op o1 o2 o3 o4 = line $ op <+> pprOp platform o1 <> comma <+> pprOp platform o2 <> comma <+> pprOp platform o3 <> comma <+> pprOp platform o4
=====================================
compiler/GHC/CmmToAsm/RV64/Regs.hs
=====================================
@@ -145,6 +145,10 @@ classOfRealReg (RealRegSingle i)
| i < 32 = RcInteger
| otherwise = RcDouble
+isIntRealReg :: Reg -> Bool
+isIntRealReg (RegReal r) = classOfRealReg r == RcInteger
+isIntRealReg _ = False
+
regDotColor :: RealReg -> SDoc
regDotColor reg
= case classOfRealReg reg of
=====================================
hadrian/src/Settings/Builders/RunTest.hs
=====================================
@@ -211,6 +211,7 @@ runTestBuilderArgs = builder Testsuite ? do
(testEnv, testMetricsFile) <- expr . liftIO $
(,) <$> lookupEnv "TEST_ENV" <*> lookupEnv "METRICS_FILE"
perfBaseline <- expr . liftIO $ lookupEnv "PERF_BASELINE_COMMIT"
+ targetWrapper <- expr . liftIO $ lookupEnv "CROSS_EMULATOR"
threads <- shakeThreads <$> expr getShakeOptions
top <- expr $ topDirectory
@@ -280,6 +281,7 @@ runTestBuilderArgs = builder Testsuite ? do
, case perfBaseline of
Just commit | not (null commit) -> arg ("--perf-baseline=" ++ commit)
_ -> mempty
+ , emitWhenSet targetWrapper $ \cmd -> arg ("--target-wrapper=" ++ cmd)
, emitWhenSet testEnv $ \env -> arg ("--test-env=" ++ env)
, emitWhenSet testMetricsFile $ \file -> arg ("--metrics-file=" ++ file)
, getTestArgs -- User-provided arguments from command line.
=====================================
tests/compiler/cmm/shift_right.cmm
=====================================
@@ -0,0 +1,24 @@
+// RUN: "$HC" -debug -dppr-debug -cpp -dcmm-lint -keep-s-file -O0 -c "$1" && cat "${1%%.*}.s" | FileCheck "$1" -check-prefix=CHECK-RV64
+// RUN: "$CC" "${1%%.*}.o" -o "${1%%.*}.exe"
+// RUN: "$EXEC" "${1%%.cmm}.exe"
+
+#include "Cmm.h"
+#include "Types.h"
+
+main() {
+ I64 buffer;
+ I32 a, b, c, d;
+
+ I64 arr;
+ (arr) = foreign "C" malloc(1024);
+ bits64[arr] = 2;
+
+ a = I32[arr];
+ b = %mul(a, 32 :: I32);
+ c = %neg(b);
+ d = %shra(c, 4::I64);
+
+ foreign "C" printf("a: %hd b: %hd c: %hd d: %hd", a, b, c, d);
+
+ foreign "C" exit(d == -4 :: I32);
+}
=====================================
tests/compiler/cmm/zero.cmm
=====================================
@@ -0,0 +1,14 @@
+// RUN: "$HC" -cpp -dcmm-lint -keep-s-file -c "$1" && cat "${1%%.*}.s" | FileCheck "$1" -check-prefix=CHECK-RV64
+// RUN: "$CC" "${1%%.*}.o" -o "${1%%.*}.exe"
+// RUN: "$EXEC" "${1%%.cmm}.exe"
+
+#include "Cmm.h"
+#include "Types.h"
+
+main(){
+ I64 zero;
+ // Should refer to the zero register
+ // CHECK-RV64: addi t0, zero, 0
+ zero = 0;
+ foreign "C" exit(zero);
+}
=====================================
testsuite/driver/runtests.py
=====================================
@@ -71,6 +71,7 @@ parser.add_argument("--config", action='append', help="config field")
parser.add_argument("--rootdir", action='append', help="root of tree containing tests (default: .)")
parser.add_argument("--metrics-file", help="file in which to save (append) the performance test metrics. If omitted, git notes will be used.")
parser.add_argument("--summary-file", help="file in which to save the (human-readable) summary")
+parser.add_argument("--target-wrapper", help="wrapper executable to use when executing binaries compiled for the target")
parser.add_argument("--no-print-summary", action="store_true", help="should we print the summary?")
parser.add_argument("--only", action="append", help="just this test (can be give multiple --only= flags)")
parser.add_argument("--way", action="append", help="just this way")
@@ -118,6 +119,7 @@ hasMetricsFile = config.metrics_file is not None
config.summary_file = args.summary_file
config.no_print_summary = args.no_print_summary
config.baseline_commit = args.perf_baseline
+config.target_wrapper = args.target_wrapper
if args.top:
config.top = args.top
=====================================
testsuite/driver/testglobals.py
=====================================
@@ -175,6 +175,11 @@ class TestConfig:
# threads
self.threads = 1
+ # An optional executable used to wrap target code execution
+ # When set tests which aren't marked with TestConfig.cross_okay
+ # are skipped.
+ self.target_wrapper = None
+
# tests which should be considered to be broken during this testsuite
# run.
self.broken_tests = set() # type: Set[TestName]
@@ -445,6 +450,12 @@ class TestOptions:
# Should we copy the files of symlink the files for the test?
self.copy_files = False
+ # Should the test be run in a cross-compiled tree?
+ # None: infer from test function
+ # True: run when --target-wrapper is set
+ # False: do not run in cross-compiled trees
+ self.cross_okay = None # type: Optional[bool]
+
# The extra hadrian dependencies we need for this particular test
self.hadrian_deps = set(["test:ghc"]) # type: Set[str]
=====================================
testsuite/driver/testlib.py
=====================================
@@ -90,6 +90,10 @@ def setLocalTestOpts(opts: TestOptions) -> None:
global testopts_local
testopts_local.x = opts
+def isCross() -> bool:
+ """ Are we testing a cross-compiler? """
+ return config.target_wrapper is not None
+
def isCompilerStatsTest() -> bool:
opts = getTestOpts()
return bool(opts.is_compiler_stats_test)
@@ -255,7 +259,7 @@ def req_dynamic_hs( name, opts ):
opts.expect = 'fail'
def req_interp( name, opts ):
- if not config.have_interp:
+ if not config.have_interp or isCross():
opts.expect = 'fail'
# JS backend doesn't provide an interpreter yet
js_skip(name, opts)
@@ -1080,14 +1084,21 @@ def test_common_work(name: TestName, opts,
all_ways = [WayName('ghci')]
else:
all_ways = []
+ if isCross():
+ opts.cross_okay = False
elif func in [makefile_test, run_command]:
# makefile tests aren't necessarily runtime or compile-time
# specific. Assume we can run them in all ways. See #16042 for what
# happened previously.
all_ways = config.compile_ways + config.run_ways
+ if isCross():
+ opts.cross_okay = False
else:
all_ways = [WayName('normal')]
+ if isCross() and opts.cross_okay is False:
+ opts.skip = True
+
# A test itself can request extra ways by setting opts.extra_ways
all_ways = list(OrderedDict.fromkeys(all_ways + [way for way in opts.extra_ways if way not in all_ways]))
@@ -1813,7 +1824,10 @@ def simple_run(name: TestName, way: WayName, prog: str, extra_run_opts: str) ->
stats_args = ''
# Put extra_run_opts last: extra_run_opts('+RTS foo') should work.
- cmd = ' '.join([prog, stats_args, my_rts_flags, extra_run_opts])
+ args = [prog, stats_args, my_rts_flags, extra_run_opts]
+ if config.target_wrapper is not None:
+ args = [config.target_wrapper] + args
+ cmd = ' '.join(args)
if opts.cmd_wrapper is not None:
cmd = opts.cmd_wrapper(cmd)
=====================================
testsuite/tests/codeGen/should_run/MulMayOflo.hs
=====================================
@@ -0,0 +1,9 @@
+{-# LANGUAGE GHCForeignImportPrim, MagicHash, UnliftedFFITypes #-}
+
+module Main where
+import GHC.Exts
+
+foreign import prim "runCmmzh" runCmm# :: Int# -> Int#
+
+main :: IO ()
+main = (print . show) (I# (runCmm# 0#))
=====================================
testsuite/tests/codeGen/should_run/MulMayOflo_cmm.cmm
=====================================
@@ -0,0 +1,4 @@
+runCmmzh() {
+
+ return(0);
+}
=====================================
testsuite/tests/codeGen/should_run/all.T
=====================================
@@ -229,3 +229,8 @@ test('T20640b', normal, compile_and_run, [''])
test('T22296',[only_ways(llvm_ways)
,unless(arch('x86_64'), skip)],compile_and_run,[''])
test('T22798', normal, compile_and_run, ['-fregs-graph'])
+
+test('MulMayOflo',
+ [ omit_ways(['ghci']), js_skip],
+ multi_compile_and_run,
+ ['MulMayOflo', [('MulMayOflo_cmm.cmm', '')], ''])
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/c5024bcbe41c5cab6429185fa4d65c67f87b8aa1...51010f35d6df756abbb57bf0ee1ceaaaa205a2bf
--
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/c5024bcbe41c5cab6429185fa4d65c67f87b8aa1...51010f35d6df756abbb57bf0ee1ceaaaa205a2bf
You're receiving this email because of your account on gitlab.haskell.org.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20230607/9694b483/attachment-0001.html>
More information about the ghc-commits
mailing list