[Git][ghc/ghc][wip/9.2.6-backports] 11 commits: ghc-the-library: Retain cafs in both static in dynamic builds.

Zubin (@wz1000) gitlab at gitlab.haskell.org
Thu Feb 2 20:32:29 UTC 2023



Zubin pushed to branch wip/9.2.6-backports at Glasgow Haskell Compiler / GHC


Commits:
a02bfe28 by Andreas Klebinger at 2023-02-03T02:01:23+05:30
ghc-the-library: Retain cafs in both static in dynamic builds.

We use keepCAFsForGHCi.c to force -fkeep-cafs behaviour by using a
__attribute__((constructor)) function.

This broke for static builds where the linker discarded the object file
since it was not reverenced from any exported code. We fix this by
asserting that the flag is enabled using a function in the same module
as the constructor. Which causes the object file to be retained by the
linker, which in turn causes the constructor the be run in static builds.

This changes nothing for dynamic builds using the ghc library. But causes
static to also retain CAFs (as we expect them to).

Fixes #22417.

-------------------------
Metric Decrease:
    T21839r
-------------------------

(cherry picked from commit 08ba87200ff068aa37cac082e61ee7e2d534daf5)

- - - - -
883068ff by Matthew Pickering at 2023-02-03T02:01:23+05:30
T10955: Set DYLD_LIBRARY_PATH for darwin

The correct path to direct the dynamic linker on darwin is
DYLD_LIBRARY_PATH rather than LD_LIBRARY_PATH. On recent versions of OSX
using LD_LIBRARY_PATH seems to have stopped working.

For more reading see:

https://stackoverflow.com/questions/3146274/is-it-ok-to-use-dyld-library-path-on-mac-os-x-and-whats-the-dynamic-library-s
(cherry picked from commit a960ca817d6ad0109ea6edda50da3902cc538e86)

- - - - -
898cf5ba by Matthew Pickering at 2023-02-03T02:01:23+05:30
Skip T18623 on darwin (to add to the long list of OSs)

On recent versions of OSX, running `ulimit -v` results in

```
ulimit: setrlimit failed: invalid argument
```

Time is too short to work out what random stuff Apple has been doing
with ulimit, so just skip the test like we do for other platforms.

(cherry picked from commit 734847108420cf826a807c30ad54651659cf3a08)

- - - - -
ec15304a by Matthew Pickering at 2023-02-03T02:01:23+05:30
Pass -Wl,-no_fixup_chains to ld64 when appropiate

Recent versions of MacOS use a version of ld where `-fixup_chains` is on by default.
This is incompatible with our usage of `-undefined dynamic_lookup`. Therefore we
explicitly disable `fixup-chains` by passing `-no_fixup_chains` to the linker on
darwin. This results in a warning of the form:

ld: warning: -undefined dynamic_lookup may not work with chained fixups

The manual explains the incompatible nature of these two flags:

     -undefined treatment
             Specifies how undefined symbols are to be treated. Options are: error, warning,
             suppress, or dynamic_lookup.  The default is error. Note: dynamic_lookup that
             depends on lazy binding will not work with chained fixups.

A relevant ticket is #22429

Here are also a few other links which are relevant to the issue:

Official comment: https://developer.apple.com/forums/thread/719961

More relevant links:

https://openradar.appspot.com/radar?id=5536824084660224

https://github.com/python/cpython/issues/97524

Note in release notes: https://developer.apple.com/documentation/xcode-release-notes/xcode-13-releas    e-notes

(cherry picked from commit 8c0ea25fb4a27d4729aabf73f4c00b912bb0c58d)

- - - - -
f3c969a9 by Cheng Shao at 2023-02-03T02:01:23+05:30
Fix typo in recent darwin tests fix

Corrects a typo in !9647. Otherwise T18623 will still fail on darwin
and stall other people's work.

(cherry picked from commit c45a5fffef2c76efbf5d3a009c3f6d0244a63f0d)

- - - - -
59f2862d by Ben Gamari at 2023-02-03T02:01:23+05:30
nativeGen/AArch64: Fix debugging output

Previously various panics would rely on a half-written Show
instance, leading to very unhelpful errors. Fix this.

See #22798.

(cherry picked from commit be417a47c7695998dea0adc05489a7b8838a78b6)

- - - - -
f5f1a5cd by Ben Gamari at 2023-02-03T02:01:23+05:30
nativeGen: Teach graph-colouring allocator that x18 is unusable

Previously trivColourable for AArch64 claimed that at 18 registers were
trivially-colourable. This is incorrect as x18 is reserved by the platform on
AArch64/Darwin.

See #22798.

(cherry picked from commit 30989d137b8f3a8fddbfd116e04b48f23c24f86c)

- - - - -
ca157ac5 by Ben Gamari at 2023-02-03T02:01:23+05:30
nativeGen/AArch64: Fix graph-colouring allocator

Previously various `Instr` queries used by the graph-colouring allocator
failed to handle a few pseudo-instructions. This manifested in compiler
panicks while compiling `SHA`, which uses `-fregs-graph`.

Fixes #22798.

(cherry picked from commit 7566fd9de38c67360c090f828923d41587af519c)

- - - - -
e07afc94 by Ben Gamari at 2023-02-03T02:01:23+05:30
testsuite: Add regression test for #22798

(cherry picked from commit 2cb500a5ee1a31dfe1a2cdd71f175442026eb082)

- - - - -
ff480c77 by Zubin Duggal at 2023-02-03T02:01:23+05:30
hadrian: enable -haddock in perf flavour (#22734)

- - - - -
bcb79aea by Zubin Duggal at 2023-02-03T02:01:24+05:30
Fix warnings

- - - - -


13 changed files:

- compiler/GHC.hs
- compiler/GHC/CmmToAsm/AArch64/Instr.hs
- compiler/GHC/CmmToAsm/Reg/Graph/TrivColorable.hs
- compiler/cbits/keepCAFsForGHCi.c
- compiler/ghc.mk
- configure.ac
- hadrian/src/Settings/Flavours/Performance.hs
- + m4/fp_ld_no_fixup_chains.m4
- + testsuite/tests/codeGen/should_run/T22798.hs
- testsuite/tests/codeGen/should_run/all.T
- testsuite/tests/ghci/T16392/T16392.script
- testsuite/tests/ghci/linking/dyn/Makefile
- testsuite/tests/rts/T18623/all.T


Changes:

=====================================
compiler/GHC.hs
=====================================
@@ -554,7 +554,12 @@ withCleanupSession ghc = ghc `MC.finally` cleanup
 
 initGhcMonad :: GhcMonad m => Maybe FilePath -> m ()
 initGhcMonad mb_top_dir
-  = do { env <- liftIO $
+  = do { -- The call to c_keepCAFsForGHCi must not be optimized away. Even in non-debug builds.
+         -- So we can't use assertM here.
+         -- See Note [keepCAFsForGHCi] in keepCAFsForGHCi.c for details about why.
+         !keep_cafs <- liftIO $ c_keepCAFsForGHCi
+       ; MASSERT( keep_cafs )
+       ; env <- liftIO $
                 do { top_dir <- findTopDir mb_top_dir
                    ; mySettings <- initSysTools top_dir
                    ; myLlvmConfig <- lazyInitLlvmConfig top_dir
@@ -600,7 +605,6 @@ checkBrokenTablesNextToCode' logger dflags
         arch = platformArch platform
         tablesNextToCode = platformTablesNextToCode platform
 
-
 -- %************************************************************************
 -- %*                                                                      *
 --             Flags & settings
@@ -1931,3 +1935,5 @@ instance Exception GhcApiError
 mkApiErr :: DynFlags -> SDoc -> GhcApiError
 mkApiErr dflags msg = GhcApiError (showSDoc dflags msg)
 
+foreign import ccall unsafe "keepCAFsForGHCi"
+    c_keepCAFsForGHCi   :: IO Bool


=====================================
compiler/GHC/CmmToAsm/AArch64/Instr.hs
=====================================
@@ -73,6 +73,11 @@ instance Outputable RegUsage where
 regUsageOfInstr :: Platform -> Instr -> RegUsage
 regUsageOfInstr platform instr = case instr of
   ANN _ i                  -> regUsageOfInstr platform i
+  COMMENT{}                -> usage ([], [])
+  PUSH_STACK_FRAME         -> usage ([], [])
+  POP_STACK_FRAME          -> usage ([], [])
+  DELTA{}                  -> usage ([], [])
+
   -- 1. Arithmetic Instructions ------------------------------------------------
   ADD dst src1 src2        -> usage (regOp src1 ++ regOp src2, regOp dst)
   CMN l r                  -> usage (regOp l ++ regOp r, [])
@@ -137,7 +142,7 @@ regUsageOfInstr platform instr = case instr of
   FCVTZS dst src           -> usage (regOp src, regOp dst)
   FABS dst src             -> usage (regOp src, regOp dst)
 
-  _ -> panic "regUsageOfInstr"
+  _ -> panic $ "regUsageOfInstr: " ++ instrCon instr
 
   where
         -- filtering the usage is necessary, otherwise the register
@@ -203,7 +208,11 @@ callerSavedRegisters
 patchRegsOfInstr :: Instr -> (Reg -> Reg) -> Instr
 patchRegsOfInstr instr env = case instr of
     -- 0. Meta Instructions
-    ANN d i        -> ANN d (patchRegsOfInstr i env)
+    ANN d i          -> ANN d (patchRegsOfInstr i env)
+    COMMENT{}        -> instr
+    PUSH_STACK_FRAME -> instr
+    POP_STACK_FRAME  -> instr
+    DELTA{}          -> instr
     -- 1. Arithmetic Instructions ----------------------------------------------
     ADD o1 o2 o3   -> ADD (patchOp o1) (patchOp o2) (patchOp o3)
     CMN o1 o2      -> CMN (patchOp o1) (patchOp o2)
@@ -269,8 +278,7 @@ patchRegsOfInstr instr env = case instr of
     SCVTF o1 o2    -> SCVTF (patchOp o1) (patchOp o2)
     FCVTZS o1 o2   -> FCVTZS (patchOp o1) (patchOp o2)
     FABS o1 o2     -> FABS (patchOp o1) (patchOp o2)
-
-    _ -> pprPanic "patchRegsOfInstr" (text $ show instr)
+    _              -> panic $ "patchRegsOfInstr: " ++ instrCon instr
     where
         patchOp :: Operand -> Operand
         patchOp (OpReg w r) = OpReg w (env r)
@@ -326,7 +334,7 @@ patchJumpInstr instr patchF
         B (TBlock bid) -> B (TBlock (patchF bid))
         BL (TBlock bid) ps rs -> BL (TBlock (patchF bid)) ps rs
         BCOND c (TBlock bid) -> BCOND c (TBlock (patchF bid))
-        _ -> pprPanic "patchJumpInstr" (text $ show instr)
+        _ -> panic $ "patchJumpInstr: " ++ instrCon instr
 
 -- -----------------------------------------------------------------------------
 -- Note [Spills and Reloads]
@@ -638,10 +646,69 @@ data Instr
     -- Float ABSolute value
     | FABS Operand Operand
 
-instance Show Instr where
-    show (LDR _f o1 o2) = "LDR " ++ show o1 ++ ", " ++ show o2
-    show (MOV o1 o2) = "MOV " ++ show o1 ++ ", " ++ show o2
-    show _ = "missing"
+instrCon :: Instr -> String
+instrCon i =
+    case i of
+      COMMENT{} -> "COMMENT"
+      MULTILINE_COMMENT{} -> "COMMENT"
+      ANN{} -> "ANN"
+      LOCATION{} -> "LOCATION"
+      LDATA{} -> "LDATA"
+      NEWBLOCK{} -> "NEWBLOCK"
+      DELTA{} -> "DELTA"
+      SXTB{} -> "SXTB"
+      UXTB{} -> "UXTB"
+      SXTH{} -> "SXTH"
+      UXTH{} -> "UXTH"
+      PUSH_STACK_FRAME{} -> "PUSH_STACK_FRAME"
+      POP_STACK_FRAME{} -> "POP_STACK_FRAME"
+      ADD{} -> "ADD"
+      CMN{} -> "CMN"
+      CMP{} -> "CMP"
+      MSUB{} -> "MSUB"
+      MUL{} -> "MUL"
+      NEG{} -> "NEG"
+      SDIV{} -> "SDIV"
+      SMULH{} -> "SMULH"
+      SMULL{} -> "SMULL"
+      SUB{} -> "SUB"
+      UDIV{} -> "UDIV"
+      SBFM{} -> "SBFM"
+      UBFM{} -> "UBFM"
+      SBFX{} -> "SBFX"
+      UBFX{} -> "UBFX"
+      AND{} -> "AND"
+      ANDS{} -> "ANDS"
+      ASR{} -> "ASR"
+      BIC{} -> "BIC"
+      BICS{} -> "BICS"
+      EON{} -> "EON"
+      EOR{} -> "EOR"
+      LSL{} -> "LSL"
+      LSR{} -> "LSR"
+      MOV{} -> "MOV"
+      MOVK{} -> "MOVK"
+      MVN{} -> "MVN"
+      ORN{} -> "ORN"
+      ORR{} -> "ORR"
+      ROR{} -> "ROR"
+      TST{} -> "TST"
+      STR{} -> "STR"
+      LDR{} -> "LDR"
+      STP{} -> "STP"
+      LDP{} -> "LDP"
+      CSET{} -> "CSET"
+      CBZ{} -> "CBZ"
+      CBNZ{} -> "CBNZ"
+      J{} -> "J"
+      B{} -> "B"
+      BL{} -> "BL"
+      BCOND{} -> "BCOND"
+      DMBSY{} -> "DMBSY"
+      FCVT{} -> "FCVT"
+      SCVTF{} -> "SCVTF"
+      FCVTZS{} -> "FCVTZS"
+      FABS{} -> "FABS"
 
 data Target
     = TBlock BlockId
@@ -769,11 +836,11 @@ opRegUExt W64 r = OpRegExt W64 r EUXTX 0
 opRegUExt W32 r = OpRegExt W32 r EUXTW 0
 opRegUExt W16 r = OpRegExt W16 r EUXTH 0
 opRegUExt W8  r = OpRegExt W8  r EUXTB 0
-opRegUExt w  _r = pprPanic "opRegUExt" (text $ show w)
+opRegUExt w  _r = pprPanic "opRegUExt" (ppr w)
 
 opRegSExt :: Width -> Reg -> Operand
 opRegSExt W64 r = OpRegExt W64 r ESXTX 0
 opRegSExt W32 r = OpRegExt W32 r ESXTW 0
 opRegSExt W16 r = OpRegExt W16 r ESXTH 0
 opRegSExt W8  r = OpRegExt W8  r ESXTB 0
-opRegSExt w  _r = pprPanic "opRegSExt" (text $ show w)
+opRegSExt w  _r = pprPanic "opRegSExt" (ppr w)


=====================================
compiler/GHC/CmmToAsm/Reg/Graph/TrivColorable.hs
=====================================
@@ -115,10 +115,8 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcInteger conflicts excl
                             ArchSPARC64   -> panic "trivColorable ArchSPARC64"
                             ArchPPC_64 _  -> 15
                             ArchARM _ _ _ -> panic "trivColorable ArchARM"
-                            -- We should be able to allocate *a lot* more in princple.
-                            -- essentially all 32 - SP, so 31, we'd trash the link reg
-                            -- as well as the platform and all others though.
-                            ArchAArch64   -> 18
+                            -- N.B. x18 is reserved by the platform on AArch64/Darwin
+                            ArchAArch64   -> 17
                             ArchAlpha     -> panic "trivColorable ArchAlpha"
                             ArchMipseb    -> panic "trivColorable ArchMipseb"
                             ArchMipsel    -> panic "trivColorable ArchMipsel"


=====================================
compiler/cbits/keepCAFsForGHCi.c
=====================================
@@ -1,15 +1,35 @@
 #include <Rts.h>
+#include <ghcversion.h>
 
+// Note [keepCAFsForGHCi]
+// ~~~~~~~~~~~~~~~~~~~~~~
 // This file is only included in the dynamic library.
 // It contains an __attribute__((constructor)) function (run prior to main())
 // which sets the keepCAFs flag in the RTS, before any Haskell code is run.
 // This is required so that GHCi can use dynamic libraries instead of HSxyz.o
 // files.
+//
+// For static builds we have to guarantee that the linker loads this object file
+// to ensure the constructor gets run and not discarded. If the object is part of
+// an archive and not otherwise referenced the linker would ignore the object.
+// To avoid this:
+// * When initializing a GHC session in initGhcMonad we assert keeping cafs has been
+//   enabled by calling keepCAFsForGHCi.
+// * This causes the GHC module from the ghc package to carry a reference to this object
+//   file.
+// * Which in turn ensures the linker doesn't discard this object file, causing
+//   the constructor to be run, allowing the assertion to succeed in the first place
+//   as keepCAFs will have been set already during initialization of constructors.
 
-static void keepCAFsForGHCi(void) __attribute__((constructor));
 
-static void keepCAFsForGHCi(void)
+
+bool keepCAFsForGHCi(void) __attribute__((constructor));
+
+bool keepCAFsForGHCi(void)
 {
-    keepCAFs = 1;
+    bool was_set = keepCAFs;
+    setKeepCAFs();
+    return was_set;
 }
 
+


=====================================
compiler/ghc.mk
=====================================
@@ -288,20 +288,6 @@ $(eval $(call build-package,compiler,stage1,0))
 $(eval $(call build-package,compiler,stage2,1))
 $(eval $(call build-package,compiler,stage3,2))
 
-# We only want to turn keepCAFs on if we will be loading dynamic
-# Haskell libraries with GHCi. We therefore filter the object file
-# out for non-dynamic ways.
-define keepCAFsForGHCiDynOnly
-# $1 = stage
-# $2 = way
-ifeq "$$(findstring dyn, $2)" ""
-compiler_stage$1_$2_C_OBJS := $$(filter-out %/keepCAFsForGHCi.$$($2_osuf),$$(compiler_stage$1_$2_C_OBJS))
-endif
-endef
-$(foreach w,$(compiler_stage1_WAYS),$(eval $(call keepCAFsForGHCiDynOnly,1,$w)))
-$(foreach w,$(compiler_stage2_WAYS),$(eval $(call keepCAFsForGHCiDynOnly,2,$w)))
-$(foreach w,$(compiler_stage3_WAYS),$(eval $(call keepCAFsForGHCiDynOnly,3,$w)))
-
 # after build-package, because that adds --enable-library-for-ghci
 # to compiler_stage*_CONFIGURE_OPTS:
 # We don't build the GHCi library for the ghc package. We can load it


=====================================
configure.ac
=====================================
@@ -780,6 +780,10 @@ FPTOOLS_SET_C_LD_FLAGS([target],[CONF_CC_OPTS_STAGE1],[CONF_GCC_LINKER_OPTS_STAG
 FPTOOLS_SET_C_LD_FLAGS([target],[CONF_CC_OPTS_STAGE2],[CONF_GCC_LINKER_OPTS_STAGE2],[CONF_LD_LINKER_OPTS_STAGE2],[CONF_CPP_OPTS_STAGE2])
 # Stage 3 won't be supported by cross-compilation
 
+FP_LD_NO_FIXUP_CHAINS([target], [LDFLAGS])
+FP_LD_NO_FIXUP_CHAINS([build], [CONF_GCC_LINKER_OPTS_STAGE0])
+FP_LD_NO_FIXUP_CHAINS([target], [CONF_GCC_LINKER_OPTS_STAGE1])
+FP_LD_NO_FIXUP_CHAINS([target], [CONF_GCC_LINKER_OPTS_STAGE2])
 dnl ** See whether cc supports --target=<triple> and set
 dnl CONF_CC_OPTS_STAGE[12] accordingly.
 FP_CC_SUPPORTS_TARGET


=====================================
hadrian/src/Settings/Flavours/Performance.hs
=====================================
@@ -13,6 +13,6 @@ performanceFlavour = defaultFlavour
 performanceArgs :: Args
 performanceArgs = sourceArgs SourceArgs
     { hsDefault  = pure ["-O", "-H64m"]
-    , hsLibrary  = notStage0 ? arg "-O2"
+    , hsLibrary  = mconcat [notStage0 ? arg "-O2", notStage0 ? arg "-haddock"]
     , hsCompiler = pure ["-O2"]
     , hsGhc      = mconcat [stage0 ? arg "-O", notStage0 ? arg "-O2"] }


=====================================
m4/fp_ld_no_fixup_chains.m4
=====================================
@@ -0,0 +1,24 @@
+# FP_LD_NO_FIXUP_CHAINS
+# --------------------
+# See if whether we are using a version of ld64 on darwin platforms which
+# requires us to pass -no_fixup_chains
+#
+# $1 = the platform
+# $2 = the name of the linker flags variable when linking with $CC
+AC_DEFUN([FP_LD_NO_FIXUP_CHAINS], [
+    case $$1 in
+      *-darwin)
+      AC_MSG_CHECKING([whether ld64 requires -no_fixup_chains])
+      echo 'int main(void) {return 0;}' > conftest.c
+      if $CC -o conftest.o -Wl,-no_fixup_chains conftest.c > /dev/null 2>&1
+      then
+          $2="-Wl,-no_fixup_chains"
+          AC_MSG_RESULT([yes])
+      else
+          AC_MSG_RESULT([no])
+      fi
+      rm -f conftest.c conftest.o
+      ;;
+
+    esac
+])


=====================================
testsuite/tests/codeGen/should_run/T22798.hs
=====================================
@@ -0,0 +1,375 @@
+-- Derived from SHA-1.5.0.0
+-- This previously uncovered cases left unhandled in the AArch64 NCG (#22798).
+
+{-# LANGUAGE BangPatterns, CPP, FlexibleInstances #-}
+module Main (main) where
+
+import Data.Binary
+import Data.Binary.Get
+import Data.Binary.Put
+import Data.Bits
+import Data.ByteString.Lazy(ByteString)
+import Data.ByteString.Lazy.Char8 as BSC (pack)
+import qualified Data.ByteString.Lazy as BS
+import Data.Char (intToDigit)
+import Control.Monad
+
+newtype Digest t = Digest ByteString
+
+data SHA512State = SHA512S !Word64 !Word64 !Word64 !Word64
+                           !Word64 !Word64 !Word64 !Word64
+
+initialSHA512State :: SHA512State
+initialSHA512State = SHA512S 0x6a09e667f3bcc908 0xbb67ae8584caa73b
+                             0x3c6ef372fe94f82b 0xa54ff53a5f1d36f1
+                             0x510e527fade682d1 0x9b05688c2b3e6c1f
+                             0x1f83d9abfb41bd6b 0x5be0cd19137e2179
+
+
+synthesizeSHA512 :: SHA512State -> Put
+synthesizeSHA512 (SHA512S a b c d e f g h) = do
+  putWord64be a
+  putWord64be b
+  putWord64be c
+  putWord64be d
+  putWord64be e
+  putWord64be f
+  putWord64be g
+  putWord64be h
+
+getSHA512 :: Get SHA512State
+getSHA512 = do
+  a <- getWord64be
+  b <- getWord64be
+  c <- getWord64be
+  d <- getWord64be
+  e <- getWord64be
+  f <- getWord64be
+  g <- getWord64be
+  h <- getWord64be
+  return $ SHA512S a b c d e f g h
+
+instance Binary SHA512State where
+  put = synthesizeSHA512
+  get = getSHA512
+
+padSHA512 :: ByteString -> ByteString
+padSHA512 = generic_pad 896 1024 128
+
+generic_pad :: Word64 -> Word64 -> Int -> ByteString -> ByteString
+generic_pad a b lSize bs = BS.concat [bs, pad_bytes, pad_length]
+ where
+  l = fromIntegral $ BS.length bs * 8
+  k = calc_k a b l
+  -- INVARIANT: k is necessarily > 0, and (k + 1) is a multiple of 8.
+  k_bytes    = (k + 1) `div` 8
+  pad_bytes  = BS.singleton 0x80 `BS.append` BS.replicate nZeroBytes 0
+  nZeroBytes = fromIntegral $ k_bytes - 1
+  pad_length = toBigEndianBS lSize l
+
+-- Given a, b, and l, calculate the smallest k such that (l + 1 + k) mod b = a.
+calc_k :: Word64 -> Word64 -> Word64 -> Word64
+calc_k a b l =
+  if r <= -1
+    then fromIntegral r + b
+    else fromIntegral r
+ where
+  r = toInteger a - toInteger l `mod` toInteger b - 1
+
+toBigEndianBS :: (Integral a, Bits a) => Int -> a -> ByteString
+toBigEndianBS s val = BS.pack $ map getBits [s - 8, s - 16 .. 0]
+ where
+   getBits x = fromIntegral $ (val `shiftR` x) .&. 0xFF
+
+{-# SPECIALIZE ch :: Word64 -> Word64 -> Word64 -> Word64 #-}
+ch :: Bits a => a -> a -> a -> a
+ch x y z = (x .&. y) `xor` (complement x .&. z)
+
+{-# SPECIALIZE maj :: Word64 -> Word64 -> Word64 -> Word64 #-}
+maj :: Bits a => a -> a -> a -> a
+maj x y z = (x .&. (y .|. z)) .|. (y .&. z)
+-- note:
+--   the original functions is (x & y) ^ (x & z) ^ (y & z)
+--   if you fire off truth tables, this is equivalent to
+--     (x & y) | (x & z) | (y & z)
+--   which you can the use distribution on:
+--     (x & (y | z)) | (y & z)
+--   which saves us one operation.
+
+bsig512_0 :: Word64 -> Word64
+bsig512_0 x = rotate x (-28) `xor` rotate x (-34) `xor` rotate x (-39)
+
+bsig512_1 :: Word64 -> Word64
+bsig512_1 x = rotate x (-14) `xor` rotate x (-18) `xor` rotate x (-41)
+
+lsig512_0 :: Word64 -> Word64
+lsig512_0 x = rotate x (-1) `xor` rotate x (-8) `xor` shiftR x 7
+
+lsig512_1 :: Word64 -> Word64
+lsig512_1 x = rotate x (-19) `xor` rotate x (-61) `xor` shiftR x 6
+
+data SHA512Sched = SHA512Sched !Word64 !Word64 !Word64 !Word64 !Word64 --  0- 4
+                               !Word64 !Word64 !Word64 !Word64 !Word64 --  5- 9
+                               !Word64 !Word64 !Word64 !Word64 !Word64 -- 10-14
+                               !Word64 !Word64 !Word64 !Word64 !Word64 -- 15-19
+                               !Word64 !Word64 !Word64 !Word64 !Word64 -- 20-24
+                               !Word64 !Word64 !Word64 !Word64 !Word64 -- 25-29
+                               !Word64 !Word64 !Word64 !Word64 !Word64 -- 30-34
+                               !Word64 !Word64 !Word64 !Word64 !Word64 -- 35-39
+                               !Word64 !Word64 !Word64 !Word64 !Word64 -- 40-44
+                               !Word64 !Word64 !Word64 !Word64 !Word64 -- 45-49
+                               !Word64 !Word64 !Word64 !Word64 !Word64 -- 50-54
+                               !Word64 !Word64 !Word64 !Word64 !Word64 -- 55-59
+                               !Word64 !Word64 !Word64 !Word64 !Word64 -- 60-64
+                               !Word64 !Word64 !Word64 !Word64 !Word64 -- 65-69
+                               !Word64 !Word64 !Word64 !Word64 !Word64 -- 70-74
+                               !Word64 !Word64 !Word64 !Word64 !Word64 -- 75-79
+
+getSHA512Sched :: Get SHA512Sched
+getSHA512Sched = do
+  w00 <- getWord64be
+  w01 <- getWord64be
+  w02 <- getWord64be
+  w03 <- getWord64be
+  w04 <- getWord64be
+  w05 <- getWord64be
+  w06 <- getWord64be
+  w07 <- getWord64be
+  w08 <- getWord64be
+  w09 <- getWord64be
+  w10 <- getWord64be
+  w11 <- getWord64be
+  w12 <- getWord64be
+  w13 <- getWord64be
+  w14 <- getWord64be
+  w15 <- getWord64be
+  let w16 = lsig512_1 w14 + w09 + lsig512_0 w01 + w00
+      w17 = lsig512_1 w15 + w10 + lsig512_0 w02 + w01
+      w18 = lsig512_1 w16 + w11 + lsig512_0 w03 + w02
+      w19 = lsig512_1 w17 + w12 + lsig512_0 w04 + w03
+      w20 = lsig512_1 w18 + w13 + lsig512_0 w05 + w04
+      w21 = lsig512_1 w19 + w14 + lsig512_0 w06 + w05
+      w22 = lsig512_1 w20 + w15 + lsig512_0 w07 + w06
+      w23 = lsig512_1 w21 + w16 + lsig512_0 w08 + w07
+      w24 = lsig512_1 w22 + w17 + lsig512_0 w09 + w08
+      w25 = lsig512_1 w23 + w18 + lsig512_0 w10 + w09
+      w26 = lsig512_1 w24 + w19 + lsig512_0 w11 + w10
+      w27 = lsig512_1 w25 + w20 + lsig512_0 w12 + w11
+      w28 = lsig512_1 w26 + w21 + lsig512_0 w13 + w12
+      w29 = lsig512_1 w27 + w22 + lsig512_0 w14 + w13
+      w30 = lsig512_1 w28 + w23 + lsig512_0 w15 + w14
+      w31 = lsig512_1 w29 + w24 + lsig512_0 w16 + w15
+      w32 = lsig512_1 w30 + w25 + lsig512_0 w17 + w16
+      w33 = lsig512_1 w31 + w26 + lsig512_0 w18 + w17
+      w34 = lsig512_1 w32 + w27 + lsig512_0 w19 + w18
+      w35 = lsig512_1 w33 + w28 + lsig512_0 w20 + w19
+      w36 = lsig512_1 w34 + w29 + lsig512_0 w21 + w20
+      w37 = lsig512_1 w35 + w30 + lsig512_0 w22 + w21
+      w38 = lsig512_1 w36 + w31 + lsig512_0 w23 + w22
+      w39 = lsig512_1 w37 + w32 + lsig512_0 w24 + w23
+      w40 = lsig512_1 w38 + w33 + lsig512_0 w25 + w24
+      w41 = lsig512_1 w39 + w34 + lsig512_0 w26 + w25
+      w42 = lsig512_1 w40 + w35 + lsig512_0 w27 + w26
+      w43 = lsig512_1 w41 + w36 + lsig512_0 w28 + w27
+      w44 = lsig512_1 w42 + w37 + lsig512_0 w29 + w28
+      w45 = lsig512_1 w43 + w38 + lsig512_0 w30 + w29
+      w46 = lsig512_1 w44 + w39 + lsig512_0 w31 + w30
+      w47 = lsig512_1 w45 + w40 + lsig512_0 w32 + w31
+      w48 = lsig512_1 w46 + w41 + lsig512_0 w33 + w32
+      w49 = lsig512_1 w47 + w42 + lsig512_0 w34 + w33
+      w50 = lsig512_1 w48 + w43 + lsig512_0 w35 + w34
+      w51 = lsig512_1 w49 + w44 + lsig512_0 w36 + w35
+      w52 = lsig512_1 w50 + w45 + lsig512_0 w37 + w36
+      w53 = lsig512_1 w51 + w46 + lsig512_0 w38 + w37
+      w54 = lsig512_1 w52 + w47 + lsig512_0 w39 + w38
+      w55 = lsig512_1 w53 + w48 + lsig512_0 w40 + w39
+      w56 = lsig512_1 w54 + w49 + lsig512_0 w41 + w40
+      w57 = lsig512_1 w55 + w50 + lsig512_0 w42 + w41
+      w58 = lsig512_1 w56 + w51 + lsig512_0 w43 + w42
+      w59 = lsig512_1 w57 + w52 + lsig512_0 w44 + w43
+      w60 = lsig512_1 w58 + w53 + lsig512_0 w45 + w44
+      w61 = lsig512_1 w59 + w54 + lsig512_0 w46 + w45
+      w62 = lsig512_1 w60 + w55 + lsig512_0 w47 + w46
+      w63 = lsig512_1 w61 + w56 + lsig512_0 w48 + w47
+      w64 = lsig512_1 w62 + w57 + lsig512_0 w49 + w48
+      w65 = lsig512_1 w63 + w58 + lsig512_0 w50 + w49
+      w66 = lsig512_1 w64 + w59 + lsig512_0 w51 + w50
+      w67 = lsig512_1 w65 + w60 + lsig512_0 w52 + w51
+      w68 = lsig512_1 w66 + w61 + lsig512_0 w53 + w52
+      w69 = lsig512_1 w67 + w62 + lsig512_0 w54 + w53
+      w70 = lsig512_1 w68 + w63 + lsig512_0 w55 + w54
+      w71 = lsig512_1 w69 + w64 + lsig512_0 w56 + w55
+      w72 = lsig512_1 w70 + w65 + lsig512_0 w57 + w56
+      w73 = lsig512_1 w71 + w66 + lsig512_0 w58 + w57
+      w74 = lsig512_1 w72 + w67 + lsig512_0 w59 + w58
+      w75 = lsig512_1 w73 + w68 + lsig512_0 w60 + w59
+      w76 = lsig512_1 w74 + w69 + lsig512_0 w61 + w60
+      w77 = lsig512_1 w75 + w70 + lsig512_0 w62 + w61
+      w78 = lsig512_1 w76 + w71 + lsig512_0 w63 + w62
+      w79 = lsig512_1 w77 + w72 + lsig512_0 w64 + w63
+  return $ SHA512Sched w00 w01 w02 w03 w04 w05 w06 w07 w08 w09
+                       w10 w11 w12 w13 w14 w15 w16 w17 w18 w19
+                       w20 w21 w22 w23 w24 w25 w26 w27 w28 w29
+                       w30 w31 w32 w33 w34 w35 w36 w37 w38 w39
+                       w40 w41 w42 w43 w44 w45 w46 w47 w48 w49
+                       w50 w51 w52 w53 w54 w55 w56 w57 w58 w59
+                       w60 w61 w62 w63 w64 w65 w66 w67 w68 w69
+                       w70 w71 w72 w73 w74 w75 w76 w77 w78 w79
+
+processSHA512Block :: SHA512State -> Get SHA512State
+processSHA512Block !s00@(SHA512S a00 b00 c00 d00 e00 f00 g00 h00) = do
+  (SHA512Sched w00 w01 w02 w03 w04 w05 w06 w07 w08 w09
+               w10 w11 w12 w13 w14 w15 w16 w17 w18 w19
+               w20 w21 w22 w23 w24 w25 w26 w27 w28 w29
+               w30 w31 w32 w33 w34 w35 w36 w37 w38 w39
+               w40 w41 w42 w43 w44 w45 w46 w47 w48 w49
+               w50 w51 w52 w53 w54 w55 w56 w57 w58 w59
+               w60 w61 w62 w63 w64 w65 w66 w67 w68 w69
+               w70 w71 w72 w73 w74 w75 w76 w77 w78 w79) <- getSHA512Sched
+  let s01 = step512 s00 0x428a2f98d728ae22 w00
+      s02 = step512 s01 0x7137449123ef65cd w01
+      s03 = step512 s02 0xb5c0fbcfec4d3b2f w02
+      s04 = step512 s03 0xe9b5dba58189dbbc w03
+      s05 = step512 s04 0x3956c25bf348b538 w04
+      s06 = step512 s05 0x59f111f1b605d019 w05
+      s07 = step512 s06 0x923f82a4af194f9b w06
+      s08 = step512 s07 0xab1c5ed5da6d8118 w07
+      s09 = step512 s08 0xd807aa98a3030242 w08
+      s10 = step512 s09 0x12835b0145706fbe w09
+      s11 = step512 s10 0x243185be4ee4b28c w10
+      s12 = step512 s11 0x550c7dc3d5ffb4e2 w11
+      s13 = step512 s12 0x72be5d74f27b896f w12
+      s14 = step512 s13 0x80deb1fe3b1696b1 w13
+      s15 = step512 s14 0x9bdc06a725c71235 w14
+      s16 = step512 s15 0xc19bf174cf692694 w15
+      s17 = step512 s16 0xe49b69c19ef14ad2 w16
+      s18 = step512 s17 0xefbe4786384f25e3 w17
+      s19 = step512 s18 0x0fc19dc68b8cd5b5 w18
+      s20 = step512 s19 0x240ca1cc77ac9c65 w19
+      s21 = step512 s20 0x2de92c6f592b0275 w20
+      s22 = step512 s21 0x4a7484aa6ea6e483 w21
+      s23 = step512 s22 0x5cb0a9dcbd41fbd4 w22
+      s24 = step512 s23 0x76f988da831153b5 w23
+      s25 = step512 s24 0x983e5152ee66dfab w24
+      s26 = step512 s25 0xa831c66d2db43210 w25
+      s27 = step512 s26 0xb00327c898fb213f w26
+      s28 = step512 s27 0xbf597fc7beef0ee4 w27
+      s29 = step512 s28 0xc6e00bf33da88fc2 w28
+      s30 = step512 s29 0xd5a79147930aa725 w29
+      s31 = step512 s30 0x06ca6351e003826f w30
+      s32 = step512 s31 0x142929670a0e6e70 w31
+      s33 = step512 s32 0x27b70a8546d22ffc w32
+      s34 = step512 s33 0x2e1b21385c26c926 w33
+      s35 = step512 s34 0x4d2c6dfc5ac42aed w34
+      s36 = step512 s35 0x53380d139d95b3df w35
+      s37 = step512 s36 0x650a73548baf63de w36
+      s38 = step512 s37 0x766a0abb3c77b2a8 w37
+      s39 = step512 s38 0x81c2c92e47edaee6 w38
+      s40 = step512 s39 0x92722c851482353b w39
+      s41 = step512 s40 0xa2bfe8a14cf10364 w40
+      s42 = step512 s41 0xa81a664bbc423001 w41
+      s43 = step512 s42 0xc24b8b70d0f89791 w42
+      s44 = step512 s43 0xc76c51a30654be30 w43
+      s45 = step512 s44 0xd192e819d6ef5218 w44
+      s46 = step512 s45 0xd69906245565a910 w45
+      s47 = step512 s46 0xf40e35855771202a w46
+      s48 = step512 s47 0x106aa07032bbd1b8 w47
+      s49 = step512 s48 0x19a4c116b8d2d0c8 w48
+      s50 = step512 s49 0x1e376c085141ab53 w49
+      s51 = step512 s50 0x2748774cdf8eeb99 w50
+      s52 = step512 s51 0x34b0bcb5e19b48a8 w51
+      s53 = step512 s52 0x391c0cb3c5c95a63 w52
+      s54 = step512 s53 0x4ed8aa4ae3418acb w53
+      s55 = step512 s54 0x5b9cca4f7763e373 w54
+      s56 = step512 s55 0x682e6ff3d6b2b8a3 w55
+      s57 = step512 s56 0x748f82ee5defb2fc w56
+      s58 = step512 s57 0x78a5636f43172f60 w57
+      s59 = step512 s58 0x84c87814a1f0ab72 w58
+      s60 = step512 s59 0x8cc702081a6439ec w59
+      s61 = step512 s60 0x90befffa23631e28 w60
+      s62 = step512 s61 0xa4506cebde82bde9 w61
+      s63 = step512 s62 0xbef9a3f7b2c67915 w62
+      s64 = step512 s63 0xc67178f2e372532b w63
+      s65 = step512 s64 0xca273eceea26619c w64
+      s66 = step512 s65 0xd186b8c721c0c207 w65
+      s67 = step512 s66 0xeada7dd6cde0eb1e w66
+      s68 = step512 s67 0xf57d4f7fee6ed178 w67
+      s69 = step512 s68 0x06f067aa72176fba w68
+      s70 = step512 s69 0x0a637dc5a2c898a6 w69
+      s71 = step512 s70 0x113f9804bef90dae w70
+      s72 = step512 s71 0x1b710b35131c471b w71
+      s73 = step512 s72 0x28db77f523047d84 w72
+      s74 = step512 s73 0x32caab7b40c72493 w73
+      s75 = step512 s74 0x3c9ebe0a15c9bebc w74
+      s76 = step512 s75 0x431d67c49c100d4c w75
+      s77 = step512 s76 0x4cc5d4becb3e42b6 w76
+      s78 = step512 s77 0x597f299cfc657e2a w77
+      s79 = step512 s78 0x5fcb6fab3ad6faec w78
+      s80 = step512 s79 0x6c44198c4a475817 w79
+      SHA512S a80 b80 c80 d80 e80 f80 g80 h80 = s80
+  return $ SHA512S (a00 + a80) (b00 + b80) (c00 + c80) (d00 + d80)
+                   (e00 + e80) (f00 + f80) (g00 + g80) (h00 + h80)
+
+{-# INLINE step512 #-}
+step512 :: SHA512State -> Word64 -> Word64 -> SHA512State
+step512 !(SHA512S a b c d e f g h) k w = SHA512S a' b' c' d' e' f' g' h'
+ where
+  t1 = h + bsig512_1 e + ch e f g + k + w
+  t2 = bsig512_0 a + maj a b c
+  h' = g
+  g' = f
+  f' = e
+  e' = d + t1
+  d' = c
+  c' = b
+  b' = a
+  a' = t1 + t2
+
+runSHA :: a -> (a -> Get a) -> ByteString -> a
+runSHA s nextChunk input = runGet (getAll s) input
+ where
+  getAll s_in = do
+    done <- isEmpty
+    if done
+      then return s_in
+      else nextChunk s_in >>= getAll
+
+sha512 :: ByteString -> Digest SHA512State
+sha512 bs_in = Digest bs_out
+ where
+  bs_pad = padSHA512 bs_in
+  fstate = runSHA initialSHA512State processSHA512Block bs_pad
+  bs_out = runPut $ synthesizeSHA512 fstate
+
+sha512_spec_tests :: [(String, String)]
+sha512_spec_tests =
+ [("abc",
+   "ddaf35a193617abacc417349ae20413112e6fa4e89a97ea20a9eeee64b55d39a" ++
+   "2192992a274fc1a836ba3c23a3feebbd454d4423643ce80e2a9ac94fa54ca49f"),
+  ("abcdefghbcdefghicdefghijdefghijkefghijklfghijklmghijklmn" ++
+   "hijklmnoijklmnopjklmnopqklmnopqrlmnopqrsmnopqrstnopqrstu",
+   "8e959b75dae313da8cf4f72814fc143f8f7779c6eb9f7fa17299aeadb6889018" ++
+   "501d289e4900f7e4331b99dec4b5433ac7d329eeb6dd26545e96e55b874be909"),
+  (replicate 1000000 'a',
+   "e718483d0ce769644e2e42c7bc15b4638e1f98b13b2044285632a803afa973eb" ++
+   "de0ff244877ea60a4cb0432ce577c31beb009c5c2c49aa2e4eadb217ad8cc09b")]
+
+showDigest :: Digest t -> String
+showDigest (Digest bs) = showDigestBS bs
+
+-- |Prints out a bytestring in hexadecimal. Just for convenience.
+showDigestBS :: ByteString -> String
+showDigestBS bs = foldr paddedShowHex [] (BS.unpack bs)
+ where
+   paddedShowHex x xs = intToDigit (fromIntegral (x `shiftR` 4))
+                      : intToDigit (fromIntegral (x .&. 0xf))
+                      : xs
+
+main :: IO ()
+main = do
+    sequence_
+        [ unless (digest == expected)
+            $ fail $ "failed: " ++ expected ++ " /= " ++ digest
+        | (str, expected) <- sha512_spec_tests
+        , let digest = showDigest (sha512 $ BSC.pack str)
+        ]


=====================================
testsuite/tests/codeGen/should_run/all.T
=====================================
@@ -219,3 +219,4 @@ test('CallConv', [when(unregisterised(), skip),
                   when(arch('x86_64'), extra_hc_opts('CallConv_x86_64.s')),
                   when(arch('aarch64'), extra_hc_opts('CallConv_aarch64.s'))],
      compile_and_run, [''])
+test('T22798', normal, compile_and_run, ['-fregs-graph'])


=====================================
testsuite/tests/ghci/T16392/T16392.script
=====================================
@@ -1,5 +1,7 @@
 :set -fobject-code
+import System.Mem
 :load A.hs
 c_two caf
+performMajorGC
 :load A.hs
 c_two caf


=====================================
testsuite/tests/ghci/linking/dyn/Makefile
=====================================
@@ -74,7 +74,7 @@ compile_libAB_dyn:
 	'$(TEST_HC)' $(MY_TEST_HC_OPTS) -odir "bin_dyn" -shared B.c -o "bin_dyn/$(call DLL,B)" -lA -L"./bin_dyn"
 	rm -f bin_dyn/*.a
 	'$(TEST_HC)' $(TEST_HC_OPTS) -ignore-dot-ghci -v0 -o "bin_dyn/$(call EXE,T10955dyn)" -L./bin_dyn -lB -lA T10955dyn.hs -v0
-	LD_LIBRARY_PATH=./bin_dyn ./bin_dyn/$(call EXE,T10955dyn)
+	DYLD_LIBRARY_PATH=./bin_dyn LD_LIBRARY_PATH=./bin_dyn ./bin_dyn/$(call EXE,T10955dyn)
 
 .PHONY: compile_libAS_impl_gcc
 compile_libAS_impl_gcc:


=====================================
testsuite/tests/rts/T18623/all.T
=====================================
@@ -5,7 +5,10 @@ test('T18623',
      # This keeps failing on aarch64-linux for reasons that are not
      # fully clear.  Maybe it needs a higher limit due to LLMV?
      when(arch('aarch64'), skip),
+     # Recent versions of osx report an error when running `ulimit -v`
+     when(opsys('darwin'), skip),
+     when(arch('powerpc64le'), skip),
      cmd_prefix('ulimit -v ' + str(1024 ** 2) + ' && '),
      ignore_stdout],
     run_command,
-    ['{compiler} --version'])
\ No newline at end of file
+    ['{compiler} --version'])



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/86e659f1fe85a4e6d46a439efb664b9dac152e38...bcb79aeadf0af55e4415896b70c893c8ff4ca87f

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/86e659f1fe85a4e6d46a439efb664b9dac152e38...bcb79aeadf0af55e4415896b70c893c8ff4ca87f
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20230202/1fc1b1b7/attachment-0001.html>


More information about the ghc-commits mailing list