[Git][ghc/ghc][wip/js-rts-fixmes] 9 commits: docs: 9.6 release notes for wasm backend

Josh Meredith (@JoshMeredith) gitlab at gitlab.haskell.org
Fri Feb 3 14:17:23 UTC 2023



Josh Meredith pushed to branch wip/js-rts-fixmes at Glasgow Haskell Compiler / GHC


Commits:
1ffe770c by Cheng Shao at 2023-02-02T09:40:38+00:00
docs: 9.6 release notes for wasm backend

- - - - -
0ada4547 by Matthew Pickering at 2023-02-02T11:39:44-05:00
Disable unfolding sharing for interface files with core definitions

Ticket #22807 pointed out that the RHS sharing was not compatible with
-fignore-interface-pragmas because the flag would remove unfoldings from
identifiers before the `extra-decls` field was populated.

For the 9.6 timescale the only solution is to disable this sharing,
which will make interface files bigger but this is acceptable for the
first release of `-fwrite-if-simplified-core`.

For 9.8 it would be good to fix this by implementing #20056 due to the
large number of other bugs that would fix.

I also improved the error message in tc_iface_binding to avoid the "no match
in record selector" error but it should never happen now as the entire
sharing logic is disabled.

Also added the currently broken test for #22807 which could be fixed by
!6080

Fixes #22807

- - - - -
7e2d3eb5 by lrzlin at 2023-02-03T05:23:27-05:00
Enable tables next to code for LoongArch64

- - - - -
2931712a by Wander Hillen at 2023-02-03T05:24:06-05:00
Move pthread and timerfd ticker implementations to separate files

- - - - -
41c4baf8 by Ben Gamari at 2023-02-03T05:24:44-05:00
base: Fix Note references in GHC.IO.Handle.Types

- - - - -
31358198 by Bodigrim at 2023-02-03T05:25:22-05:00
Bump submodule containers to 0.6.7

Metric Decrease:
    ManyConstructors
    T10421
    T12425
    T12707
    T13035
    T13379
    T15164
    T1969
    T783
    T9198
    T9961
    WWRec

- - - - -
8feb9301 by Ben Gamari at 2023-02-03T05:25:59-05:00
gitlab-ci: Eliminate redundant ghc --info output

Previously ci.sh would emit the output of `ghc --info` every time it ran
when using the nix toolchain. This produced a significant amount of
noise.

See #22861.
- - - - -
c4d417cd by Josh Meredith at 2023-02-03T14:17:14+00:00
Factor JS Rts generation for h$c{_,0,1,2} into h$c{n}

- - - - -
ad3bfdb7 by Josh Meredith at 2023-02-03T14:17:14+00:00
Cache names used commonly in JS backend RTS generation

- - - - -


23 changed files:

- .gitlab/ci.sh
- compiler/GHC/CmmToLlvm/Mangler.hs
- compiler/GHC/CoreToIface.hs
- compiler/GHC/IfaceToCore.hs
- compiler/GHC/JS/Make.hs
- compiler/GHC/StgToJS/Rts/Rts.hs
- compiler/ghc.cabal.in
- docs/users_guide/9.6.1-notes.rst
- libraries/base/GHC/IO/Handle/Types.hs
- libraries/containers
- libraries/ghci/GHCi/InfoTable.hsc
- m4/ghc_tables_next_to_code.m4
- rts/posix/Ticker.c
- rts/posix/ticker/Pthread.c
- + rts/posix/ticker/TimerFd.c
- testsuite/tests/driver/fat-iface/Makefile
- + testsuite/tests/driver/fat-iface/T22807.stdout
- + testsuite/tests/driver/fat-iface/T22807A.hs
- + testsuite/tests/driver/fat-iface/T22807B.hs
- + testsuite/tests/driver/fat-iface/T22807_ghci.hs
- + testsuite/tests/driver/fat-iface/T22807_ghci.script
- + testsuite/tests/driver/fat-iface/T22807_ghci.stdout
- testsuite/tests/driver/fat-iface/all.T


Changes:

=====================================
.gitlab/ci.sh
=====================================
@@ -214,8 +214,6 @@ function set_toolchain_paths() {
         cat toolchain.sh
       fi
       source toolchain.sh
-      info "--info for GHC for $NIX_SYSTEM"
-      $GHC --info
       ;;
     env)
       # These are generally set by the Docker image but
@@ -274,6 +272,11 @@ function setup() {
   show_tool CABAL
   show_tool HAPPY
   show_tool ALEX
+
+  info "====================================================="
+  info "ghc --info"
+  info "====================================================="
+  $GHC --info
 }
 
 function fetch_ghc() {


=====================================
compiler/GHC/CmmToLlvm/Mangler.hs
=====================================
@@ -38,7 +38,7 @@ llvmFixupAsm platform f1 f2 = {-# SCC "llvm_mangler" #-}
 
 -- | These are the rewrites that the mangler will perform
 rewrites :: [Rewrite]
-rewrites = [rewriteSymType, rewriteAVX, rewriteCall]
+rewrites = [rewriteSymType, rewriteAVX, rewriteCall, rewriteJump]
 
 type Rewrite = Platform -> B.ByteString -> Maybe B.ByteString
 
@@ -123,6 +123,29 @@ rewriteCall platform l
         removePlt = replaceOnce (B.pack "@plt") (B.pack "")
         appendInsn i = (`B.append` B.pack ("\n\t" ++ i))
 
+-- | This rewrites bl and b jump inst to avoid creating PLT entries for
+-- functions on loongarch64, because there is no separate call instruction
+-- for function calls in loongarch64. Also, this replacement will load
+-- the function address from the GOT, which is resolved to point to the
+-- real address of the function.
+rewriteJump :: Rewrite
+rewriteJump platform l
+  | not isLoongArch64 = Nothing
+  | isBL l            = Just $ replaceJump "bl" "$ra" "$ra" l
+  | isB l             = Just $ replaceJump "b" "$zero" "$t0" l
+  | otherwise         = Nothing
+  where
+    isLoongArch64 = platformArch platform == ArchLoongArch64
+    isBL = B.isPrefixOf (B.pack "bl\t")
+    isB = B.isPrefixOf (B.pack "b\t")
+
+    replaceJump jump rd rj l =
+        appendInsn ("jirl" ++ "\t" ++ rd ++ ", " ++ rj ++ ", 0") $ removeBracket $
+        replaceOnce (B.pack (jump ++ "\t%plt(")) (B.pack ("la\t" ++ rj ++ ", ")) l
+      where
+        removeBracket = replaceOnce (B.pack ")") (B.pack "")
+        appendInsn i = (`B.append` B.pack ("\n\t" ++ i))
+
 -- | @replaceOnce match replace bs@ replaces the first occurrence of the
 -- substring @match@ in @bs@ with @replace at .
 replaceOnce :: B.ByteString -> B.ByteString -> B.ByteString -> B.ByteString


=====================================
compiler/GHC/CoreToIface.hs
=====================================
@@ -604,8 +604,12 @@ toIfaceTopBind b =
                       IfLclTopBndr {} -> IfRhs (toIfaceExpr rhs)
           in (top_bndr, rhs')
 
-        already_has_unfolding b =
-                                -- The identifier has an unfolding, which we are going to serialise anyway
+        -- The sharing behaviour is currently disabled due to #22807, and relies on
+        -- finished #220056 to be re-enabled.
+        disabledDueTo22807 = True
+
+        already_has_unfolding b = not disabledDueTo22807
+                                && -- The identifier has an unfolding, which we are going to serialise anyway
                                 hasCoreUnfolding (realIdUnfolding b)
                                 -- But not a stable unfolding, we want the optimised unfoldings.
                                 && not (isStableUnfolding (realIdUnfolding b))
@@ -771,7 +775,10 @@ is that these NOINLINE'd functions now can't be profitably inlined
 outside of the hs-boot loop.
 
 Note [Interface File with Core: Sharing RHSs]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+IMPORTANT: This optimisation is currently disabled due to #22027, it can be
+           re-enabled once #220056 is implemented.
 
 In order to avoid duplicating definitions for bindings which already have unfoldings
 we do some minor headstands to avoid serialising the RHS of a definition if it has


=====================================
compiler/GHC/IfaceToCore.hs
=====================================
@@ -933,7 +933,13 @@ tc_iface_bindings (IfaceRec bs) = do
 
 -- | See Note [Interface File with Core: Sharing RHSs]
 tc_iface_binding :: Id -> IfaceMaybeRhs -> IfL CoreExpr
-tc_iface_binding i IfUseUnfoldingRhs = return (unfoldingTemplate $ realIdUnfolding i)
+tc_iface_binding i IfUseUnfoldingRhs =
+  case maybeUnfoldingTemplate $ realIdUnfolding i of
+    Just e -> return e
+    Nothing -> pprPanic "tc_iface_binding" (vcat [text "Binding" <+> quotes (ppr i) <+> text "had an unfolding when the interface file was created"
+                                                 , text "which has now gone missing, something has badly gone wrong."
+                                                 , text "Unfolding:" <+> ppr (realIdUnfolding i)])
+
 tc_iface_binding _ (IfRhs rhs) = tcIfaceExpr rhs
 
 mk_top_id :: IfaceTopBndrInfo -> IfL Id


=====================================
compiler/GHC/JS/Make.hs
=====================================
@@ -129,7 +129,10 @@ module GHC.JS.Make
   -- * Miscellaneous
   -- $misc
   , allocData, allocClsA
+  , dataName
+  , clsName
   , dataFieldName, dataFieldNames
+  , varName, varNames
   )
 where
 
@@ -646,7 +649,7 @@ nFieldCache  = 16384
 
 dataFieldName :: Int -> FastString
 dataFieldName i
-  | i < 1 || i > nFieldCache = panic "dataFieldName" (ppr i)
+  | i < 0 || i > nFieldCache = panic "dataFieldName" (ppr i)
   | otherwise                = dataFieldCache ! i
 
 dataFieldNames :: [FastString]
@@ -657,6 +660,11 @@ dataFieldNames = fmap dataFieldName [1..nFieldCache]
 dataCache :: Array Int FastString
 dataCache = listArray (0,1024) (map (mkFastString . ("h$d"++) . show) [(0::Int)..1024])
 
+dataName :: Int -> FastString
+dataName i
+  | i < 0 || i > 1024 = panic "dataCacheName" (ppr i)
+  | otherwise         = dataCache ! i
+
 allocData :: Int -> JExpr
 allocData i = toJExpr (TxtI (dataCache ! i))
 
@@ -664,9 +672,26 @@ allocData i = toJExpr (TxtI (dataCache ! i))
 clsCache :: Array Int FastString
 clsCache = listArray (0,1024) (map (mkFastString . ("h$c"++) . show) [(0::Int)..1024])
 
+clsName :: Int -> FastString
+clsName i
+  | i < 0 || i > 1024 = panic "clsCacheName" (ppr i)
+  | otherwise         = clsCache ! i
+
 allocClsA :: Int -> JExpr
 allocClsA i = toJExpr (TxtI (clsCache ! i))
 
+-- | Cache "xXXX" names
+varCache :: Array Int FastString
+varCache = listArray (0,1024) (map (mkFastString . ('x':) . show) [(0::Int)..1024])
+
+varName :: Int -> Ident
+varName i
+  | i < 0 || i > 1024 = panic "varCacheName" (ppr i)
+  | otherwise         = TxtI $ varCache ! i
+
+varNames :: [Ident]
+varNames = fmap varName [1..1024]
+
 
 --------------------------------------------------------------------------------
 -- New Identifiers


=====================================
compiler/GHC/StgToJS/Rts/Rts.hs
=====================================
@@ -81,35 +81,7 @@ resetResultVar r = toJExpr r |= null_
 -- JIT can optimize better.
 closureConstructors :: StgToJSConfig -> JStat
 closureConstructors s = BlockStat
-  [ declClsConstr "h$c" ["f"] $ Closure
-      { clEntry  = var "f"
-      , clField1 = null_
-      , clField2 = null_
-      , clMeta   = 0
-      , clCC     = ccVal
-      }
-  , declClsConstr "h$c0" ["f"] $ Closure
-      { clEntry  = var "f"
-      , clField1 = null_
-      , clField2 = null_
-      , clMeta   = 0
-      , clCC     = ccVal
-      }
-  , declClsConstr "h$c1" ["f", "x1"] $ Closure
-      { clEntry  = var "f"
-      , clField1 = var "x1"
-      , clField2 = null_
-      , clMeta   = 0
-      , clCC     = ccVal
-      }
-  , declClsConstr "h$c2" ["f", "x1", "x2"] $ Closure
-      { clEntry  = var "f"
-      , clField1 = var "x1"
-      , clField2 = var "x2"
-      , clMeta   = 0
-      , clCC     = ccVal
-      }
-  , mconcat (map mkClosureCon [3..24])
+  [ mconcat (map mkClosureCon (Nothing : map Just [0..24]))
   , mconcat (map mkDataFill [1..24])
   ]
   where
@@ -118,19 +90,8 @@ closureConstructors s = BlockStat
       -- the cc argument happens to be named just like the cc field...
       | prof      = ([TxtI closureCC_], Just (var closureCC_))
       | otherwise = ([], Nothing)
-    addCCArg as = map TxtI as ++ ccArg
     addCCArg' as = as ++ ccArg
 
-    declClsConstr i as cl = TxtI i ||= ValExpr (JFunc (addCCArg as)
-      ( jVar $ \x ->
-          [ checkC
-          , x |= newClosure cl
-          , notifyAlloc x
-          , traceAlloc x
-          , returnS x
-          ]
-         ))
-
     traceAlloc x | csTraceRts s = appS "h$traceAlloc" [x]
                  | otherwise    = mempty
 
@@ -172,26 +133,36 @@ closureConstructors s = BlockStat
 
            | otherwise = mempty
 
-    mkClosureCon :: Int -> JStat
-    mkClosureCon n = funName ||= toJExpr fun
+    mkClosureCon :: Maybe Int -> JStat
+    mkClosureCon n0 = funName ||= toJExpr fun
       where
-        funName = TxtI $ mkFastString ("h$c" ++ show n)
+        n | Just n' <- n0 = n'
+          | Nothing <- n0 = 0
+        funName | Just n' <- n0 = TxtI $ clsName n'
+                | Nothing <- n0 = TxtI $ mkFastString "h$c"
         -- args are: f x1 x2 .. xn [cc]
-        args   = TxtI "f" : addCCArg' (map (TxtI . mkFastString . ('x':) . show) [(1::Int)..n])
+        args   = TxtI "f" : addCCArg' (take n varNames)
         fun    = JFunc args funBod
         -- x1 goes into closureField1. All the other args are bundled into an
         -- object in closureField2: { d1 = x2, d2 = x3, ... }
         --
-        extra_args = ValExpr . JHash . listToUniqMap $ zip
-                   (map (mkFastString . ('d':) . show) [(1::Int)..])
-                   (map (toJExpr . TxtI . mkFastString . ('x':) . show) [2..n])
+        vars   = map toJExpr $ take n varNames
+
+        x1     = case vars of
+                   []  -> null_
+                   x:_ -> x
+        x2     = case vars of
+                   []     -> null_
+                   [_]    -> null_
+                   [_,x]  -> x
+                   _:x:xs -> ValExpr . JHash . listToUniqMap $ zip dataFieldNames (x:xs)
 
         funBod = jVar $ \x ->
             [ checkC
             , x |= newClosure Closure
                { clEntry  = var "f"
-               , clField1 = var "x1"
-               , clField2 = extra_args
+               , clField1 = x1
+               , clField2 = x2
                , clMeta   = 0
                , clCC     = ccVal
                }
@@ -203,10 +174,9 @@ closureConstructors s = BlockStat
     mkDataFill :: Int -> JStat
     mkDataFill n = funName ||= toJExpr fun
       where
-        funName    = TxtI $ mkFastString ("h$d" ++ show n)
-        ds         = map (mkFastString . ('d':) . show) [(1::Int)..n]
-        extra_args = ValExpr . JHash . listToUniqMap . zip ds $ map (toJExpr . TxtI) ds
-        fun        = JFunc (map TxtI ds) (checkD <> returnS extra_args)
+        funName    = TxtI $ dataName n
+        extra_args = ValExpr . JHash . listToUniqMap . zip dataFieldNames $ map (toJExpr . TxtI) dataFieldNames
+        fun        = JFunc (map TxtI dataFieldNames) (checkD <> returnS extra_args)
 
 -- | JS Payload to perform stack manipulation in the RTS
 stackManip :: JStat
@@ -215,7 +185,7 @@ stackManip = mconcat (map mkPush [1..32]) <>
   where
     mkPush :: Int -> JStat
     mkPush n = let funName = TxtI $ mkFastString ("h$p" ++ show n)
-                   as      = map (TxtI . mkFastString . ('x':) . show) [1..n]
+                   as      = take n varNames
                    fun     = JFunc as ((sp |= sp + toJExpr n)
                                        <> mconcat (zipWith (\i a -> stack .! (sp - toJExpr (n-i)) |= toJExpr a)
                                                    [1..] as))
@@ -228,7 +198,7 @@ stackManip = mconcat (map mkPush [1..32]) <>
                       bits    = bitsIdx sig
                       n       = length bits
                       h       = last bits
-                      args    = map (TxtI . mkFastString . ('x':) . show) [1..n]
+                      args    = take n varNames
                       fun     = JFunc args $
                         mconcat [ sp |= sp + toJExpr (h+1)
                                 , mconcat (zipWith (\b a -> stack .! (sp - toJExpr (h-b)) |= toJExpr a) bits args)
@@ -288,7 +258,7 @@ loadRegs :: JStat
 loadRegs = mconcat $ map mkLoad [1..32]
   where
     mkLoad :: Int -> JStat
-    mkLoad n = let args   = map (TxtI . mkFastString . ("x"++) . show) [1..n]
+    mkLoad n = let args   = take n varNames
                    assign = zipWith (\a r -> toJExpr r |= toJExpr a)
                               args (reverse $ take n regsFromR1)
                    fname  = TxtI $ mkFastString ("h$l" ++ show n)


=====================================
compiler/ghc.cabal.in
=====================================
@@ -557,6 +557,7 @@ Library
         GHC.Platform.ARM
         GHC.Platform.AArch64
         GHC.Platform.Constants
+        GHC.Platform.LoongArch64
         GHC.Platform.NoRegs
         GHC.Platform.PPC
         GHC.Platform.Profile
@@ -564,7 +565,6 @@ Library
         GHC.Platform.Reg.Class
         GHC.Platform.Regs
         GHC.Platform.RISCV64
-        GHC.Platform.LoongArch64
         GHC.Platform.S390X
         GHC.Platform.Wasm32
         GHC.Platform.Ways


=====================================
docs/users_guide/9.6.1-notes.rst
=====================================
@@ -90,6 +90,22 @@ Language
 Compiler
 ~~~~~~~~
 
+- The `WebAssembly backend
+  <https://www.tweag.io/blog/2022-11-22-wasm-backend-merged-in-ghc>`_
+  has been merged. This allows GHC to be built as a cross-compiler
+  that targets ``wasm32-wasi`` and compiles Haskell code to
+  self-contained WebAssembly modules that can be executed on a variety
+  of different runtimes. There are a few caveats to be aware of:
+  
+  - To use the WebAssembly backend, one would need to follow the
+    instructions on `ghc-wasm-meta
+    <https://gitlab.haskell.org/ghc/ghc-wasm-meta>`_. The WebAssembly
+    backend is not included in the GHC release bindists for the time
+    being, nor is it supported by ``ghcup`` or ``stack`` yet.
+  - The WebAssembly backend is still under active development. It's
+    presented in this GHC version as a technology preview, bugs and
+    missing features are expected.
+
 - The :extension:`TypeInType` is now marked as deprecated. Its meaning has been included
   in :extension:`PolyKinds` and :extension:`DataKinds`.
 


=====================================
libraries/base/GHC/IO/Handle/Types.hs
=====================================
@@ -124,11 +124,11 @@ data Handle__
     Handle__ {
       haDevice      :: !dev,
       haType        :: HandleType,           -- type (read/write/append etc.)
-      haByteBuffer  :: !(IORef (Buffer Word8)), -- See [note Buffering Implementation]
+      haByteBuffer  :: !(IORef (Buffer Word8)), -- See Note [Buffering Implementation]
       haBufferMode  :: BufferMode,
       haLastDecode  :: !(IORef (dec_state, Buffer Word8)),
       -- ^ The byte buffer just  before we did our last batch of decoding.
-      haCharBuffer  :: !(IORef (Buffer CharBufElem)), -- See [note Buffering Implementation]
+      haCharBuffer  :: !(IORef (Buffer CharBufElem)), -- See Note [Buffering Implementation]
       haBuffers     :: !(IORef (BufferList CharBufElem)),  -- spare buffers
       haEncoder     :: Maybe (TextEncoder enc_state),
       haDecoder     :: Maybe (TextDecoder dec_state),
@@ -261,13 +261,13 @@ data BufferMode
             )
 
 {-
-[note Buffering Implementation]
-
+Note [Buffering Implementation]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Each Handle has two buffers: a byte buffer (haByteBuffer) and a Char
 buffer (haCharBuffer).
 
-[note Buffered Reading]
-
+Note [Buffered Reading]
+~~~~~~~~~~~~~~~~~~~~~~~
 For read Handles, bytes are read into the byte buffer, and immediately
 decoded into the Char buffer (see
 GHC.IO.Handle.Internals.readTextDevice).  The only way there might be
@@ -279,8 +279,8 @@ reading data into a Handle.  When reading, we can always just read all
 the data there is available without blocking, decode it into the Char
 buffer, and then provide it immediately to the caller.
 
-[note Buffered Writing]
-
+Note [Buffered Writing]
+~~~~~~~~~~~~~~~~~~~~~~~
 Characters are written into the Char buffer by e.g. hPutStr.  At the
 end of the operation, or when the char buffer is full, the buffer is
 decoded to the byte buffer (see writeCharBuffer).  This is so that we
@@ -288,8 +288,8 @@ can detect encoding errors at the right point.
 
 Hence, the Char buffer is always empty between Handle operations.
 
-[note Buffer Sizing]
-
+Note [Buffer Sizing]
+~~~~~~~~~~~~~~~~~~~~
 The char buffer is always a default size (dEFAULT_CHAR_BUFFER_SIZE).
 The byte buffer size is chosen by the underlying device (via its
 IODevice.newBuffer).  Hence the size of these buffers is not under
@@ -322,8 +322,8 @@ writeCharBuffer, which checks whether the buffer should be flushed
 according to the current buffering mode.  Additionally, we look for
 newlines and flush if the mode is LineBuffering.
 
-[note Buffer Flushing]
-
+Note [Buffer Flushing]
+~~~~~~~~~~~~~~~~~~~~~~
 ** Flushing the Char buffer
 
 We must be able to flush the Char buffer, in order to implement


=====================================
libraries/containers
=====================================
@@ -1 +1 @@
-Subproject commit fbafcf704f0febd9ddac84dbe00ae3787af43550
+Subproject commit 9f4a93604c66a5e605ce46fc30003b71802b3cfd


=====================================
libraries/ghci/GHCi/InfoTable.hsc
=====================================
@@ -228,6 +228,15 @@ mkJumpToAddr a = case hostPlatformArch of
                  , fromIntegral w64
                  , fromIntegral (w64 `shiftR` 32) ]
 
+    ArchLoongArch64 -> pure $
+        let w64 = fromIntegral (funPtrToInt a) :: Word64
+        in Right [ 0x1c00000c          -- pcaddu12i $t0,0
+                 , 0x28c0418c          -- ld.d      $t0,$t0,16
+                 , 0x4c000180          -- jr        $t0
+                 , 0x03400000          -- nop
+                 , fromIntegral w64
+                 , fromIntegral (w64 `shiftR` 32) ]
+
     arch ->
       -- The arch isn't supported. You either need to add your architecture as a
       -- distinct case, or use non-TABLES_NEXT_TO_CODE mode.


=====================================
m4/ghc_tables_next_to_code.m4
=====================================
@@ -17,7 +17,7 @@ AC_DEFUN([GHC_TABLES_NEXT_TO_CODE],
   case "$Unregisterised" in
       NO)
           case "$TargetArch" in
-              ia64|powerpc64|powerpc64le|s390x|wasm32|loongarch64)
+              ia64|powerpc64|powerpc64le|s390x|wasm32)
                   TablesNextToCodeDefault=NO
                   AC_MSG_RESULT([no])
                   ;;


=====================================
rts/posix/Ticker.c
=====================================
@@ -65,13 +65,17 @@
  * On Linux we can use timerfd_* (introduced in Linux
  * 2.6.25) and a thread instead of alarm signals. It avoids the risk of
  * interrupting syscalls (see #10840) and the risk of being accidentally
- * modified in user code using signals.
+ * modified in user code using signals. NetBSD has also added timerfd
+ * support since version 10.
+ *
+ * For older version of linux/netbsd without timerfd we fall back to the
+ * pthread based implementation.
  */
-#if defined(linux_HOST_OS) && HAVE_SYS_TIMERFD_H
-#define USE_PTHREAD_FOR_ITIMER
+#if HAVE_SYS_TIMERFD_H
+#define USE_TIMERFD_FOR_ITIMER
 #endif
 
-#if defined(freebsd_HOST_OS)
+#if defined(linux_HOST_OS)
 #define USE_PTHREAD_FOR_ITIMER
 #endif
 
@@ -79,6 +83,10 @@
 #define USE_PTHREAD_FOR_ITIMER
 #endif
 
+#if defined(freebsd_HOST_OS)
+#define USE_PTHREAD_FOR_ITIMER
+#endif
+
 #if defined(solaris2_HOST_OS)
 /* USE_TIMER_CREATE is usually disabled for Solaris. In fact it is
    supported well on this OS, but requires additional privilege. When
@@ -98,7 +106,9 @@ ghc-stage2: timer_create: Not owner
 #endif /* solaris2_HOST_OS */
 
 // Select the variant to use
-#if defined(USE_PTHREAD_FOR_ITIMER)
+#if defined(USE_TIMERFD_FOR_ITIMER)
+#include "ticker/TimerFd.c"
+#elif defined(USE_PTHREAD_FOR_ITIMER)
 #include "ticker/Pthread.c"
 #elif defined(USE_TIMER_CREATE)
 #include "ticker/TimerCreate.c"


=====================================
rts/posix/ticker/Pthread.c
=====================================
@@ -63,13 +63,6 @@
 #include <unistd.h>
 #include <fcntl.h>
 
-#if defined(HAVE_SYS_TIMERFD_H)
-#include <sys/timerfd.h>
-#define USE_TIMERFD_FOR_ITIMER 1
-#else
-#define USE_TIMERFD_FOR_ITIMER 0
-#endif
-
 /*
  * TFD_CLOEXEC has been added in Linux 2.6.26.
  * If it is not available, we use fcntl(F_SETFD).
@@ -93,61 +86,16 @@ static Condition start_cond;
 static Mutex mutex;
 static OSThreadId thread;
 
-// file descriptor for the timer (Linux only)
-static int timerfd = -1;
-
-// pipe for signaling exit
-static int pipefds[2];
-
 static void *itimer_thread_func(void *_handle_tick)
 {
     TickProc handle_tick = _handle_tick;
-    uint64_t nticks;
-    ssize_t r = 0;
-    struct pollfd pollfds[2];
-
-#if USE_TIMERFD_FOR_ITIMER
-    pollfds[0].fd = pipefds[0];
-    pollfds[0].events = POLLIN;
-    pollfds[1].fd = timerfd;
-    pollfds[1].events = POLLIN;
-#endif
 
     // Relaxed is sufficient: If we don't see that exited was set in one iteration we will
     // see it next time.
     TSAN_ANNOTATE_BENIGN_RACE(&exited, "itimer_thread_func");
     while (!RELAXED_LOAD(&exited)) {
-        if (USE_TIMERFD_FOR_ITIMER) {
-            if (poll(pollfds, 2, -1) == -1) {
-                sysErrorBelch("Ticker: poll failed: %s", strerror(errno));
-            }
-
-            // We check the pipe first, even though the timerfd may also have triggered.
-            if (pollfds[0].revents & POLLIN) {
-                // the pipe is ready for reading, the only possible reason is that we're exiting
-                exited = true; // set this again to make sure even RELAXED_LOAD will read the proper value
-                // no further action needed, skip ahead to handling the final tick and then stopping
-            }
-            else if (pollfds[1].revents & POLLIN) { // the timerfd is ready for reading
-                r = read(timerfd, &nticks, sizeof(nticks)); // this should never block now
-
-                if ((r == 0) && (errno == 0)) {
-                   /* r == 0 is expected only for non-blocking fd (in which case
-                    * errno should be EAGAIN) but we use a blocking fd.
-                    *
-                    * Due to a kernel bug (cf https://lkml.org/lkml/2019/8/16/335)
-                    * on some platforms we could see r == 0 and errno == 0.
-                    */
-                   IF_DEBUG(scheduler, debugBelch("read(timerfd) returned 0 with errno=0. This is a known kernel bug. We just ignore it."));
-                }
-                else if (r != sizeof(nticks) && errno != EINTR) {
-                   barf("Ticker: read(timerfd) failed with %s and returned %zd", strerror(errno), r);
-                }
-            }
-        } else {
-            if (rtsSleep(itimer_interval) != 0) {
-                sysErrorBelch("Ticker: sleep failed: %s", strerror(errno));
-            }
+        if (rtsSleep(itimer_interval) != 0) {
+            sysErrorBelch("Ticker: sleep failed: %s", strerror(errno));
         }
 
         // first try a cheap test
@@ -164,10 +112,6 @@ static void *itimer_thread_func(void *_handle_tick)
         }
     }
 
-    if (USE_TIMERFD_FOR_ITIMER) {
-        close(timerfd);
-    }
-
     return NULL;
 }
 
@@ -186,39 +130,6 @@ initTicker (Time interval, TickProc handle_tick)
     initCondition(&start_cond);
     initMutex(&mutex);
 
-    /* Open the file descriptor for the timer synchronously.
-     *
-     * We used to do it in itimer_thread_func (i.e. in the timer thread) but it
-     * meant that some user code could run before it and get confused by the
-     * allocation of the timerfd.
-     *
-     * See hClose002 which unsafely closes a file descriptor twice expecting an
-     * exception the second time: it sometimes failed when the second call to
-     * "close" closed our own timerfd which inadvertently reused the same file
-     * descriptor closed by the first call! (see #20618)
-     */
-#if USE_TIMERFD_FOR_ITIMER
-    struct itimerspec it;
-    it.it_value.tv_sec  = TimeToSeconds(itimer_interval);
-    it.it_value.tv_nsec = TimeToNS(itimer_interval) % 1000000000;
-    it.it_interval = it.it_value;
-
-    timerfd = timerfd_create(CLOCK_MONOTONIC, TFD_CLOEXEC);
-    if (timerfd == -1) {
-        barf("timerfd_create: %s", strerror(errno));
-    }
-    if (!TFD_CLOEXEC) {
-        fcntl(timerfd, F_SETFD, FD_CLOEXEC);
-    }
-    if (timerfd_settime(timerfd, 0, &it, NULL)) {
-        barf("timerfd_settime: %s", strerror(errno));
-    }
-
-    if (pipe(pipefds) < 0) {
-        barf("pipe: %s", strerror(errno));
-    }
-#endif
-
     /*
      * Create the thread with all blockable signals blocked, leaving signal
      * handling to the main and/or other threads.  This is especially useful in
@@ -269,21 +180,9 @@ exitTicker (bool wait)
 
     // wait for ticker to terminate if necessary
     if (wait) {
-#if USE_TIMERFD_FOR_ITIMER
-        // write anything to the pipe to trigger poll() in the ticker thread
-        if (write(pipefds[1], "stop", 5) < 0) {
-            sysErrorBelch("Ticker: Failed to write to pipe: %s", strerror(errno));
-        }
-#endif
         if (pthread_join(thread, NULL)) {
             sysErrorBelch("Ticker: Failed to join: %s", strerror(errno));
         }
-#if USE_TIMERFD_FOR_ITIMER
-        // These need to happen AFTER the ticker thread has finished to prevent a race condition
-        // where the ticker thread closes the read end of the pipe before we're done writing to it.
-        close(pipefds[0]);
-        close(pipefds[1]);
-#endif
         closeMutex(&mutex);
         closeCondition(&start_cond);
     } else {


=====================================
rts/posix/ticker/TimerFd.c
=====================================
@@ -0,0 +1,280 @@
+/* -----------------------------------------------------------------------------
+ *
+ * (c) The GHC Team, 1995-2023
+ *
+ * Interval timer for profiling and pre-emptive scheduling.
+ *
+ * ---------------------------------------------------------------------------*/
+
+/*
+ * We use a realtime timer by default.  I found this much more
+ * reliable than a CPU timer:
+ *
+ * Experiments with different frequencies: using
+ * CLOCK_REALTIME/CLOCK_MONOTONIC on Linux 2.6.32,
+ *     1000us has  <1% impact on runtime
+ *      100us has  ~2% impact on runtime
+ *       10us has ~40% impact on runtime
+ *
+ * using CLOCK_PROCESS_CPUTIME_ID on Linux 2.6.32,
+ *     I cannot get it to tick faster than 10ms (10000us)
+ *     which isn't great for profiling.
+ *
+ * In the threaded RTS, we can't tick in CPU time because the thread
+ * which has the virtual timer might be idle, so the tick would never
+ * fire.  Therefore we used to tick in realtime in the threaded RTS and
+ * in CPU time otherwise, but now we always tick in realtime, for
+ * several reasons:
+ *
+ *   - resolution (see above)
+ *   - consistency (-threaded is the same as normal)
+ *   - more consistency: Windows only has a realtime timer
+ *
+ * Note we want to use CLOCK_MONOTONIC rather than CLOCK_REALTIME,
+ * because the latter may jump around (NTP adjustments, leap seconds
+ * etc.).
+ */
+
+#include "rts/PosixSource.h"
+#include "Rts.h"
+
+#include "Ticker.h"
+#include "RtsUtils.h"
+#include "Proftimer.h"
+#include "Schedule.h"
+#include "posix/Clock.h"
+#include <sys/poll.h>
+
+#include <time.h>
+#if HAVE_SYS_TIME_H
+# include <sys/time.h>
+#endif
+
+#if defined(HAVE_SIGNAL_H)
+# include <signal.h>
+#endif
+
+#include <string.h>
+
+#include <pthread.h>
+#if defined(HAVE_PTHREAD_NP_H)
+#include <pthread_np.h>
+#endif
+#include <unistd.h>
+#include <fcntl.h>
+
+#include <sys/timerfd.h>
+
+
+/*
+ * TFD_CLOEXEC has been added in Linux 2.6.26.
+ * If it is not available, we use fcntl(F_SETFD).
+ */
+#if !defined(TFD_CLOEXEC)
+#define TFD_CLOEXEC 0
+#endif
+
+static Time itimer_interval = DEFAULT_TICK_INTERVAL;
+
+// Should we be firing ticks?
+// Writers to this must hold the mutex below.
+static bool stopped = false;
+
+// should the ticker thread exit?
+// This can be set without holding the mutex.
+static bool exited = true;
+
+// Signaled when we want to (re)start the timer
+static Condition start_cond;
+static Mutex mutex;
+static OSThreadId thread;
+
+// file descriptor for the timer (Linux only)
+static int timerfd = -1;
+
+// pipe for signaling exit
+static int pipefds[2];
+
+static void *itimer_thread_func(void *_handle_tick)
+{
+    TickProc handle_tick = _handle_tick;
+    uint64_t nticks;
+    ssize_t r = 0;
+    struct pollfd pollfds[2];
+
+    pollfds[0].fd = pipefds[0];
+    pollfds[0].events = POLLIN;
+    pollfds[1].fd = timerfd;
+    pollfds[1].events = POLLIN;
+
+    // Relaxed is sufficient: If we don't see that exited was set in one iteration we will
+    // see it next time.
+    TSAN_ANNOTATE_BENIGN_RACE(&exited, "itimer_thread_func");
+    while (!RELAXED_LOAD(&exited)) {
+        if (poll(pollfds, 2, -1) == -1) {
+            sysErrorBelch("Ticker: poll failed: %s", strerror(errno));
+        }
+
+        // We check the pipe first, even though the timerfd may also have triggered.
+        if (pollfds[0].revents & POLLIN) {
+            // the pipe is ready for reading, the only possible reason is that we're exiting
+            exited = true; // set this again to make sure even RELAXED_LOAD will read the proper value
+            // no further action needed, skip ahead to handling the final tick and then stopping
+        }
+        else if (pollfds[1].revents & POLLIN) { // the timerfd is ready for reading
+            r = read(timerfd, &nticks, sizeof(nticks)); // this should never block now
+
+            if ((r == 0) && (errno == 0)) {
+               /* r == 0 is expected only for non-blocking fd (in which case
+                * errno should be EAGAIN) but we use a blocking fd.
+                *
+                * Due to a kernel bug (cf https://lkml.org/lkml/2019/8/16/335)
+                * on some platforms we could see r == 0 and errno == 0.
+                */
+               IF_DEBUG(scheduler, debugBelch("read(timerfd) returned 0 with errno=0. This is a known kernel bug. We just ignore it."));
+            }
+            else if (r != sizeof(nticks) && errno != EINTR) {
+               barf("Ticker: read(timerfd) failed with %s and returned %zd", strerror(errno), r);
+            }
+        }
+
+        // first try a cheap test
+        TSAN_ANNOTATE_BENIGN_RACE(&stopped, "itimer_thread_func");
+        if (RELAXED_LOAD(&stopped)) {
+            OS_ACQUIRE_LOCK(&mutex);
+            // should we really stop?
+            if (stopped) {
+                waitCondition(&start_cond, &mutex);
+            }
+            OS_RELEASE_LOCK(&mutex);
+        } else {
+            handle_tick(0);
+        }
+    }
+
+    close(timerfd);
+    return NULL;
+}
+
+void
+initTicker (Time interval, TickProc handle_tick)
+{
+    itimer_interval = interval;
+    stopped = true;
+    exited = false;
+#if defined(HAVE_SIGNAL_H)
+    sigset_t mask, omask;
+    int sigret;
+#endif
+    int ret;
+
+    initCondition(&start_cond);
+    initMutex(&mutex);
+
+    /* Open the file descriptor for the timer synchronously.
+     *
+     * We used to do it in itimer_thread_func (i.e. in the timer thread) but it
+     * meant that some user code could run before it and get confused by the
+     * allocation of the timerfd.
+     *
+     * See hClose002 which unsafely closes a file descriptor twice expecting an
+     * exception the second time: it sometimes failed when the second call to
+     * "close" closed our own timerfd which inadvertently reused the same file
+     * descriptor closed by the first call! (see #20618)
+     */
+    struct itimerspec it;
+    it.it_value.tv_sec  = TimeToSeconds(itimer_interval);
+    it.it_value.tv_nsec = TimeToNS(itimer_interval) % 1000000000;
+    it.it_interval = it.it_value;
+
+    timerfd = timerfd_create(CLOCK_MONOTONIC, TFD_CLOEXEC);
+    if (timerfd == -1) {
+        barf("timerfd_create: %s", strerror(errno));
+    }
+    if (!TFD_CLOEXEC) {
+        fcntl(timerfd, F_SETFD, FD_CLOEXEC);
+    }
+    if (timerfd_settime(timerfd, 0, &it, NULL)) {
+        barf("timerfd_settime: %s", strerror(errno));
+    }
+
+    if (pipe(pipefds) < 0) {
+        barf("pipe: %s", strerror(errno));
+    }
+
+    /*
+     * Create the thread with all blockable signals blocked, leaving signal
+     * handling to the main and/or other threads.  This is especially useful in
+     * the non-threaded runtime, where applications might expect sigprocmask(2)
+     * to effectively block signals.
+     */
+#if defined(HAVE_SIGNAL_H)
+    sigfillset(&mask);
+    sigret = pthread_sigmask(SIG_SETMASK, &mask, &omask);
+#endif
+    ret = createAttachedOSThread(&thread, "ghc_ticker", itimer_thread_func, (void*)handle_tick);
+#if defined(HAVE_SIGNAL_H)
+    if (sigret == 0)
+        pthread_sigmask(SIG_SETMASK, &omask, NULL);
+#endif
+
+    if (ret != 0) {
+        barf("Ticker: Failed to spawn thread: %s", strerror(errno));
+    }
+}
+
+void
+startTicker(void)
+{
+    OS_ACQUIRE_LOCK(&mutex);
+    RELAXED_STORE(&stopped, false);
+    signalCondition(&start_cond);
+    OS_RELEASE_LOCK(&mutex);
+}
+
+/* There may be at most one additional tick fired after a call to this */
+void
+stopTicker(void)
+{
+    OS_ACQUIRE_LOCK(&mutex);
+    RELAXED_STORE(&stopped, true);
+    OS_RELEASE_LOCK(&mutex);
+}
+
+/* There may be at most one additional tick fired after a call to this */
+void
+exitTicker (bool wait)
+{
+    ASSERT(!SEQ_CST_LOAD(&exited));
+    SEQ_CST_STORE(&exited, true);
+    // ensure that ticker wakes up if stopped
+    startTicker();
+
+    // wait for ticker to terminate if necessary
+    if (wait) {
+        // write anything to the pipe to trigger poll() in the ticker thread
+        if (write(pipefds[1], "stop", 5) < 0) {
+            sysErrorBelch("Ticker: Failed to write to pipe: %s", strerror(errno));
+        }
+
+        if (pthread_join(thread, NULL)) {
+            sysErrorBelch("Ticker: Failed to join: %s", strerror(errno));
+        }
+
+        // These need to happen AFTER the ticker thread has finished to prevent a race condition
+        // where the ticker thread closes the read end of the pipe before we're done writing to it.
+        close(pipefds[0]);
+        close(pipefds[1]);
+
+        closeMutex(&mutex);
+        closeCondition(&start_cond);
+    } else {
+        pthread_detach(thread);
+    }
+}
+
+int
+rtsTimerSignal(void)
+{
+    return SIGALRM;
+}


=====================================
testsuite/tests/driver/fat-iface/Makefile
=====================================
@@ -49,4 +49,11 @@ fat010: clean
 	echo >> "THB.hs"
 	"$(TEST_HC)" $(TEST_HC_OPTS) THC.hs -fhide-source-paths -fwrite-if-simplified-core -fprefer-byte-code
 
+T22807: clean
+	"$(TEST_HC)" $(TEST_HC_OPTS) T22807A.hs -fhide-source-paths -fwrite-if-simplified-core -fbyte-code-and-object-code -fno-omit-interface-pragmas -fprefer-byte-code
+	"$(TEST_HC)" $(TEST_HC_OPTS) T22807B.hs -fhide-source-paths -fwrite-if-simplified-core -fprefer-byte-code -fbyte-code-and-object-code -fno-omit-interface-pragmas
+
+T22807_ghci: clean
+	"$(TEST_HC)" $(TEST_HC_OPTS) T22807_ghci.hs -fno-full-laziness -fhide-source-paths -fwrite-if-simplified-core -O2 -dynamic -v0
+	"$(TEST_HC)" $(TEST_HC_OPTS) -v0 --interactive -fhide-source-paths -fno-full-laziness < T22807_ghci.script
 


=====================================
testsuite/tests/driver/fat-iface/T22807.stdout
=====================================
@@ -0,0 +1,2 @@
+[1 of 1] Compiling T22807A
+[2 of 2] Compiling T22807B


=====================================
testsuite/tests/driver/fat-iface/T22807A.hs
=====================================
@@ -0,0 +1,6 @@
+module T22807A where
+
+xs :: [a]
+xs = []
+
+


=====================================
testsuite/tests/driver/fat-iface/T22807B.hs
=====================================
@@ -0,0 +1,5 @@
+{-# LANGUAGE TemplateHaskell #-}
+module T22807B where
+import T22807A
+
+$(pure xs)


=====================================
testsuite/tests/driver/fat-iface/T22807_ghci.hs
=====================================
@@ -0,0 +1,8 @@
+module T22807_ghci where
+
+
+foo b =
+    let x = Just [1..1000]
+    in if b
+        then Left x
+        else Right x


=====================================
testsuite/tests/driver/fat-iface/T22807_ghci.script
=====================================
@@ -0,0 +1,6 @@
+:l T22807_ghci.hs
+
+import T22807_ghci
+import Data.Either
+
+isLeft (foo True)


=====================================
testsuite/tests/driver/fat-iface/T22807_ghci.stdout
=====================================
@@ -0,0 +1 @@
+True


=====================================
testsuite/tests/driver/fat-iface/all.T
=====================================
@@ -15,5 +15,9 @@ test('fat013', [req_th, extra_files(['FatTH.hs', 'FatQuote.hs'])], multimod_comp
 # When using interpreter should not produce objects
 test('fat014', [req_th, extra_files(['FatTH.hs', 'FatQuote.hs']), extra_run_opts('-fno-code')], ghci_script, ['fat014.script'])
 test('fat015', [req_th, unless(ghc_dynamic(), skip), extra_files(['FatQuote.hs', 'FatQuote1.hs', 'FatQuote2.hs', 'FatTH1.hs', 'FatTH2.hs', 'FatTHTop.hs'])], multimod_compile, ['FatTHTop', '-fno-code -fwrite-interface'])
+test('T22807', [req_th, unless(ghc_dynamic(), skip), extra_files(['T22807A.hs', 'T22807B.hs'])]
+             , makefile_test, ['T22807'])
+test('T22807_ghci', [req_th, unless(ghc_dynamic(), skip), extra_files(['T22807_ghci.hs'])]
+             , makefile_test, ['T22807_ghci'])
 
 



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/bc14c389743bcf0f58cc3d73370d40738ab92181...ad3bfdb7683d68061f04f08a70ab20c4578a490d

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/bc14c389743bcf0f58cc3d73370d40738ab92181...ad3bfdb7683d68061f04f08a70ab20c4578a490d
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20230203/df6973cc/attachment-0001.html>


More information about the ghc-commits mailing list