[Git][ghc/ghc][wip/andreask/weakly_pinned] Add functions to check for weakly pinned arrays.
Andreas Klebinger (@AndreasK)
gitlab at gitlab.haskell.org
Wed Aug 21 12:52:34 UTC 2024
Andreas Klebinger pushed to branch wip/andreask/weakly_pinned at Glasgow Haskell Compiler / GHC
Commits:
dbfb046e by Andreas Klebinger at 2024-08-21T14:32:56+02:00
Add functions to check for weakly pinned arrays.
This commit adds `isByteArrayWeaklyPinned#` and `isMutableByteArrayWeaklyPinned#` primops.
These check if a bytearray is *weakly* pinned. Which means it can still be explicitly moved
by the user via compaction but won't be moved by the RTS.
This moves us one more stop closer to nailing down #22255.
- - - - -
10 changed files:
- compiler/GHC/Builtin/primops.txt.pp
- compiler/GHC/StgToCmm/Prim.hs
- compiler/GHC/StgToJS/Prim.hs
- docs/users_guide/exts/ffi.rst
- libraries/base/src/GHC/Base.hs
- libraries/base/src/GHC/Exts.hs
- rts/PrimOps.cmm
- rts/RtsSymbols.c
- rts/include/stg/MiscClosures.h
- testsuite/tests/rts/T13894.hs
Changes:
=====================================
compiler/GHC/Builtin/primops.txt.pp
=====================================
@@ -1925,7 +1925,25 @@ primop MutableByteArrayIsPinnedOp "isMutableByteArrayPinned#" GenPrimOp
primop ByteArrayIsPinnedOp "isByteArrayPinned#" GenPrimOp
ByteArray# -> Int#
- {Determine whether a 'ByteArray#' is guaranteed not to move during GC.}
+ {Determine whether a 'ByteArray#' is guaranteed not to move.}
+ with out_of_line = True
+
+primop ByteArrayIsWeaklyPinnedOp "isByteArrayWeaklyPinned#" GenPrimOp
+ ByteArray# -> Int#
+ {Similar to 'isByteArrayPinned#'. Weakly pinned byte arrays are allowed
+ to be copied into compact regions by the user, potentially invalidating
+ the results of earlier calls to 'byteArrayContents#'.
+
+ See the section `Pinned Byte Arrays` in the user guide for more information.
+
+ This function also returns true for regular pinned bytearrays.
+ }
+ with out_of_line = True
+
+primop MutableByteArrayIsWeaklyPinnedOp "isMutableByteArrayWeaklyPinned#" GenPrimOp
+ ByteArray# -> Int#
+ { 'isByteArrayWeaklyPinned#' but for mutable arrays.
+ }
with out_of_line = True
primop ByteArrayContents_Char "byteArrayContents#" GenPrimOp
=====================================
compiler/GHC/StgToCmm/Prim.hs
=====================================
@@ -1668,10 +1668,12 @@ emitPrimOp cfg primop =
NewPinnedByteArrayOp_Char -> alwaysExternal
NewAlignedPinnedByteArrayOp_Char -> alwaysExternal
MutableByteArrayIsPinnedOp -> alwaysExternal
+ MutableByteArrayIsWeaklyPinnedOp -> alwaysExternal
DoubleDecode_2IntOp -> alwaysExternal
DoubleDecode_Int64Op -> alwaysExternal
FloatDecode_IntOp -> alwaysExternal
ByteArrayIsPinnedOp -> alwaysExternal
+ ByteArrayIsWeaklyPinnedOp -> alwaysExternal
ShrinkMutableByteArrayOp_Char -> alwaysExternal
ResizeMutableByteArrayOp_Char -> alwaysExternal
ShrinkSmallMutableArrayOp_Char -> alwaysExternal
=====================================
compiler/GHC/StgToJS/Prim.hs
=====================================
@@ -670,6 +670,8 @@ genPrim prof bound ty op = case op of
NewAlignedPinnedByteArrayOp_Char -> \[r] [l,_align] -> pure $ PrimInline (newByteArray r l)
MutableByteArrayIsPinnedOp -> \[r] [_] -> pure $ PrimInline $ r |= one_
ByteArrayIsPinnedOp -> \[r] [_] -> pure $ PrimInline $ r |= one_
+ ByteArrayIsWeaklyPinnedOp -> \[r] [_] -> pure $ PrimInline $ r |= one_
+ MutableByteArrayIsWeaklyPinnedOp -> \[r] [_] -> pure $ PrimInline $ r |= one_
ByteArrayContents_Char -> \[a,o] [b] -> pure $ PrimInline $ mconcat [a |= b, o |= zero_]
MutableByteArrayContents_Char -> \[a,o] [b] -> pure $ PrimInline $ mconcat [a |= b, o |= zero_]
ShrinkMutableByteArrayOp_Char -> \[] [a,n] -> pure $ PrimInline $ appS hdShrinkMutableByteArrayStr [a,n]
=====================================
docs/users_guide/exts/ffi.rst
=====================================
@@ -1114,21 +1114,66 @@ Pinned Byte Arrays
A pinned byte array is one that the garbage collector is not allowed
to move. Consequently, it has a stable address that can be safely
-requested with ``byteArrayContents#``. Not that being pinned doesn't
-prevent the byteArray from being gc'ed in the same fashion a regular
-byte array would be.
+requested with ``byteArrayContents#``. As long as the array remains live
+the address returned by ``byteArrayContents#`` will remain valid. Note that
+being pinned doesn't prevent the byteArray from being gc'ed in the same fashion
+a regular byte array would be if there are no more references to the ``ByteArray#``.
There are a handful of primitive functions in :base-ref:`GHC.Exts.`
used to enforce or check for pinnedness: ``isByteArrayPinned#``,
-``isMutableByteArrayPinned#``, and ``newPinnedByteArray#``. A
-byte array can be pinned as a result of three possible causes:
+``isMutableByteArrayPinned#``, ``isByteArrayWeaklyPinned#``,
+``isMutableByteArrayWeaklyPinned#``, and ``newPinnedByteArray#``. A
+byte array can be pinned or weakly pinned as a result of three possible causes:
-1. It was allocated by ``newPinnedByteArray#``.
-2. It is large. Currently, GHC defines large object to be one
+1. It was allocated by ``newPinnedByteArray#``. This results in a regular pinned byte array.
+2. It is large, this results in a weakly pinned byte array. Currently, GHC defines large object to be one
that is at least as large as 80% of a 4KB block (i.e. at
least 3277 bytes).
-3. It has been copied into a compact region. The documentation
+3. It has been copied into a compact region, resulting in a weakly pinned array. The documentation
for ``ghc-compact`` and ``compact`` describes this process.
+The difference between a pinned array and a weakly pinned array is simply that
+trying to compact a pinned array will result in an exception. Trying to compact
+a weakly pinned array will succeeded. However result of earlier
+calls to ``byteArrayContents#`` is not updated during compaction, which means
+these results will still point to the address where the array was located originally,
+and not to the new address inside the compact region.
+
+This is particularly dangerous when an address to a byte arrays content is stored
+inside a datastructure along with a reference to the byte array.
+If the data structure is compacted later on the pointer won't be updated but the
+reference to the byte array will point to a copy inside the compact region.
+A common data type susceptible to this is `ForeignPtr` when used to represent a ByteArray#.
+
+Here is an example to illustrate this:
+
+.. code-block:: haskell
+ workWithArrayContents :: (ByteArray, Ptr Word8) -> (Ptr Word8 -> IO ()) -> IO ()
+ workWithArrayContents (arr@(ByteArray uarr),ptr) worker =
+ case () of
+ _
+ -- Conservative but safe
+ | isByteArrayPinned arr -> keepAliveUnlifted uarr (worker ptr)
+ -- Potentially dangerous, the program needs to ensures the Ptr points into the array.
+ | isByteArrayWeaklyPinned arr -> keepAliveUnlifted uarr (worker ptr)
+ | otherwise -> ... -- Otherwise we can't directly use it for safe FFI calls directly at all.
+
+ main :: IO ()
+ main = do
+ -- We create a large array, which causes it to be implicitly pinned
+ arr <- newByteArray 5000
+ arr@(ByteArray uarr) <- freezeByteArray arr 0 5000 -- Make it immutable
+ let ptr = byteArrayContents arr
+
+ -- Compacting a data structure that contains both an array and a ptr to
+ -- the arrays content's is dangerous and usually the wrong thing to do.
+ let foo = (arr, ptr)
+ foo_compacted <- compact foo
+
+ -- This is fine
+ workWithArrayContents foo do_work
+ -- This is unsound
+ workWithArrayContents (getCompact foo_compacted) do_work
+
.. [1] Prior to GHC 8.10, when passing an ``ArrayArray#`` argument
to a foreign function, the foreign function would see a pointer
to the ``StgMutArrPtrs`` rather than just the payload.
=====================================
libraries/base/src/GHC/Base.hs
=====================================
@@ -139,7 +139,8 @@ module GHC.Base
) where
import GHC.Internal.Base
-import GHC.Prim hiding (dataToTagLarge#, dataToTagSmall#, whereFrom#)
+import GHC.Prim hiding (dataToTagLarge#, dataToTagSmall#, whereFrom#,
+ isByteArrayWeaklyPinned#, isMutableByteArrayWeaklyPinned#)
-- Hide dataToTagLarge# because it is expected to break for
-- GHC-internal reasons in the near future, and shouldn't
-- be exposed from base (not even GHC.Exts)
=====================================
libraries/base/src/GHC/Exts.hs
=====================================
@@ -112,7 +112,9 @@ module GHC.Exts
import GHC.Internal.Exts
import GHC.Internal.ArrayArray
-import GHC.Prim hiding ( coerce, dataToTagSmall#, dataToTagLarge#, whereFrom# )
+import GHC.Prim hiding (
+ coerce, dataToTagSmall#, dataToTagLarge#, whereFrom#,
+ isByteArrayWeaklyPinned#, isMutableByteArrayWeaklyPinned# )
-- Hide dataToTag# ops because they are expected to break for
-- GHC-internal reasons in the near future, and shouldn't
-- be exposed from base (not even GHC.Exts)
=====================================
rts/PrimOps.cmm
=====================================
@@ -215,12 +215,29 @@ stg_isByteArrayPinnedzh ( gcptr ba )
return (flags & BF_PINNED != 0);
}
+stg_isByteArrayWeaklyPinnedzh ( gcptr ba )
+// ByteArray# s -> Int#
+{
+ W_ bd, flags;
+ bd = Bdescr(ba);
+ // See #22255 and the primop docs.
+ flags = TO_W_(bdescr_flags(bd));
+
+ return (flags & (BF_PINNED | BF_COMPACT | BF_LARGE) != 0);
+}
+
stg_isMutableByteArrayPinnedzh ( gcptr mba )
// MutableByteArray# s -> Int#
{
jump stg_isByteArrayPinnedzh(mba);
}
+stg_isMutableByteArrayWeaklyPinnedzh ( gcptr mba )
+// MutableByteArray# s -> Int#
+{
+ jump stg_isByteArrayWeaklyPinnedzh(mba);
+}
+
/* Note [LDV profiling and resizing arrays]
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
* As far as the LDV profiler is concerned arrays are "inherently used" which
=====================================
rts/RtsSymbols.c
=====================================
@@ -656,6 +656,8 @@ extern char **environ;
SymI_HasDataProto(stg_newAlignedPinnedByteArrayzh) \
SymI_HasDataProto(stg_isByteArrayPinnedzh) \
SymI_HasDataProto(stg_isMutableByteArrayPinnedzh) \
+ SymI_HasDataProto(stg_isByteArrayWeaklyPinnedzh) \
+ SymI_HasDataProto(stg_isMutableByteArrayWeaklyPinnedzh) \
SymI_HasDataProto(stg_shrinkMutableByteArrayzh) \
SymI_HasDataProto(stg_resizzeMutableByteArrayzh) \
SymI_HasDataProto(stg_shrinkSmallMutableArrayzh) \
=====================================
rts/include/stg/MiscClosures.h
=====================================
@@ -454,6 +454,8 @@ RTS_FUN_DECL(stg_newPinnedByteArrayzh);
RTS_FUN_DECL(stg_newAlignedPinnedByteArrayzh);
RTS_FUN_DECL(stg_isByteArrayPinnedzh);
RTS_FUN_DECL(stg_isMutableByteArrayPinnedzh);
+RTS_FUN_DECL(stg_isByteArrayWeaklyPinnedzh);
+RTS_FUN_DECL(stg_isMutableByteArrayWeaklyPinnedzh);
RTS_FUN_DECL(stg_shrinkMutableByteArrayzh);
RTS_FUN_DECL(stg_resizzeMutableByteArrayzh);
RTS_FUN_DECL(stg_shrinkSmallMutableArrayzh);
=====================================
testsuite/tests/rts/T13894.hs
=====================================
@@ -16,3 +16,10 @@ main = do
case isMutableByteArrayPinned# arr# of
n# -> (# s1, isTrue# n# #)
when pinned $ putStrLn "BAD"
+
+ weakly_pinned <- IO $ \s0 ->
+ case newByteArray# 1000000# s0 of
+ (# s1, arr# #) ->
+ case isMutableByteArrayWeaklyPinned# arr# of
+ n# -> (# s1, isTrue# n# #)
+ when (not weakly_pinned) $ putStrLn "BAD"
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/dbfb046ec882221f3648695fc407bd12c5e76a87
--
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/dbfb046ec882221f3648695fc407bd12c5e76a87
You're receiving this email because of your account on gitlab.haskell.org.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20240821/858b09eb/attachment-0001.html>
More information about the ghc-commits
mailing list