[Git][ghc/ghc][master] 2 commits: Add a few more memcpy-ish primops
Marge Bot (@marge-bot)
gitlab at gitlab.haskell.org
Tue Apr 4 05:04:39 UTC 2023
Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC
Commits:
9095e297 by Matthew Craven at 2023-04-04T01:04:10-04:00
Add a few more memcpy-ish primops
* copyMutableByteArrayNonOverlapping#
* copyAddrToAddr#
* copyAddrToAddrNonOverlapping#
* setAddrRange#
The implementations of copyBytes, moveBytes, and fillBytes
in base:Foreign.Marshal.Utils now use these new primops,
which can cause us to work a bit harder generating code for them,
resulting in the metric increase in T21839c observed by CI on
some architectures. But in exchange, we get better code!
Metric Increase:
T21839c
- - - - -
f7da530c by Matthew Craven at 2023-04-04T01:04:10-04:00
StgToCmm: Upgrade -fcheck-prim-bounds behavior
Fixes #21054. Additionally, we can now check for range overlap
when generating Cmm for primops that use memcpy internally.
- - - - -
20 changed files:
- compiler/GHC/Builtin/primops.txt.pp
- compiler/GHC/Cmm/CLabel.hs
- compiler/GHC/StgToCmm/Foreign.hs
- compiler/GHC/StgToCmm/Prim.hs
- compiler/GHC/StgToJS/Prim.hs
- libraries/base/Data/Array/Byte.hs
- libraries/base/Foreign/Marshal/Utils.hs
- libraries/ghc-prim/changelog.md
- rts/RtsMessages.c
- + testsuite/tests/codeGen/should_fail/CheckBoundsCompareByteArray2.hs
- + testsuite/tests/codeGen/should_fail/CheckBoundsCompareByteArray3.hs
- + testsuite/tests/codeGen/should_fail/CheckBoundsReadInt64Array.hs
- + testsuite/tests/codeGen/should_fail/CheckBoundsReadSmallArray.hs
- + testsuite/tests/codeGen/should_fail/CheckBoundsReadWord64Array.hs
- + testsuite/tests/codeGen/should_fail/CheckBoundsReadWord8ArrayAsWord32.hs
- + testsuite/tests/codeGen/should_fail/CheckOverlapCopyAddrToByteArray.hs
- + testsuite/tests/codeGen/should_fail/CheckOverlapCopyByteArray.hs
- testsuite/tests/codeGen/should_fail/all.T
- + testsuite/tests/codeGen/should_run/CheckBoundsOK.hs
- testsuite/tests/codeGen/should_run/all.T
Changes:
=====================================
compiler/GHC/Builtin/primops.txt.pp
=====================================
@@ -1890,13 +1890,14 @@ primop CompareByteArraysOp "compareByteArrays#" GenPrimOp
primop CopyByteArrayOp "copyByteArray#" GenPrimOp
ByteArray# -> Int# -> MutableByteArray# s -> Int# -> Int# -> State# s -> State# s
- {@'copyByteArray#' src src_ofs dst dst_ofs n@ copies the range
- starting at offset @src_ofs@ of length @n@ from the
- 'ByteArray#' @src@ to the 'MutableByteArray#' @dst@
- starting at offset @dst_ofs at . Both arrays must fully contain
- the specified ranges, but this is not checked. The two arrays must
- not be the same array in different states, but this is not checked
- either.}
+ { @'copyByteArray#' src src_ofs dst dst_ofs len@ copies the range
+ starting at offset @src_ofs@ of length @len@ from the
+ 'ByteArray#' @src@ to the 'MutableByteArray#' @dst@
+ starting at offset @dst_ofs at . Both arrays must fully contain
+ the specified ranges, but this is not checked. The two arrays must
+ not be the same array in different states, but this is not checked
+ either.
+ }
with
has_side_effects = True
code_size = { primOpCodeSizeForeignCall + 4}
@@ -1904,10 +1905,30 @@ primop CopyByteArrayOp "copyByteArray#" GenPrimOp
primop CopyMutableByteArrayOp "copyMutableByteArray#" GenPrimOp
MutableByteArray# s -> Int# -> MutableByteArray# s -> Int# -> Int# -> State# s -> State# s
- {Copy a range of the first MutableByteArray\# to the specified region in the second MutableByteArray\#.
- Both arrays must fully contain the specified ranges, but this is not checked. The regions are
- allowed to overlap, although this is only possible when the same array is provided
- as both the source and the destination.}
+ { @'copyMutableByteArray#' src src_ofs dst dst_ofs len@ copies the
+ range starting at offset @src_ofs@ of length @len@ from the
+ 'MutableByteArray#' @src@ to the 'MutableByteArray#' @dst@
+ starting at offset @dst_ofs at . Both arrays must fully contain the
+ specified ranges, but this is not checked. The regions are
+ allowed to overlap, although this is only possible when the same
+ array is provided as both the source and the destination.
+ }
+ with
+ has_side_effects = True
+ code_size = { primOpCodeSizeForeignCall + 4 }
+ can_fail = True
+
+primop CopyMutableByteArrayNonOverlappingOp "copyMutableByteArrayNonOverlapping#" GenPrimOp
+ MutableByteArray# s -> Int# -> MutableByteArray# s -> Int# -> Int# -> State# s -> State# s
+ { @'copyMutableByteArrayNonOverlapping#' src src_ofs dst dst_ofs len@
+ copies the range starting at offset @src_ofs@ of length @len@ from
+ the 'MutableByteArray#' @src@ to the 'MutableByteArray#' @dst@
+ starting at offset @dst_ofs at . Both arrays must fully contain the
+ specified ranges, but this is not checked. The regions are /not/
+ allowed to overlap, but this is also not checked.
+
+ @since 0.11.0
+ }
with
has_side_effects = True
code_size = { primOpCodeSizeForeignCall + 4 }
@@ -1922,7 +1943,7 @@ primop CopyByteArrayToAddrOp "copyByteArrayToAddr#" GenPrimOp
either.}
with
has_side_effects = True
- code_size = { primOpCodeSizeForeignCall + 4}
+ code_size = { primOpCodeSizeForeignCall + 4 }
can_fail = True
primop CopyMutableByteArrayToAddrOp "copyMutableByteArrayToAddr#" GenPrimOp
@@ -1934,7 +1955,7 @@ primop CopyMutableByteArrayToAddrOp "copyMutableByteArrayToAddr#" GenPrimOp
pinned), but this is not checked either.}
with
has_side_effects = True
- code_size = { primOpCodeSizeForeignCall + 4}
+ code_size = { primOpCodeSizeForeignCall + 4 }
can_fail = True
primop CopyAddrToByteArrayOp "copyAddrToByteArray#" GenPrimOp
@@ -1946,7 +1967,38 @@ primop CopyAddrToByteArrayOp "copyAddrToByteArray#" GenPrimOp
but this is not checked either.}
with
has_side_effects = True
- code_size = { primOpCodeSizeForeignCall + 4}
+ code_size = { primOpCodeSizeForeignCall + 4 }
+ can_fail = True
+
+primop CopyAddrToAddrOp "copyAddrToAddr#" GenPrimOp
+ Addr# -> Addr# -> Int# -> State# RealWorld -> State# RealWorld
+ { @'copyAddrToAddr#' src dest len@ copies @len@ bytes
+ from @src@ to @dest at . These two memory ranges are allowed to overlap.
+
+ Analogous to the standard C function @memmove@, but with a different
+ argument order.
+
+ @since 0.11.0
+ }
+ with
+ has_side_effects = True
+ code_size = { primOpCodeSizeForeignCall }
+ can_fail = True
+
+primop CopyAddrToAddrNonOverlappingOp "copyAddrToAddrNonOverlapping#" GenPrimOp
+ Addr# -> Addr# -> Int# -> State# RealWorld -> State# RealWorld
+ { @'copyAddrToAddrNonOverlapping#' src dest len@ copies @len@ bytes
+ from @src@ to @dest at . As the name suggests, these two memory ranges
+ /must not overlap/, although this pre-condition is not checked.
+
+ Analogous to the standard C function @memcpy@, but with a different
+ argument order.
+
+ @since 0.11.0
+ }
+ with
+ has_side_effects = True
+ code_size = { primOpCodeSizeForeignCall }
can_fail = True
primop SetByteArrayOp "setByteArray#" GenPrimOp
@@ -1958,6 +2010,21 @@ primop SetByteArrayOp "setByteArray#" GenPrimOp
code_size = { primOpCodeSizeForeignCall + 4 }
can_fail = True
+primop SetAddrRangeOp "setAddrRange#" GenPrimOp
+ Addr# -> Int# -> Int# -> State# RealWorld -> State# RealWorld
+ { @'setAddrRange#' dest len c@ sets all of the bytes in
+ @[dest, dest+len)@ to the value @c at .
+
+ Analogous to the standard C function @memset@, but with a different
+ argument order.
+
+ @since 0.11.0
+ }
+ with
+ has_side_effects = True
+ code_size = { primOpCodeSizeForeignCall }
+ can_fail = True
+
-- Atomic operations
primop AtomicReadByteArrayOp_Int "atomicReadIntArray#" GenPrimOp
=====================================
compiler/GHC/Cmm/CLabel.hs
=====================================
@@ -65,6 +65,7 @@ module GHC.Cmm.CLabel (
mkSMAP_DIRTY_infoLabel,
mkBadAlignmentLabel,
mkOutOfBoundsAccessLabel,
+ mkMemcpyRangeOverlapLabel,
mkArrWords_infoLabel,
mkSRTInfoLabel,
@@ -649,7 +650,8 @@ mkDirty_MUT_VAR_Label,
mkCAFBlackHoleInfoTableLabel,
mkSMAP_FROZEN_CLEAN_infoLabel, mkSMAP_FROZEN_DIRTY_infoLabel,
mkSMAP_DIRTY_infoLabel, mkBadAlignmentLabel,
- mkOutOfBoundsAccessLabel, mkMUT_VAR_CLEAN_infoLabel :: CLabel
+ mkOutOfBoundsAccessLabel, mkMemcpyRangeOverlapLabel,
+ mkMUT_VAR_CLEAN_infoLabel :: CLabel
mkDirty_MUT_VAR_Label = mkForeignLabel (fsLit "dirty_MUT_VAR") Nothing ForeignLabelInExternalPackage IsFunction
mkNonmovingWriteBarrierEnabledLabel
= CmmLabel rtsUnitId (NeedExternDecl False) (fsLit "nonmoving_write_barrier_enabled") CmmData
@@ -667,7 +669,8 @@ mkSMAP_FROZEN_CLEAN_infoLabel = CmmLabel rtsUnitId (NeedExternDecl False) (fsL
mkSMAP_FROZEN_DIRTY_infoLabel = CmmLabel rtsUnitId (NeedExternDecl False) (fsLit "stg_SMALL_MUT_ARR_PTRS_FROZEN_DIRTY") CmmInfo
mkSMAP_DIRTY_infoLabel = CmmLabel rtsUnitId (NeedExternDecl False) (fsLit "stg_SMALL_MUT_ARR_PTRS_DIRTY") CmmInfo
mkBadAlignmentLabel = CmmLabel rtsUnitId (NeedExternDecl False) (fsLit "stg_badAlignment") CmmEntry
-mkOutOfBoundsAccessLabel = mkForeignLabel (fsLit "rtsOutOfBoundsAccess") Nothing ForeignLabelInExternalPackage IsFunction
+mkOutOfBoundsAccessLabel = mkForeignLabel (fsLit "rtsOutOfBoundsAccess") Nothing ForeignLabelInExternalPackage IsFunction
+mkMemcpyRangeOverlapLabel = mkForeignLabel (fsLit "rtsMemcpyRangeOverlap") Nothing ForeignLabelInExternalPackage IsFunction
mkMUT_VAR_CLEAN_infoLabel = CmmLabel rtsUnitId (NeedExternDecl False) (fsLit "stg_MUT_VAR_CLEAN") CmmInfo
mkSRTInfoLabel :: Int -> CLabel
=====================================
compiler/GHC/StgToCmm/Foreign.hs
=====================================
@@ -8,7 +8,9 @@
module GHC.StgToCmm.Foreign (
cgForeignCall,
- emitPrimCall, emitCCall,
+ emitPrimCall,
+ emitCCall,
+ emitCCallNeverReturns,
emitForeignCall,
emitSaveThreadState,
saveThreadState,
@@ -194,17 +196,31 @@ continuation, resulting in just one proc point instead of two. Yay!
-}
-emitCCall :: [(CmmFormal,ForeignHint)]
- -> CmmExpr
- -> [(CmmActual,ForeignHint)]
- -> FCode ()
-emitCCall hinted_results fn hinted_args
+emitCCall' :: CmmReturnInfo
+ -> [(CmmFormal,ForeignHint)]
+ -> CmmExpr
+ -> [(CmmActual,ForeignHint)]
+ -> FCode ()
+emitCCall' ret_info hinted_results fn hinted_args
= void $ emitForeignCall PlayRisky results target args
where
(args, arg_hints) = unzip hinted_args
(results, result_hints) = unzip hinted_results
target = ForeignTarget fn fc
- fc = ForeignConvention CCallConv arg_hints result_hints CmmMayReturn
+ fc = ForeignConvention CCallConv arg_hints result_hints ret_info
+
+emitCCall :: [(CmmFormal,ForeignHint)]
+ -> CmmExpr
+ -> [(CmmActual,ForeignHint)]
+ -> FCode ()
+emitCCall = emitCCall' CmmMayReturn
+
+emitCCallNeverReturns
+ :: [(CmmFormal,ForeignHint)]
+ -> CmmExpr
+ -> [(CmmActual,ForeignHint)]
+ -> FCode ()
+emitCCallNeverReturns = emitCCall' CmmNeverReturns
emitPrimCall :: [CmmFormal] -> CallishMachOp -> [CmmActual] -> FCode ()
=====================================
compiler/GHC/StgToCmm/Prim.hs
=====================================
@@ -311,7 +311,7 @@ emitPrimOp cfg primop =
-- #define sizzeofByteArrayzh(r,a) \
-- r = ((StgArrBytes *)(a))->bytes
SizeofByteArrayOp -> \[arg] -> opIntoRegs $ \[res] ->
- emit $ mkAssign (CmmLocal res) (cmmLoadIndexW platform arg (fixedHdrSizeW profile) (bWord platform))
+ emitAssign (CmmLocal res) (byteArraySize platform profile arg)
-- #define sizzeofMutableByteArrayzh(r,a) \
-- r = ((StgArrBytes *)(a))->bytes
@@ -320,7 +320,7 @@ emitPrimOp cfg primop =
-- #define getSizzeofMutableByteArrayzh(r,a) \
-- r = ((StgArrBytes *)(a))->bytes
GetSizeofMutableByteArrayOp -> \[arg] -> opIntoRegs $ \[res] ->
- emitAssign (CmmLocal res) (cmmLoadIndexW platform arg (fixedHdrSizeW profile) (bWord platform))
+ emitAssign (CmmLocal res) (byteArraySize platform profile arg)
-- #define touchzh(o) /* nothing */
@@ -394,15 +394,10 @@ emitPrimOp cfg primop =
-- Getting the size of pointer arrays
SizeofArrayOp -> \[arg] -> opIntoRegs $ \[res] ->
- emit $ mkAssign (CmmLocal res) (cmmLoadIndexW platform arg
- (fixedHdrSizeW profile + bytesToWordsRoundUp platform (pc_OFFSET_StgMutArrPtrs_ptrs (platformConstants platform)))
- (bWord platform))
+ emitAssign (CmmLocal res) (ptrArraySize platform profile arg)
SizeofMutableArrayOp -> emitPrimOp cfg SizeofArrayOp
SizeofSmallArrayOp -> \[arg] -> opIntoRegs $ \[res] ->
- emit $ mkAssign (CmmLocal res)
- (cmmLoadIndexW platform arg
- (fixedHdrSizeW profile + bytesToWordsRoundUp platform (pc_OFFSET_StgSmallMutArrPtrs_ptrs (platformConstants platform)))
- (bWord platform))
+ emitAssign (CmmLocal res) (smallPtrArraySize platform profile arg)
SizeofSmallMutableArrayOp -> emitPrimOp cfg SizeofSmallArrayOp
GetSizeofSmallMutableArrayOp -> emitPrimOp cfg SizeofSmallArrayOp
@@ -715,14 +710,22 @@ emitPrimOp cfg primop =
doCopyByteArrayOp src src_off dst dst_off n
CopyMutableByteArrayOp -> \[src,src_off,dst,dst_off,n] -> opIntoRegs $ \[] ->
doCopyMutableByteArrayOp src src_off dst dst_off n
+ CopyMutableByteArrayNonOverlappingOp -> \[src,src_off,dst,dst_off,n] -> opIntoRegs $ \[] ->
+ doCopyMutableByteArrayNonOverlappingOp src src_off dst dst_off n
CopyByteArrayToAddrOp -> \[src,src_off,dst,n] -> opIntoRegs $ \[] ->
doCopyByteArrayToAddrOp src src_off dst n
CopyMutableByteArrayToAddrOp -> \[src,src_off,dst,n] -> opIntoRegs $ \[] ->
doCopyMutableByteArrayToAddrOp src src_off dst n
CopyAddrToByteArrayOp -> \[src,dst,dst_off,n] -> opIntoRegs $ \[] ->
doCopyAddrToByteArrayOp src dst dst_off n
+ CopyAddrToAddrOp -> \[src,dst,n] -> opIntoRegs $ \[] ->
+ doCopyAddrToAddrOp src dst n
+ CopyAddrToAddrNonOverlappingOp -> \[src,dst,n] -> opIntoRegs $ \[] ->
+ doCopyAddrToAddrNonOverlappingOp src dst n
SetByteArrayOp -> \[ba,off,len,c] -> opIntoRegs $ \[] ->
doSetByteArrayOp ba off len c
+ SetAddrRangeOp -> \[dst,len,c] -> opIntoRegs $ \[] ->
+ doSetAddrRangeOp dst len c
-- Comparing byte arrays
CompareByteArraysOp -> \[ba1,ba1_off,ba2,ba2_off,n] -> opIntoRegs $ \[res] ->
@@ -2097,8 +2100,8 @@ doWriteOffAddrOp :: Maybe MachOp
-> [LocalReg]
-> [CmmExpr]
-> FCode ()
-doWriteOffAddrOp maybe_pre_write_cast idx_ty [] [addr,idx,val]
- = mkBasicIndexedWrite 0 maybe_pre_write_cast addr idx_ty idx val
+doWriteOffAddrOp castOp idx_ty [] [addr,idx, val]
+ = mkBasicIndexedWrite 0 addr idx_ty idx (maybeCast castOp val)
doWriteOffAddrOp _ _ _ _
= panic "GHC.StgToCmm.Prim: doWriteOffAddrOp"
@@ -2107,11 +2110,12 @@ doWriteByteArrayOp :: Maybe MachOp
-> [LocalReg]
-> [CmmExpr]
-> FCode ()
-doWriteByteArrayOp maybe_pre_write_cast idx_ty [] [addr,idx,val]
+doWriteByteArrayOp castOp idx_ty [] [addr,idx, rawVal]
= do profile <- getProfile
platform <- getPlatform
+ let val = maybeCast castOp rawVal
doByteArrayBoundsCheck idx addr idx_ty (cmmExprType platform val)
- mkBasicIndexedWrite (arrWordsHdrSize profile) maybe_pre_write_cast addr idx_ty idx val
+ mkBasicIndexedWrite (arrWordsHdrSize profile) addr idx_ty idx val
doWriteByteArrayOp _ _ _ _
= panic "GHC.StgToCmm.Prim: doWriteByteArrayOp"
@@ -2134,7 +2138,7 @@ doWritePtrArrayOp addr idx val
-- referred to by val have happened before we write val into the array.
-- See #12469 for details.
emitPrimCall [] MO_WriteBarrier []
- mkBasicIndexedWrite hdr_size Nothing addr ty idx val
+ mkBasicIndexedWrite hdr_size addr ty idx val
emit (setInfo addr (CmmLit (CmmLabel mkMAP_DIRTY_infoLabel)))
-- the write barrier. We must write a byte into the mark table:
@@ -2142,15 +2146,10 @@ doWritePtrArrayOp addr idx val
emit $ mkStore (
cmmOffsetExpr platform
(cmmOffsetExprW platform (cmmOffsetB platform addr hdr_size)
- (loadArrPtrsSize profile addr))
+ (ptrArraySize platform profile addr))
(CmmMachOp (mo_wordUShr platform) [idx, mkIntExpr platform (pc_MUT_ARR_PTRS_CARD_BITS (platformConstants platform))])
) (CmmLit (CmmInt 1 W8))
-loadArrPtrsSize :: Profile -> CmmExpr -> CmmExpr
-loadArrPtrsSize profile addr = cmmLoadBWord platform (cmmOffsetB platform addr off)
- where off = fixedHdrSize profile + pc_OFFSET_StgMutArrPtrs_ptrs (profileConstants profile)
- platform = profilePlatform profile
-
mkBasicIndexedRead :: AlignmentSpec
-> ByteOff -- Initial offset in bytes
-> Maybe MachOp -- Optional result cast
@@ -2169,18 +2168,15 @@ mkBasicIndexedRead alignment off (Just cast) ty res base idx_ty idx
cmmLoadIndexOffExpr platform alignment off ty base idx_ty idx])
mkBasicIndexedWrite :: ByteOff -- Initial offset in bytes
- -> Maybe MachOp -- Optional value cast
-> CmmExpr -- Base address
-> CmmType -- Type of element by which we are indexing
-> CmmExpr -- Index
-> CmmExpr -- Value to write
-> FCode ()
-mkBasicIndexedWrite off Nothing base idx_ty idx val
+mkBasicIndexedWrite off base idx_ty idx val
= do platform <- getPlatform
let alignment = alignmentFromTypes (cmmExprType platform val) idx_ty
emitStore' alignment (cmmIndexOffExpr platform off (typeWidth idx_ty) base idx) val
-mkBasicIndexedWrite off (Just cast) base idx_ty idx val
- = mkBasicIndexedWrite off Nothing base idx_ty idx (CmmMachOp cast [val])
-- ----------------------------------------------------------------------------
-- Misc utils
@@ -2208,6 +2204,30 @@ cmmLoadIndexOffExpr platform alignment off ty base idx_ty idx
setInfo :: CmmExpr -> CmmExpr -> CmmAGraph
setInfo closure_ptr info_ptr = mkStore closure_ptr info_ptr
+maybeCast :: Maybe MachOp -> CmmExpr -> CmmExpr
+maybeCast Nothing val = val
+maybeCast (Just cast) val = CmmMachOp cast [val]
+
+ptrArraySize :: Platform -> Profile -> CmmExpr -> CmmExpr
+ptrArraySize platform profile arr =
+ cmmLoadBWord platform (cmmOffsetB platform arr sz_off)
+ where sz_off = fixedHdrSize profile
+ + pc_OFFSET_StgMutArrPtrs_ptrs (platformConstants platform)
+
+smallPtrArraySize :: Platform -> Profile -> CmmExpr -> CmmExpr
+smallPtrArraySize platform profile arr =
+ cmmLoadBWord platform (cmmOffsetB platform arr sz_off)
+ where sz_off = fixedHdrSize profile
+ + pc_OFFSET_StgSmallMutArrPtrs_ptrs (platformConstants platform)
+
+byteArraySize :: Platform -> Profile -> CmmExpr -> CmmExpr
+byteArraySize platform profile arr =
+ cmmLoadBWord platform (cmmOffsetB platform arr sz_off)
+ where sz_off = fixedHdrSize profile
+ + pc_OFFSET_StgArrBytes_bytes (platformConstants platform)
+
+
+
------------------------------------------------------------------------------
-- Helpers for translating vector primops.
@@ -2453,10 +2473,9 @@ doCompareByteArraysOp res ba1 ba1_off ba2 ba2_off n = do
profile <- getProfile
platform <- getPlatform
- ifNonZero n $ do
- let last_touched_idx off = cmmOffsetB platform (cmmAddWord platform off n) (-1)
- doByteArrayBoundsCheck (last_touched_idx ba1_off) ba1 b8 b8
- doByteArrayBoundsCheck (last_touched_idx ba2_off) ba2 b8 b8
+ whenCheckBounds $ ifNonZero n $ do
+ emitRangeBoundsCheck ba1_off n (byteArraySize platform profile ba1)
+ emitRangeBoundsCheck ba2_off n (byteArraySize platform profile ba2)
ba1_p <- assignTempE $ cmmOffsetExpr platform (cmmOffsetB platform ba1 (arrWordsHdrSize profile)) ba1_off
ba2_p <- assignTempE $ cmmOffsetExpr platform (cmmOffsetB platform ba2 (arrWordsHdrSize profile)) ba2_off
@@ -2519,7 +2538,7 @@ doCopyByteArrayOp = emitCopyByteArray copy
-- Copy data (we assume the arrays aren't overlapping since
-- they're of different types)
copy _src _dst dst_p src_p bytes align =
- emitMemcpyCall dst_p src_p bytes align
+ emitCheckedMemcpyCall dst_p src_p bytes align
-- | Takes a source 'MutableByteArray#', an offset in the source
-- array, a destination 'MutableByteArray#', an offset into the
@@ -2540,6 +2559,19 @@ doCopyMutableByteArrayOp = emitCopyByteArray copy
(getCode $ emitMemcpyCall dst_p src_p bytes align)
emit =<< mkCmmIfThenElse (cmmEqWord platform src dst) moveCall cpyCall
+-- | Takes a source 'MutableByteArray#', an offset in the source
+-- array, a destination 'MutableByteArray#', an offset into the
+-- destination array, and the number of bytes to copy. Copies the
+-- given number of bytes from the source array to the destination
+-- array. Assumes the two ranges are disjoint
+doCopyMutableByteArrayNonOverlappingOp :: CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr
+ -> FCode ()
+doCopyMutableByteArrayNonOverlappingOp = emitCopyByteArray copy
+ where
+ copy _src _dst dst_p src_p bytes align = do
+ emitCheckedMemcpyCall dst_p src_p bytes align
+
+
emitCopyByteArray :: (CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr
-> Alignment -> FCode ())
-> CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr
@@ -2548,10 +2580,9 @@ emitCopyByteArray copy src src_off dst dst_off n = do
profile <- getProfile
platform <- getPlatform
- ifNonZero n $ do
- let last_touched_idx off = cmmOffsetB platform (cmmAddWord platform off n) (-1)
- doByteArrayBoundsCheck (last_touched_idx src_off) src b8 b8
- doByteArrayBoundsCheck (last_touched_idx dst_off) dst b8 b8
+ whenCheckBounds $ ifNonZero n $ do
+ emitRangeBoundsCheck src_off n (byteArraySize platform profile src)
+ emitRangeBoundsCheck dst_off n (byteArraySize platform profile dst)
let byteArrayAlignment = wordAlignment platform
srcOffAlignment = cmmExprAlignment src_off
@@ -2569,11 +2600,10 @@ doCopyByteArrayToAddrOp src src_off dst_p bytes = do
-- Use memcpy (we are allowed to assume the arrays aren't overlapping)
profile <- getProfile
platform <- getPlatform
- ifNonZero bytes $ do
- let last_touched_idx off = cmmOffsetB platform (cmmAddWord platform off bytes) (-1)
- doByteArrayBoundsCheck (last_touched_idx src_off) src b8 b8
+ whenCheckBounds $ ifNonZero bytes $ do
+ emitRangeBoundsCheck src_off bytes (byteArraySize platform profile src)
src_p <- assignTempE $ cmmOffsetExpr platform (cmmOffsetB platform src (arrWordsHdrSize profile)) src_off
- emitMemcpyCall dst_p src_p bytes (mkAlignment 1)
+ emitCheckedMemcpyCall dst_p src_p bytes (mkAlignment 1)
-- | Takes a source 'MutableByteArray#', an offset in the source array, a
-- destination 'Addr#', and the number of bytes to copy. Copies the given
@@ -2590,33 +2620,51 @@ doCopyAddrToByteArrayOp src_p dst dst_off bytes = do
-- Use memcpy (we are allowed to assume the arrays aren't overlapping)
profile <- getProfile
platform <- getPlatform
- ifNonZero bytes $ do
- let last_touched_idx off = cmmOffsetB platform (cmmAddWord platform off bytes) (-1)
- doByteArrayBoundsCheck (last_touched_idx dst_off) dst b8 b8
+ whenCheckBounds $ ifNonZero bytes $ do
+ emitRangeBoundsCheck dst_off bytes (byteArraySize platform profile dst)
dst_p <- assignTempE $ cmmOffsetExpr platform (cmmOffsetB platform dst (arrWordsHdrSize profile)) dst_off
- emitMemcpyCall dst_p src_p bytes (mkAlignment 1)
+ emitCheckedMemcpyCall dst_p src_p bytes (mkAlignment 1)
+
+-- | Takes a source 'Addr#', a destination 'Addr#', and the number of
+-- bytes to copy. Copies the given number of bytes from the source
+-- memory region to the destination array.
+doCopyAddrToAddrOp :: CmmExpr -> CmmExpr -> CmmExpr -> FCode ()
+doCopyAddrToAddrOp src_p dst_p bytes = do
+ -- Use memmove; the ranges may overlap
+ emitMemmoveCall dst_p src_p bytes (mkAlignment 1)
+
+-- | Takes a source 'Addr#', a destination 'Addr#', and the number of
+-- bytes to copy. Copies the given number of bytes from the source
+-- memory region to the destination region. The regions may not overlap.
+doCopyAddrToAddrNonOverlappingOp :: CmmExpr -> CmmExpr -> CmmExpr -> FCode ()
+doCopyAddrToAddrNonOverlappingOp src_p dst_p bytes = do
+ -- Use memcpy; the ranges may not overlap
+ emitCheckedMemcpyCall dst_p src_p bytes (mkAlignment 1)
ifNonZero :: CmmExpr -> FCode () -> FCode ()
ifNonZero e it = do
platform <- getPlatform
let pred = cmmNeWord platform e (zeroExpr platform)
code <- getCode it
- emit =<< mkCmmIfThen' pred code (Just False)
+ emit =<< mkCmmIfThen' pred code (Just True)
+ -- This function is used for range operation bounds-checks;
+ -- Most calls to those ops will not have range length zero.
+
-- ----------------------------------------------------------------------------
-- Setting byte arrays
-- | Takes a 'MutableByteArray#', an offset into the array, a length,
-- and a byte, and sets each of the selected bytes in the array to the
--- character.
+-- given byte.
doSetByteArrayOp :: CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr
-> FCode ()
doSetByteArrayOp ba off len c = do
profile <- getProfile
platform <- getPlatform
- doByteArrayBoundsCheck off ba b8 b8
- doByteArrayBoundsCheck (cmmOffset platform (cmmAddWord platform off len) (-1)) ba b8 b8
+ whenCheckBounds $ ifNonZero len $
+ emitRangeBoundsCheck off len (byteArraySize platform profile ba)
let byteArrayAlignment = wordAlignment platform -- known since BA is allocated on heap
offsetAlignment = cmmExprAlignment off
@@ -2625,6 +2673,14 @@ doSetByteArrayOp ba off len c = do
p <- assignTempE $ cmmOffsetExpr platform (cmmOffsetB platform ba (arrWordsHdrSize profile)) off
emitMemsetCall p c len align
+-- | Takes an 'Addr#', a length, and a byte, and sets each of the
+-- selected bytes in memory to the given byte.
+doSetAddrRangeOp :: CmmExpr -> CmmExpr -> CmmExpr
+ -> FCode ()
+doSetAddrRangeOp dst len c = do
+ emitMemsetCall dst c len (mkAlignment 1)
+
+
-- ----------------------------------------------------------------------------
-- Allocating arrays
@@ -2687,7 +2743,7 @@ doCopyArrayOp = emitCopyArray copy
-- they're of different types)
copy _src _dst dst_p src_p bytes =
do platform <- getPlatform
- emitMemcpyCall dst_p src_p (mkIntExpr platform bytes)
+ emitCheckedMemcpyCall dst_p src_p (mkIntExpr platform bytes)
(wordAlignment platform)
@@ -2729,8 +2785,11 @@ emitCopyArray copy src0 src_off dst0 dst_off0 n =
dst <- assignTempE dst0
dst_off <- assignTempE dst_off0
- doPtrArrayBoundsCheck (cmmAddWord platform src_off (mkIntExpr platform n)) src
- doPtrArrayBoundsCheck (cmmAddWord platform dst_off (mkIntExpr platform n)) dst
+ whenCheckBounds $ do
+ emitRangeBoundsCheck src_off (mkIntExpr platform n)
+ (ptrArraySize platform profile src)
+ emitRangeBoundsCheck dst_off (mkIntExpr platform n)
+ (ptrArraySize platform profile dst)
-- Nonmoving collector write barrier
emitCopyUpdRemSetPush platform (arrPtrsHdrSize profile) dst dst_off n
@@ -2749,7 +2808,7 @@ emitCopyArray copy src0 src_off dst0 dst_off0 n =
-- The base address of the destination card table
dst_cards_p <- assignTempE $ cmmOffsetExprW platform dst_elems_p
- (loadArrPtrsSize profile dst)
+ (ptrArraySize platform profile dst)
emitSetCards dst_off dst_cards_p n
@@ -2761,7 +2820,7 @@ doCopySmallArrayOp = emitCopySmallArray copy
-- they're of different types)
copy _src _dst dst_p src_p bytes =
do platform <- getPlatform
- emitMemcpyCall dst_p src_p (mkIntExpr platform bytes)
+ emitCheckedMemcpyCall dst_p src_p (mkIntExpr platform bytes)
(wordAlignment platform)
@@ -2798,9 +2857,11 @@ emitCopySmallArray copy src0 src_off dst0 dst_off n =
src <- assignTempE src0
dst <- assignTempE dst0
- when (n /= 0) $ do
- doSmallPtrArrayBoundsCheck (cmmAddWord platform src_off (mkIntExpr platform n)) src
- doSmallPtrArrayBoundsCheck (cmmAddWord platform dst_off (mkIntExpr platform n)) dst
+ whenCheckBounds $ do
+ emitRangeBoundsCheck src_off (mkIntExpr platform n)
+ (smallPtrArraySize platform profile src)
+ emitRangeBoundsCheck dst_off (mkIntExpr platform n)
+ (smallPtrArraySize platform profile dst)
-- Nonmoving collector write barrier
emitCopyUpdRemSetPush platform (smallArrPtrsHdrSize profile) dst dst_off n
@@ -2895,7 +2956,7 @@ emitCloneSmallArray info_p res_r src src_off n = do
emit $ mkAssign (CmmLocal res_r) (CmmReg arr)
--- | Takes and offset in the destination array, the base address of
+-- | Takes an offset in the destination array, the base address of
-- the card table, and the number of elements affected (*not* the
-- number of cards). The number of elements may not be zero.
-- Marks the relevant cards as dirty.
@@ -2948,7 +3009,7 @@ doWriteSmallPtrArrayOp addr idx val = do
whenUpdRemSetEnabled $ emitUpdRemSetPush (CmmReg (CmmLocal tmp))
emitPrimCall [] MO_WriteBarrier [] -- #12469
- mkBasicIndexedWrite (smallArrPtrsHdrSize profile) Nothing addr ty idx val
+ mkBasicIndexedWrite (smallArrPtrsHdrSize profile) addr ty idx val
emit (setInfo addr (CmmLit (CmmLabel mkSMAP_DIRTY_infoLabel)))
------------------------------------------------------------------------------
@@ -3074,6 +3135,26 @@ emitMemcpyCall dst src n align =
(MO_Memcpy (alignmentBytes align))
[ dst, src, n ]
+-- | Emit a call to @memcpy@, but check for range
+-- overlap when -fcheck-prim-bounds is on.
+emitCheckedMemcpyCall :: CmmExpr -> CmmExpr -> CmmExpr -> Alignment -> FCode ()
+emitCheckedMemcpyCall dst src n align = do
+ whenCheckBounds (getPlatform >>= doCheck)
+ emitMemcpyCall dst src n align
+ where
+ doCheck platform = do
+ overlapCheckFailed <- getCode $
+ emitCCallNeverReturns [] (mkLblExpr mkMemcpyRangeOverlapLabel) []
+ emit =<< mkCmmIfThen' rangesOverlap overlapCheckFailed (Just False)
+ where
+ rangesOverlap = (checkDiff dst src `or` checkDiff src dst) `ne` zero
+ checkDiff p q = (p `minus` q) `uLT` n
+ or = cmmOrWord platform
+ minus = cmmSubWord platform
+ uLT = cmmULtWord platform
+ ne = cmmNeWord platform
+ zero = zeroExpr platform
+
-- | Emit a call to @memmove at .
emitMemmoveCall :: CmmExpr -> CmmExpr -> CmmExpr -> Alignment -> FCode ()
emitMemmoveCall dst src n align =
@@ -3168,50 +3249,68 @@ emitCtzCall res x width =
-- Array bounds checking
---------------------------------------------------------------------------
-doBoundsCheck :: CmmExpr -- ^ accessed index
- -> CmmExpr -- ^ array size (in elements)
- -> FCode ()
-doBoundsCheck idx sz = do
- do_bounds_check <- stgToCmmDoBoundsCheck <$> getStgToCmmConfig
- platform <- getPlatform
- when do_bounds_check (doCheck platform)
- where
- doCheck platform = do
- boundsCheckFailed <- getCode $ emitCCall [] (mkLblExpr mkOutOfBoundsAccessLabel) []
- emit =<< mkCmmIfThen' isOutOfBounds boundsCheckFailed (Just False)
- where
- uGE = cmmUGeWord platform
- and = cmmAndWord platform
- zero = zeroExpr platform
- ne = cmmNeWord platform
- isOutOfBounds = ((idx `uGE` sz) `and` (idx `ne` zero)) `ne` zero
+whenCheckBounds :: FCode () -> FCode ()
+whenCheckBounds a = do
+ config <- getStgToCmmConfig
+ case stgToCmmDoBoundsCheck config of
+ False -> pure ()
+ True -> a
--- We want to make sure that the array size computation is pushed into the
--- Opt_DoBoundsChecking check to avoid regregressing compiler performance when
--- it's disabled.
-{-# INLINE doBoundsCheck #-}
+emitBoundsCheck :: CmmExpr -- ^ accessed index
+ -> CmmExpr -- ^ array size (in elements)
+ -> FCode ()
+emitBoundsCheck idx sz = do
+ assertM (stgToCmmDoBoundsCheck <$> getStgToCmmConfig)
+ platform <- getPlatform
+ boundsCheckFailed <- getCode $
+ emitCCallNeverReturns [] (mkLblExpr mkOutOfBoundsAccessLabel) []
+ let isOutOfBounds = cmmUGeWord platform idx sz
+ emit =<< mkCmmIfThen' isOutOfBounds boundsCheckFailed (Just False)
+
+emitRangeBoundsCheck :: CmmExpr -- ^ first accessed index
+ -> CmmExpr -- ^ number of accessed indices (non-zero)
+ -> CmmExpr -- ^ array size (in elements)
+ -> FCode ()
+emitRangeBoundsCheck idx len arrSizeExpr = do
+ assertM (stgToCmmDoBoundsCheck <$> getStgToCmmConfig)
+ config <- getStgToCmmConfig
+ platform <- getPlatform
+ arrSize <- assignTempE arrSizeExpr
+ -- arrSizeExpr is probably a load we don't want to duplicate
+ rangeTooLargeReg <- newTemp (bWord platform)
+ lastSafeIndexReg <- newTemp (bWord platform)
+ _ <- withSequel (AssignTo [lastSafeIndexReg, rangeTooLargeReg] False) $
+ cmmPrimOpApp config WordSubCOp [arrSize, len] Nothing
+ boundsCheckFailed <- getCode $
+ emitCCallNeverReturns [] (mkLblExpr mkOutOfBoundsAccessLabel) []
+ let
+ rangeTooLarge = CmmReg (CmmLocal rangeTooLargeReg)
+ lastSafeIndex = CmmReg (CmmLocal lastSafeIndexReg)
+ badStartIndex = (idx `uGT` lastSafeIndex)
+ isOutOfBounds = (rangeTooLarge `or` badStartIndex) `neq` zero
+ uGT = cmmUGtWord platform
+ or = cmmOrWord platform
+ neq = cmmNeWord platform
+ zero = zeroExpr platform
+ emit =<< mkCmmIfThen' isOutOfBounds boundsCheckFailed (Just False)
doPtrArrayBoundsCheck
:: CmmExpr -- ^ accessed index (in bytes)
-> CmmExpr -- ^ pointer to @StgMutArrPtrs@
-> FCode ()
-doPtrArrayBoundsCheck idx arr = do
+doPtrArrayBoundsCheck idx arr = whenCheckBounds $ do
profile <- getProfile
platform <- getPlatform
- let sz = cmmLoadBWord platform (cmmOffset platform arr sz_off)
- sz_off = fixedHdrSize profile + pc_OFFSET_StgMutArrPtrs_ptrs (platformConstants platform)
- doBoundsCheck idx sz
+ emitBoundsCheck idx (ptrArraySize platform profile arr)
doSmallPtrArrayBoundsCheck
:: CmmExpr -- ^ accessed index (in bytes)
-> CmmExpr -- ^ pointer to @StgMutArrPtrs@
-> FCode ()
-doSmallPtrArrayBoundsCheck idx arr = do
+doSmallPtrArrayBoundsCheck idx arr = whenCheckBounds $ do
profile <- getProfile
platform <- getPlatform
- let sz = cmmLoadBWord platform (cmmOffset platform arr sz_off)
- sz_off = fixedHdrSize profile + pc_OFFSET_StgSmallMutArrPtrs_ptrs (platformConstants platform)
- doBoundsCheck idx sz
+ emitBoundsCheck idx (smallPtrArraySize platform profile arr)
doByteArrayBoundsCheck
:: CmmExpr -- ^ accessed index (in elements)
@@ -3219,18 +3318,18 @@ doByteArrayBoundsCheck
-> CmmType -- ^ indexing type
-> CmmType -- ^ element type
-> FCode ()
-doByteArrayBoundsCheck idx arr idx_ty elem_ty = do
+doByteArrayBoundsCheck idx arr idx_ty elem_ty = whenCheckBounds $ do
profile <- getProfile
platform <- getPlatform
- let sz = cmmLoadBWord platform (cmmOffset platform arr sz_off)
- sz_off = fixedHdrSize profile + pc_OFFSET_StgArrBytes_bytes (platformConstants platform)
- elem_sz = widthInBytes $ typeWidth elem_ty
- idx_sz = widthInBytes $ typeWidth idx_ty
- -- Ensure that the last byte of the access is within the array
- idx_bytes = cmmOffsetB platform
- (cmmMulWord platform idx (mkIntExpr platform idx_sz))
- (elem_sz - 1)
- doBoundsCheck idx_bytes sz
+ let elem_w = typeWidth elem_ty
+ idx_w = typeWidth idx_ty
+ elem_sz = mkIntExpr platform $ widthInBytes elem_w
+ arr_sz = byteArraySize platform profile arr
+ effective_arr_sz =
+ cmmUShrWord platform arr_sz (mkIntExpr platform (widthInLog idx_w))
+ if elem_w == idx_w
+ then emitBoundsCheck idx effective_arr_sz -- aligned => simpler check
+ else assert (idx_w == W8) (emitRangeBoundsCheck idx elem_sz arr_sz)
---------------------------------------------------------------------------
-- Pushing to the update remembered set
=====================================
compiler/GHC/StgToJS/Prim.hs
=====================================
@@ -717,15 +717,19 @@ genPrim prof bound ty op = case op of
. boundsChecked bound a2 (Add o2 (Sub n 1))
$ appS "h$copyMutableByteArray" [a1,o1,a2,o2,n]
CopyMutableByteArrayOp -> \[] xs@[_a1,_o1,_a2,_o2,_n] -> genPrim prof bound ty CopyByteArrayOp [] xs
+ CopyMutableByteArrayNonOverlappingOp -> \[] xs@[_a1,_o1,_a2,_o2,_n] -> genPrim prof bound ty CopyByteArrayOp [] xs
CopyByteArrayToAddrOp -> \[] xs@[_a1,_o1,_a2,_o2,_n] -> genPrim prof bound ty CopyByteArrayOp [] xs
CopyMutableByteArrayToAddrOp -> \[] xs@[_a1,_o1,_a2,_o2,_n] -> genPrim prof bound ty CopyByteArrayOp [] xs
CopyAddrToByteArrayOp -> \[] xs@[_ba,_bo,_aa,_ao,_n] -> genPrim prof bound ty CopyByteArrayOp [] xs
+ CopyAddrToAddrOp -> \[] xs@[_ba,_bo,_aa,_ao,_n] -> genPrim prof bound ty CopyByteArrayOp [] xs
+ CopyAddrToAddrNonOverlappingOp -> \[] xs@[_ba,_bo,_aa,_ao,_n] -> genPrim prof bound ty CopyByteArrayOp [] xs
SetByteArrayOp -> \[] [a,o,n,v] ->
PrimInline . boundsChecked bound a (Add o (Sub n 1)) $ loopBlockS zero_ (.<. n) \i ->
[ write_u8 a (Add o i) v
, postIncrS i
]
+ SetAddrRangeOp -> \[] xs@[_a,_o,_n,_v] -> genPrim prof bound ty SetByteArrayOp [] xs
AtomicReadByteArrayOp_Int -> \[r] [a,i] -> PrimInline . boundsChecked bound a (Add i 3) $ r |= read_i32 a i
AtomicWriteByteArrayOp_Int -> \[] [a,i,v] -> PrimInline . boundsChecked bound a (Add i 3) $ write_i32 a i v
=====================================
libraries/base/Data/Array/Byte.hs
=====================================
@@ -134,7 +134,7 @@ unsafeCopyByteArray (MutableByteArray dst#) (I# doff#) (ByteArray src#) (I# soff
-- | Copy a slice from one mutable byte array to another
-- or to the same mutable byte array.
--
--- /Note:/ this function does not do bounds checking.
+-- /Note:/ this function does not do bounds or overlap checking.
unsafeCopyMutableByteArray
:: MutableByteArray s -- ^ destination array
-> Int -- ^ offset into destination array
@@ -144,7 +144,7 @@ unsafeCopyMutableByteArray
-> ST s ()
{-# INLINE unsafeCopyMutableByteArray #-}
unsafeCopyMutableByteArray (MutableByteArray dst#) (I# doff#) (MutableByteArray src#) (I# soff#) (I# sz#) =
- ST (\s# -> case copyMutableByteArray# src# soff# dst# doff# sz# s# of
+ ST (\s# -> case copyMutableByteArrayNonOverlapping# src# soff# dst# doff# sz# s# of
s'# -> (# s'#, () #))
-- | @since 4.17.0.0
=====================================
libraries/base/Foreign/Marshal/Utils.hs
=====================================
@@ -1,6 +1,9 @@
{-# LANGUAGE Trustworthy #-}
{-# LANGUAGE NoImplicitPrelude #-}
+{-# LANGUAGE MagicHash #-}
+{-# LANGUAGE UnboxedTuples #-}
+
-----------------------------------------------------------------------------
-- |
-- Module : Foreign.Marshal.Utils
@@ -50,13 +53,11 @@ module Foreign.Marshal.Utils (
) where
import Data.Maybe
-import Foreign.Ptr ( Ptr, nullPtr )
+import GHC.Ptr ( Ptr(..), nullPtr )
import Foreign.Storable ( Storable(poke) )
-import Foreign.C.Types ( CSize(..), CInt(..) )
import Foreign.Marshal.Alloc ( malloc, alloca )
-import Data.Word ( Word8 )
+import GHC.Word ( Word8(..) )
-import GHC.Real ( fromIntegral )
import GHC.Num
import GHC.Base
@@ -158,9 +159,8 @@ copyBytes
-> Ptr a -- ^ Source
-> Int -- ^ Size in bytes
-> IO ()
-copyBytes dest src size = do
- _ <- memcpy dest src (fromIntegral size)
- return ()
+copyBytes = coerce $ \(Ptr dest#) (Ptr src#) (I# size#) s
+ -> (# copyAddrToAddrNonOverlapping# src# dest# size# s, () #)
-- |Copies the given number of bytes from the second area (source) into the
-- first (destination); the copied areas /may/ overlap
@@ -170,9 +170,8 @@ moveBytes
-> Ptr a -- ^ Source
-> Int -- ^ Size in bytes
-> IO ()
-moveBytes dest src size = do
- _ <- memmove dest src (fromIntegral size)
- return ()
+moveBytes = coerce $ \(Ptr dest#) (Ptr src#) (I# size#) s
+ -> (# copyAddrToAddr# src# dest# size# s, () #)
-- Filling up memory area with required values
-- -------------------------------------------
@@ -180,16 +179,6 @@ moveBytes dest src size = do
-- |Fill a given number of bytes in memory area with a byte value.
--
-- @since 4.8.0.0
-fillBytes :: Ptr a -> Word8 -> Int -> IO ()
-fillBytes dest char size = do
- _ <- memset dest (fromIntegral char) (fromIntegral size)
- return ()
-
--- auxiliary routines
--- -------------------
-
--- |Basic C routines needed for memory copying
---
-foreign import ccall unsafe "string.h" memcpy :: Ptr a -> Ptr a -> CSize -> IO (Ptr a)
-foreign import ccall unsafe "string.h" memmove :: Ptr a -> Ptr a -> CSize -> IO (Ptr a)
-foreign import ccall unsafe "string.h" memset :: Ptr a -> CInt -> CSize -> IO (Ptr a)
+fillBytes :: Ptr a -> Word8 -> Int -> IO ()
+fillBytes = coerce $ \(Ptr dest#) (W8# byte#) (I# size#) s
+ -> (# setAddrRange# dest# size# (word2Int# (word8ToWord# byte#)) s, () #)
=====================================
libraries/ghc-prim/changelog.md
=====================================
@@ -14,6 +14,13 @@
- `sameMutVar#`, `sameTVar#`, `sameMVar#`
- `sameIOPort#`, `eqStableName#`.
+- Several new primops were added:
+
+ - `copyMutableByteArrayNonOverlapping#`
+ - `copyAddrToAddr#`
+ - `copyAddrToAddrNonOverlapping#`
+ - `setAddrRange#`
+
## 0.10.0
- Shipped with GHC 9.6.1
=====================================
rts/RtsMessages.c
=====================================
@@ -338,3 +338,12 @@ rtsOutOfBoundsAccess()
{
barf("Encountered out of bounds array access.");
}
+
+// Used by code generator
+void rtsMemcpyRangeOverlap(void) STG_NORETURN;
+
+void
+rtsMemcpyRangeOverlap()
+{
+ barf("Encountered overlapping source/destination ranges in a memcpy-using op.");
+}
=====================================
testsuite/tests/codeGen/should_fail/CheckBoundsCompareByteArray2.hs
=====================================
@@ -0,0 +1,19 @@
+{-# LANGUAGE NondecreasingIndentation #-}
+{-# LANGUAGE UnboxedTuples #-}
+{-# LANGUAGE MagicHash #-}
+
+module Main where
+
+import GHC.Exts
+import GHC.IO
+
+main :: IO ()
+main = do
+ IO $ \s0 ->
+ case newByteArray# 4# s0 of
+ (# s1, a_marr #) -> case newByteArray# 4# s1 of
+ (# s2, b_marr #) -> case unsafeFreezeByteArray# a_marr s2 of
+ (# s3, a_arr #) -> case unsafeFreezeByteArray# b_marr s2 of
+ (# s4, b_arr #) -> case compareByteArrays# a_arr (-1#) b_arr 0# 4# of
+ 0# -> (# s4, () #)
+
=====================================
testsuite/tests/codeGen/should_fail/CheckBoundsCompareByteArray3.hs
=====================================
@@ -0,0 +1,19 @@
+{-# LANGUAGE NondecreasingIndentation #-}
+{-# LANGUAGE UnboxedTuples #-}
+{-# LANGUAGE MagicHash #-}
+
+module Main where
+
+import GHC.Exts
+import GHC.IO
+
+main :: IO ()
+main = do
+ IO $ \s0 ->
+ case newByteArray# 4# s0 of
+ (# s1, a_marr #) -> case newByteArray# 4# s1 of
+ (# s2, b_marr #) -> case unsafeFreezeByteArray# a_marr s2 of
+ (# s3, a_arr #) -> case unsafeFreezeByteArray# b_marr s2 of
+ (# s4, b_arr #) -> case compareByteArrays# a_arr 2# b_arr 3# (-1#) of
+ 0# -> (# s4, () #)
+
=====================================
testsuite/tests/codeGen/should_fail/CheckBoundsReadInt64Array.hs
=====================================
@@ -0,0 +1,16 @@
+{-# LANGUAGE UnboxedTuples #-}
+{-# LANGUAGE MagicHash #-}
+
+module Main where
+
+import GHC.Exts
+import GHC.IO
+
+main :: IO ()
+main = do
+ IO $ \s0 ->
+ case newByteArray# 23# s0 of
+ (# s1, marr #) ->
+ case readInt64Array# marr 2# s1 of
+ (# s2, _n #) -> (# s2, () #)
+
=====================================
testsuite/tests/codeGen/should_fail/CheckBoundsReadSmallArray.hs
=====================================
@@ -0,0 +1,14 @@
+{-# LANGUAGE UnboxedTuples #-}
+{-# LANGUAGE MagicHash #-}
+
+module Main where
+
+import GHC.Exts
+import GHC.IO
+
+main :: IO ()
+main = do
+ IO $ \s0 ->
+ case newSmallArray# 5# () s0 of
+ (# s1, marr #) -> readSmallArray# marr (-1#) s1
+
=====================================
testsuite/tests/codeGen/should_fail/CheckBoundsReadWord64Array.hs
=====================================
@@ -0,0 +1,16 @@
+{-# LANGUAGE UnboxedTuples #-}
+{-# LANGUAGE MagicHash #-}
+
+module Main where
+
+import GHC.Exts
+import GHC.IO
+
+main :: IO ()
+main = do
+ IO $ \s0 ->
+ case newByteArray# 23# s0 of
+ (# s1, marr #) ->
+ case readWord64Array# marr (-1#) s1 of
+ (# s2, _n #) -> (# s2, () #)
+
=====================================
testsuite/tests/codeGen/should_fail/CheckBoundsReadWord8ArrayAsWord32.hs
=====================================
@@ -0,0 +1,17 @@
+{-# LANGUAGE UnboxedTuples #-}
+{-# LANGUAGE MagicHash #-}
+
+module Main where
+
+import GHC.Exts
+import GHC.IO
+
+main :: IO ()
+main = do
+ IO $ \s0 ->
+ case newByteArray# 7# s0 of
+ (# s1, marr #) ->
+ case readWord8ArrayAsWord32# marr (-3#) s1 of
+ -- only the last byte of the desired word32 is in bounds
+ (# s2, _n #) -> (# s2, () #)
+
=====================================
testsuite/tests/codeGen/should_fail/CheckOverlapCopyAddrToByteArray.hs
=====================================
@@ -0,0 +1,14 @@
+{-# LANGUAGE UnboxedTuples #-}
+{-# LANGUAGE MagicHash #-}
+
+module Main where
+
+import GHC.Exts
+import GHC.IO
+
+main :: IO ()
+main = do
+ IO $ \s0 ->
+ case newPinnedByteArray# 7# s0 of
+ (# s1, marr #) -> case mutableByteArrayContents# marr of
+ ptr -> (# copyAddrToByteArray# ptr marr 3# 4# s1, () #)
=====================================
testsuite/tests/codeGen/should_fail/CheckOverlapCopyByteArray.hs
=====================================
@@ -0,0 +1,15 @@
+{-# LANGUAGE UnboxedTuples #-}
+{-# LANGUAGE MagicHash #-}
+
+module Main where
+
+import GHC.Exts
+import GHC.IO
+
+main :: IO ()
+main = do
+ IO $ \s0 ->
+ case newByteArray# 7# s0 of
+ (# s1, marr #) -> case unsafeFreezeByteArray# marr s1 of
+ (# s2, arr #) -> (# copyByteArray# arr 3# marr 0# 4# s2, () #)
+
=====================================
testsuite/tests/codeGen/should_fail/all.T
=====================================
@@ -10,10 +10,18 @@ def check_bounds_test(name):
[ignore_stderr, exit_code(127 if opsys('mingw32') else 134)],
compile_and_run, ['-fcheck-prim-bounds'])
-check_bounds_test('CheckBoundsWriteArray')
-check_bounds_test('CheckBoundsIndexArray')
+check_bounds_test('CheckBoundsWriteArray') # Check past end
+check_bounds_test('CheckBoundsIndexArray') # Check past end
+check_bounds_test('CheckBoundsReadSmallArray') # Check before start
check_bounds_test('CheckBoundsReadInt8Array')
-check_bounds_test('CheckBoundsReadWord8ArrayAsInt32')
+check_bounds_test('CheckBoundsReadInt64Array') # read past end
+check_bounds_test('CheckBoundsReadWord64Array') # read before start
+check_bounds_test('CheckBoundsReadWord8ArrayAsInt32') # Check last byte
+check_bounds_test('CheckBoundsReadWord8ArrayAsWord32') # Check first byte
check_bounds_test('CheckBoundsCopyByteArray')
-check_bounds_test('CheckBoundsCompareByteArray')
+check_bounds_test('CheckBoundsCompareByteArray') # Check last byte, 2nd array
+check_bounds_test('CheckBoundsCompareByteArray2') # Check first byte, 1st array
+check_bounds_test('CheckBoundsCompareByteArray3') # Check negative length
+check_bounds_test('CheckOverlapCopyByteArray')
+check_bounds_test('CheckOverlapCopyAddrToByteArray')
=====================================
testsuite/tests/codeGen/should_run/CheckBoundsOK.hs
=====================================
@@ -0,0 +1,244 @@
+-- This test verifies that correct (not out-of-bounds) uses
+-- of primops that we can bounds-check with -fcheck-prim-bounds
+-- do not cause spurious bounds-checking failures.
+
+-- Currently this tests most ByteArray#, Array#, and SmallArray# operations.
+-- (Theoretically it could also test Addr# operations,
+-- since those /can/ be bounds-checked with the JS back-end.)
+
+{-# LANGUAGE CPP #-}
+
+{-# LANGUAGE GHC2021 #-}
+{-# LANGUAGE MagicHash #-}
+{-# LANGUAGE UnboxedTuples #-}
+
+module Main where
+
+import Data.Array.Byte
+import Data.Bits
+import Control.Monad
+import GHC.Exts
+import GHC.IO
+import GHC.Word
+import GHC.Int
+import GHC.Float
+import GHC.Stable
+import System.IO
+
+#define TEST_READ_WRITE(CONDITION, READ_OP, INDEX_OP, WRITE_OP) \
+ when (CONDITION) $ IO $ \s0 -> \
+ case (READ_OP) arrU# i# s0 of \
+ (# s1, v# #) -> case (WRITE_OP) arrP# i# v# s1 of \
+ s2 -> (# (WRITE_OP) arrU# i# ((INDEX_OP) arrF# i#) s2, () #)
+
+#define ALIGNED_RW(WIDTH, READ_OP, INDEX_OP, WRITE_OP) \
+ TEST_READ_WRITE(i < size `div` (WIDTH), READ_OP, INDEX_OP, WRITE_OP)
+
+#define UNALIGNED_RW(WIDTH, READ_OP, INDEX_OP, WRITE_OP) \
+ TEST_READ_WRITE(i + (WIDTH) <= size, READ_OP, INDEX_OP, WRITE_OP)
+
+#define TEST_CAS(WIDTH, CON, CAS_OP) \
+ when (i < size `div` (WIDTH)) $ IO $ \s0 -> \
+ case (0, 7) of \
+ (CON v0, CON v7) -> case (CAS_OP) arrU# i# v0 v7 s0 of \
+ (# s1, v' #) -> (# s1, () #)
+
+
+wrapEffect :: (State# RealWorld -> State# RealWorld) -> IO ()
+wrapEffect eff = IO (\s0 -> (# eff s0, () #))
+
+
+testByteArraysOfSize :: Int -> IO ()
+testByteArraysOfSize (size@(I# size#)) = do
+ let mkArr op = IO $ \s0 -> case op size# s0 of
+ (# s1, newArr #)
+ -> (# setByteArray# newArr 0# size# 123# s1,
+ MutableByteArray newArr #)
+ MutableByteArray arrU# <- mkArr newByteArray#
+ MutableByteArray arrP# <- mkArr newPinnedByteArray#
+ ByteArray arrF# <- do
+ MutableByteArray arrToFreeze <- mkArr newByteArray#
+ IO $ \s0 -> case unsafeFreezeByteArray# arrToFreeze s0 of
+ (# s1, frozenArr #) -> (# s1, ByteArray frozenArr #)
+ let !nws = finiteBitSize (0 :: Int) `div` 8
+ !bufP = mutableByteArrayContents# arrP#
+
+
+ forM_ [0..size] $ \i@(I# i#) -> do
+ -- test valid aligned read/write ops
+ -- (expressed via CPP macro because of non-uniform representations)
+ ALIGNED_RW(1, readWord8Array#, indexWord8Array#, writeWord8Array#)
+ ALIGNED_RW(2, readWord16Array#, indexWord16Array#, writeWord16Array#)
+ ALIGNED_RW(4, readWord32Array#, indexWord32Array#, writeWord32Array#)
+ ALIGNED_RW(8, readWord64Array#, indexWord64Array#, writeWord64Array#)
+ ALIGNED_RW(nws, readWordArray#, indexWordArray#, writeWordArray#)
+
+ ALIGNED_RW(1, readInt8Array#, indexInt8Array#, writeInt8Array#)
+ ALIGNED_RW(2, readInt16Array#, indexInt16Array#, writeInt16Array#)
+ ALIGNED_RW(4, readInt32Array#, indexInt32Array#, writeInt32Array#)
+ ALIGNED_RW(8, readInt64Array#, indexInt64Array#, writeInt64Array#)
+ ALIGNED_RW(nws, readIntArray#, indexIntArray#, writeIntArray#)
+
+ ALIGNED_RW(4, readFloatArray#, indexFloatArray#, writeFloatArray#)
+ ALIGNED_RW(8, readDoubleArray#, indexDoubleArray#, writeDoubleArray#)
+
+ ALIGNED_RW(1, readCharArray#, indexCharArray#, writeCharArray#)
+ ALIGNED_RW(4, readWideCharArray#, indexWideCharArray#, writeWideCharArray#)
+
+ -- TODO: What is the right condition is for Addr# with the JS backend?
+ ALIGNED_RW(nws, readAddrArray#, indexAddrArray#, writeAddrArray#)
+ ALIGNED_RW(nws, readStablePtrArray#, indexStablePtrArray#, writeStablePtrArray#)
+
+
+ -- test valid unaligned read/write ops
+ -- (expressed via CPP macro because of non-uniform representations)
+ -- no primops for unaligned word8 access
+ UNALIGNED_RW(2, readWord8ArrayAsWord16#, indexWord8ArrayAsWord16#, writeWord8ArrayAsWord16#)
+ UNALIGNED_RW(4, readWord8ArrayAsWord32#, indexWord8ArrayAsWord32#, writeWord8ArrayAsWord32#)
+ UNALIGNED_RW(8, readWord8ArrayAsWord64#, indexWord8ArrayAsWord64#, writeWord8ArrayAsWord64#)
+ UNALIGNED_RW(nws, readWord8ArrayAsWord#, indexWord8ArrayAsWord#, writeWord8ArrayAsWord#)
+
+ -- no primops for unaligned int8 access
+ UNALIGNED_RW(2, readWord8ArrayAsInt16#, indexWord8ArrayAsInt16#, writeWord8ArrayAsInt16#)
+ UNALIGNED_RW(4, readWord8ArrayAsInt32#, indexWord8ArrayAsInt32#, writeWord8ArrayAsInt32#)
+ UNALIGNED_RW(8, readWord8ArrayAsInt64#, indexWord8ArrayAsInt64#, writeWord8ArrayAsInt64#)
+ UNALIGNED_RW(nws, readWord8ArrayAsInt#, indexWord8ArrayAsInt#, writeWord8ArrayAsInt#)
+
+ UNALIGNED_RW(4, readWord8ArrayAsFloat#, indexWord8ArrayAsFloat#, writeWord8ArrayAsFloat#)
+ UNALIGNED_RW(8, readWord8ArrayAsDouble#, indexWord8ArrayAsDouble#, writeWord8ArrayAsDouble#)
+
+ UNALIGNED_RW(1, readWord8ArrayAsChar#, indexWord8ArrayAsChar#, writeWord8ArrayAsChar#)
+ UNALIGNED_RW(4, readWord8ArrayAsWideChar#, indexWord8ArrayAsWideChar#, writeWord8ArrayAsWideChar#)
+
+ -- TODO: What is the right condition is for Addr# with the JS backend?
+ UNALIGNED_RW(nws, readWord8ArrayAsAddr#, indexWord8ArrayAsAddr#, writeWord8ArrayAsAddr#)
+ UNALIGNED_RW(nws, readWord8ArrayAsStablePtr#, indexWord8ArrayAsStablePtr#, writeWord8ArrayAsStablePtr#)
+
+
+ when (i < size `div` nws) $ do
+ let testFetchModify :: (MutableByteArray# RealWorld -> Int# -> Int#
+ -> State# RealWorld -> (# State# RealWorld, Int# #))
+ -> IO ()
+ testFetchModify op
+ = IO (\s -> case op arrU# i# 137# s of (# s', _ #) -> (# s', () #) )
+ testFetchModify fetchXorIntArray#
+ testFetchModify fetchOrIntArray#
+ testFetchModify fetchNandIntArray#
+ testFetchModify fetchAndIntArray#
+ testFetchModify fetchSubIntArray#
+ testFetchModify fetchAddIntArray#
+
+ IO $ \s0 -> case atomicReadIntArray# arrU# i# s0 of
+ (# s1, v #) -> (# atomicWriteIntArray# arrP# i# v s1, () #)
+
+
+ TEST_CAS(8, I64#, casInt64Array#)
+ TEST_CAS(4, I32#, casInt32Array#)
+ TEST_CAS(2, I16#, casInt16Array#)
+ TEST_CAS(1, I8# , casInt8Array#)
+ TEST_CAS(nws, I#, casIntArray#)
+
+
+ -- test valid range ops
+ forM_ [0..size] $ \rangeLen@(I# rangeLen#) -> do
+ let ixs | rangeLen == 0 = [-4 .. size + 4] -- empty ranges are not out-of-bounds
+ | otherwise = [0 .. size - rangeLen]
+ forM_ ixs $ \i@(I# i#) -> do
+ wrapEffect (setByteArray# arrU# i# rangeLen# 234#)
+ forM_ ixs $ \j@(I# j#) -> do
+ wrapEffect (copyMutableByteArrayNonOverlapping# arrP# i# arrU# j# rangeLen#)
+ wrapEffect (copyByteArray# arrF# i# arrU# j# rangeLen#)
+ wrapEffect (copyMutableByteArray# arrU# i# arrP# j# rangeLen#)
+ wrapEffect (copyMutableByteArray# arrU# i# arrU# j# rangeLen#)
+ case compareByteArrays# arrF# i# arrF# j# rangeLen# of
+ v -> wrapEffect (setByteArray# arrP# j# rangeLen# (v `andI#` 255#))
+ let !rangeP = bufP `plusAddr#` j#
+ wrapEffect (copyAddrToByteArray# rangeP arrU# i# rangeLen#)
+ wrapEffect (copyMutableByteArrayToAddr# arrU# i# rangeP rangeLen#)
+ wrapEffect (copyByteArrayToAddr# arrF# i# rangeP rangeLen#)
+ when (abs (i - j) >= rangeLen) $
+ wrapEffect (copyMutableByteArrayNonOverlapping# arrU# i# arrU# j# rangeLen#)
+
+
+
+data Array a = Array (Array# a)
+data MutableArray s a = MutableArray (MutableArray# s a)
+data SmallArray a = SmallArray (SmallArray# a)
+data SmallMutableArray s a = SmallMutableArray (SmallMutableArray# s a)
+
+
+testArraysOfSize :: Int -> IO ()
+testArraysOfSize (size@(I# size#)) = do
+ let mkArr v = IO $ \s0 -> case newArray# size# v s0 of
+ (# s1, newArr #) -> (# s1, MutableArray newArr #)
+ MutableArray arrM# <- mkArr 0
+ Array arrF# <- do
+ MutableArray arrToFreeze <- mkArr 0
+ forM_ [0 .. size - 1] $ \(i@(I# i#)) -> do
+ wrapEffect (writeArray# arrM# i# i)
+ wrapEffect (writeArray# arrToFreeze i# i)
+
+ IO $ \s0 -> case unsafeFreezeArray# arrToFreeze s0 of
+ (# s1, frozenArr #) -> (# s1, Array frozenArr #)
+
+ forM_ [0 .. size - 1] $ \(i@(I# i#)) -> do
+
+ -- test read/index/write
+ IO $ \s0 -> case readArray# arrM# i# s0 of
+ (# s1, vm #) -> case indexArray# arrF# i# of
+ (# vf #) -> (# writeArray# arrM# i# (vm + vf) s1, () #)
+
+ -- test casArray
+ IO $ \s0 -> case casArray# arrM# i# 0 7 s0 of
+ (# s1, _, _ #) -> (# s1, () #)
+
+ -- test valid range ops
+ forM_ [0..size] $ \rangeLen@(I# rangeLen#) -> do
+ let ixs | rangeLen == 0 = [-4 .. size + 4] -- empty ranges are not out-of-bounds
+ | otherwise = [0 .. size - rangeLen]
+ forM_ ixs $ \(i@(I# i#)) -> do
+ forM_ ixs $ \(j@(I# j#)) -> do
+ wrapEffect (copyArray# arrF# i# arrM# j# rangeLen#)
+ wrapEffect (copyMutableArray# arrM# i# arrM# j# rangeLen#)
+
+
+testSmallArraysOfSize :: Int -> IO ()
+testSmallArraysOfSize (size@(I# size#)) = do
+ let mkArr v = IO $ \s0 -> case newSmallArray# size# v s0 of
+ (# s1, newArr #) -> (# s1, SmallMutableArray newArr #)
+ SmallMutableArray arrM# <- mkArr 0
+ SmallArray arrF# <- do
+ SmallMutableArray arrToFreeze <- mkArr 0
+ forM_ [0 .. size - 1] $ \(i@(I# i#)) -> do
+ wrapEffect (writeSmallArray# arrM# i# i)
+ wrapEffect (writeSmallArray# arrToFreeze i# i)
+
+ IO $ \s0 -> case unsafeFreezeSmallArray# arrToFreeze s0 of
+ (# s1, frozenArr #) -> (# s1, SmallArray frozenArr #)
+
+ forM_ [0 .. size - 1] $ \(i@(I# i#)) -> do
+
+ -- test read/index/write
+ IO $ \s0 -> case readSmallArray# arrM# i# s0 of
+ (# s1, vm #) -> case indexSmallArray# arrF# i# of
+ (# vf #) -> (# writeSmallArray# arrM# i# (vm + vf) s1, () #)
+
+ -- test casSmallArray
+ IO $ \s0 -> case casSmallArray# arrM# i# 0 7 s0 of
+ (# s1, _, _ #) -> (# s1, () #)
+
+ -- test valid range ops
+ forM_ [0..size] $ \rangeLen@(I# rangeLen#) -> do
+ let ixs | rangeLen == 0 = [-4 .. size + 4] -- empty ranges are not out-of-bounds
+ | otherwise = [0 .. size - rangeLen]
+ forM_ ixs $ \(i@(I# i#)) -> do
+ forM_ ixs $ \(j@(I# j#)) -> do
+ wrapEffect (copySmallArray# arrF# i# arrM# j# rangeLen#)
+ wrapEffect (copySmallMutableArray# arrM# i# arrM# j# rangeLen#)
+
+
+main :: IO ()
+main = forM_ ([0..4] ++ [24..32]) $ \size -> do
+ testByteArraysOfSize size
+ testArraysOfSize size
+ testSmallArraysOfSize size
=====================================
testsuite/tests/codeGen/should_run/all.T
=====================================
@@ -229,3 +229,4 @@ test('T20640b', normal, compile_and_run, [''])
test('T22296',[only_ways(llvm_ways)
,unless(arch('x86_64'), skip)],compile_and_run,[''])
test('T22798', normal, compile_and_run, ['-fregs-graph'])
+test('CheckBoundsOK', js_broken(23123), compile_and_run, ['-fcheck-prim-bounds'])
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/220a7a48cabdcfd2ef7bf5dbe3fd6df99e8d3c5b...f7da530c80c0117d5684bb52481e4a40d7e724cc
--
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/220a7a48cabdcfd2ef7bf5dbe3fd6df99e8d3c5b...f7da530c80c0117d5684bb52481e4a40d7e724cc
You're receiving this email because of your account on gitlab.haskell.org.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20230404/13689535/attachment-0001.html>
More information about the ghc-commits
mailing list