[Git][ghc/ghc][wip/T17609] 7 commits: rts/linker: Fix relocation overflow in PE linker

Ben Gamari gitlab at gitlab.haskell.org
Tue Nov 10 19:25:08 UTC 2020



Ben Gamari pushed to branch wip/T17609 at Glasgow Haskell Compiler / GHC


Commits:
d445cf05 by Ben Gamari at 2020-11-10T10:26:20-05:00
rts/linker: Fix relocation overflow in PE linker

Previously the overflow check for the IMAGE_REL_AMD64_ADDR32NB
relocation failed to account for the signed nature of the value.
Specifically, the overflow check was:

    uint64_t v;
    v = S + A;
    if (v >> 32) { ... }

However, `v` ultimately needs to fit into 32-bits as a signed value.
Consequently, values `v > 2^31` in fact overflow yet this is not caught
by the existing overflow check.

Here we rewrite the overflow check to rather ensure that
`INT32_MIN <= v <= INT32_MAX`. There is now quite a bit of repetition
between the `IMAGE_REL_AMD64_REL32` and `IMAGE_REL_AMD64_ADDR32` cases
but I am leaving fixing this for future work.

This bug was first noticed by @awson.

Fixes #15808.

- - - - -
4c407f6e by Sylvain Henry at 2020-11-10T10:27:00-05:00
Export SPEC from GHC.Exts (#13681)

- - - - -
7814cd5b by David Eichmann at 2020-11-10T10:27:35-05:00
ghc-heap: expose decoding from heap representation

Co-authored-by: Sven Tennie <sven.tennie at gmail.com>
Co-authored-by: Matthew Pickering <matthewtpickering at gmail.com>
Co-authored-by: Ben Gamari <bgamari.foss at gmail.com>

- - - - -
fa344d33 by Richard Eisenberg at 2020-11-10T10:28:10-05:00
Add test case for #17186.

This got fixed sometime recently; not worth it trying to
figure out which commit.

- - - - -
2e63a0fb by David Eichmann at 2020-11-10T10:28:46-05:00
Add code comments for StgInfoTable and StgStack structs

- - - - -
811ed43f by Ben Gamari at 2020-11-10T14:24:59-05:00
nativeGen: Deduplicate DWARF strings

As noted in #17609, we previously made no attempt to deduplicate
strings. This resulted in unnecessarily long compile times and large
object files. Fix this.

Fixes #17609.

- - - - -
541815b6 by Ben Gamari at 2020-11-10T14:25:00-05:00
Add Note cross-reference for unique tag allocations

- - - - -


14 changed files:

- compiler/GHC/Builtin/Uniques.hs
- compiler/GHC/CmmToAsm/Dwarf.hs
- compiler/GHC/CmmToAsm/Dwarf/Constants.hs
- compiler/GHC/CmmToAsm/Dwarf/Types.hs
- compiler/GHC/Types/Unique.hs
- includes/rts/storage/Closures.h
- includes/rts/storage/Heap.h
- includes/rts/storage/TSO.h
- libraries/base/GHC/Exts.hs
- libraries/ghc-heap/GHC/Exts/Heap.hs
- rts/Heap.c
- rts/linker/PEi386.c
- + testsuite/tests/typecheck/should_compile/T17186.hs
- testsuite/tests/typecheck/should_compile/all.T


Changes:

=====================================
compiler/GHC/Builtin/Uniques.hs
=====================================
@@ -296,6 +296,9 @@ getTupleDataConName boxity n =
 *                                                                      *
 ************************************************************************
 
+Note [Unique tag allocation]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
 Allocation of unique supply characters:
         v,t,u : for renumbering value-, type- and usage- vars.
         B:   builtin


=====================================
compiler/GHC/CmmToAsm/Dwarf.hs
=====================================
@@ -12,6 +12,7 @@ import GHC.Cmm.DebugBlock
 import GHC.Unit.Module
 import GHC.Utils.Outputable
 import GHC.Platform
+import GHC.Types.SrcLoc
 import GHC.Types.Unique
 import GHC.Types.Unique.Supply
 
@@ -46,11 +47,12 @@ dwarfGen config modLoc us blocks = do
   compPath <- getCurrentDirectory
   let lowLabel = dblCLabel $ head procs
       highLabel = mkAsmTempEndLabel $ dblCLabel $ last procs
+      producer = dwarfStringFromString $ cProjectName ++ " " ++ cProjectVersion
       dwarfUnit = DwarfCompileUnit
         { dwChildren = map (procToDwarf config) (map stripBlocks procs)
-        , dwName = fromMaybe "" (ml_hs_file modLoc)
-        , dwCompDir = addTrailingPathSeparator compPath
-        , dwProducer = cProjectName ++ " " ++ cProjectVersion
+        , dwName = dwarfStringFromString $ fromMaybe "" (ml_hs_file modLoc)
+        , dwCompDir = dwarfStringFromString $ addTrailingPathSeparator compPath
+        , dwProducer = producer
         , dwLowLabel = lowLabel
         , dwHighLabel = highLabel
         , dwLineLabel = dwarfLineLabel
@@ -76,6 +78,9 @@ dwarfGen config modLoc us blocks = do
                      , compileUnitFooter platform unitU
                      ]
 
+  -- .debug_str section: Strings
+  let stringsSct = dwarfStringsSection platform (dwarfInfoStrings dwarfUnit)
+
   -- .debug_line section: Generated mainly by the assembler, but we
   -- need to label it
   let lineSct = dwarfLineSection platform $$
@@ -92,7 +97,7 @@ dwarfGen config modLoc us blocks = do
                | otherwise               = [DwarfARange lowLabel highLabel]
   let aranges = dwarfARangesSection platform $$ pprDwarfARanges platform aranges' unitU
 
-  return (infoSct $$ abbrevSct $$ lineSct $$ frameSct $$ aranges, us'')
+  return (infoSct $$ stringsSct $$ abbrevSct $$ lineSct $$ frameSct $$ aranges, us'')
 
 -- | Build an address range entry for one proc.
 -- With split sections, each proc needs its own entry, since they may get
@@ -177,7 +182,7 @@ parent, B.
 procToDwarf :: NCGConfig -> DebugBlock -> DwarfInfo
 procToDwarf config prc
   = DwarfSubprogram { dwChildren = map blockToDwarf (dblBlocks prc)
-                    , dwName     = case dblSourceTick prc of
+                    , dwName     = dwarfStringFromString $ case dblSourceTick prc of
                          Just s at SourceNote{} -> sourceName s
                          _otherwise -> show (dblLabel prc)
                     , dwLabel    = dblCLabel prc
@@ -208,7 +213,13 @@ blockToDwarf blk
       | otherwise                 = Nothing   -- block was optimized out
 
 tickToDwarf :: Tickish () -> [DwarfInfo]
-tickToDwarf  (SourceNote ss _) = [DwarfSrcNote ss]
+tickToDwarf  (SourceNote ss _) =
+  [DwarfSrcNote { dwSpanFile = dwarfStringFromFastString (srcSpanFile ss)
+                , dwSpanStartLine = srcSpanStartLine ss
+                , dwSpanStartCol  = srcSpanStartCol  ss
+                , dwSpanEndLine = srcSpanEndLine ss
+                , dwSpanEndCol  = srcSpanEndCol  ss
+                }]
 tickToDwarf _ = []
 
 -- | Generates the data for the debug frame section, which encodes the


=====================================
compiler/GHC/CmmToAsm/Dwarf/Constants.hs
=====================================
@@ -86,12 +86,14 @@ dW_CHILDREN_no, dW_CHILDREN_yes :: Word8
 dW_CHILDREN_no  = 0
 dW_CHILDREN_yes = 1
 
-dW_FORM_addr, dW_FORM_data2, dW_FORM_data4, dW_FORM_string, dW_FORM_flag,
+dW_FORM_addr, dW_FORM_data2, dW_FORM_data4,
+  dW_FORM_strp,dW_FORM_string, dW_FORM_flag,
   dW_FORM_block1, dW_FORM_ref4, dW_FORM_ref_addr, dW_FORM_flag_present :: Word
 dW_FORM_addr   = 0x01
 dW_FORM_data2  = 0x05
 dW_FORM_data4  = 0x06
 dW_FORM_string = 0x08
+dW_FORM_strp   = 0x0e
 dW_FORM_flag   = 0x0c
 dW_FORM_block1 = 0x0a
 dW_FORM_ref_addr     = 0x10
@@ -145,11 +147,13 @@ dW_OP_call_frame_cfa = 0x9c
 
 -- * Dwarf section declarations
 dwarfInfoSection, dwarfAbbrevSection, dwarfLineSection,
-  dwarfFrameSection, dwarfGhcSection, dwarfARangesSection :: Platform -> SDoc
+  dwarfFrameSection, dwarfStringSection,
+  dwarfGhcSection, dwarfARangesSection :: Platform -> SDoc
 dwarfInfoSection    platform = dwarfSection platform "info"
 dwarfAbbrevSection  platform = dwarfSection platform "abbrev"
 dwarfLineSection    platform = dwarfSection platform "line"
 dwarfFrameSection   platform = dwarfSection platform "frame"
+dwarfStringSection  platform = dwarfSection platform "str"
 dwarfGhcSection     platform = dwarfSection platform "ghc"
 dwarfARangesSection platform = dwarfSection platform "aranges"
 
@@ -165,11 +169,13 @@ dwarfSection platform name =
        -> text "\t.section .debug_" <> text name <> text ",\"dr\""
 
 -- * Dwarf section labels
-dwarfInfoLabel, dwarfAbbrevLabel, dwarfLineLabel, dwarfFrameLabel :: PtrString
+dwarfInfoLabel, dwarfAbbrevLabel, dwarfLineLabel, dwarfFrameLabel,
+  dwarfStringLabel :: PtrString
 dwarfInfoLabel   = sLit ".Lsection_info"
 dwarfAbbrevLabel = sLit ".Lsection_abbrev"
 dwarfLineLabel   = sLit ".Lsection_line"
 dwarfFrameLabel  = sLit ".Lsection_frame"
+dwarfStringLabel = sLit ".Lsection_str"
 
 -- | Mapping of registers to DWARF register numbers
 dwarfRegNo :: Platform -> Reg -> Word8


=====================================
compiler/GHC/CmmToAsm/Dwarf/Types.hs
=====================================
@@ -2,12 +2,19 @@
 {-# LANGUAGE FlexibleInstances #-}
 {-# LANGUAGE FlexibleContexts #-}
 {-# LANGUAGE UndecidableInstances #-}
+{-# LANGUAGE RecordWildCards #-}
 
 module GHC.CmmToAsm.Dwarf.Types
   ( -- * Dwarf information
     DwarfInfo(..)
   , pprDwarfInfo
   , pprAbbrevDecls
+  , dwarfInfoStrings
+    -- * Dwarf Strings section
+  , DwarfString
+  , dwarfStringsSection
+  , dwarfStringFromString
+  , dwarfStringFromFastString
     -- * Dwarf address range table
   , DwarfARange(..)
   , pprDwarfARanges
@@ -32,18 +39,15 @@ import GHC.Prelude
 import GHC.Cmm.DebugBlock
 import GHC.Cmm.CLabel
 import GHC.Cmm.Expr         ( GlobalReg(..) )
-import GHC.Utils.Encoding
 import GHC.Data.FastString
 import GHC.Utils.Outputable
 import GHC.Platform
 import GHC.Types.Unique
+import GHC.Types.Unique.Set
 import GHC.Platform.Reg
-import GHC.Types.SrcLoc
-import GHC.Utils.Misc
 
 import GHC.CmmToAsm.Dwarf.Constants
 
-import qualified Data.ByteString as BS
 import qualified Control.Monad.Trans.State.Strict as S
 import Control.Monad (zipWithM, join)
 import Data.Bits
@@ -53,18 +57,49 @@ import Data.Char
 
 import GHC.Platform.Regs
 
+-- | A string in the DWARF @.debug_str@ section.
+newtype DwarfString = DwarfString FastString
+
+instance Uniquable DwarfString where
+  getUnique (DwarfString fs) = getUnique fs
+
+dwarfStringFromString :: String -> DwarfString
+dwarfStringFromString = dwarfStringFromFastString . fsLit
+
+dwarfStringFromFastString :: FastString -> DwarfString
+dwarfStringFromFastString = DwarfString
+
+dwarfStringSymbol :: DwarfString -> CLabel
+dwarfStringSymbol (DwarfString fs) =
+    mkAsmTempDerivedLabel (mkAsmTempLabel fs) (fsLit "_fstr")
+
+pprDwarfString :: Platform -> DwarfString -> SDoc
+pprDwarfString plat s =
+    sectionOffset plat (pdoc plat $ dwarfStringSymbol s) (ptext dwarfStringLabel)
+
+dwarfStringsSection :: Platform -> UniqSet DwarfString -> SDoc
+dwarfStringsSection platform xs = vcat
+    [ ptext dwarfStringLabel <> colon
+    , dwarfStringSection platform
+    , vcat (map string $ nonDetEltsUniqSet xs)
+    ]
+  where
+    string :: DwarfString -> SDoc
+    string dstr@(DwarfString fstr) =
+      pdoc platform (dwarfStringSymbol dstr) <> colon $$ pprFastString fstr
+
 -- | Individual dwarf records. Each one will be encoded as an entry in
 -- the @.debug_info@ section.
 data DwarfInfo
   = DwarfCompileUnit { dwChildren :: [DwarfInfo]
-                     , dwName :: String
-                     , dwProducer :: String
-                     , dwCompDir :: String
+                     , dwName :: DwarfString
+                     , dwProducer :: DwarfString
+                     , dwCompDir :: DwarfString
                      , dwLowLabel :: CLabel
                      , dwHighLabel :: CLabel
                      , dwLineLabel :: PtrString }
   | DwarfSubprogram { dwChildren :: [DwarfInfo]
-                    , dwName :: String
+                    , dwName :: DwarfString
                     , dwLabel :: CLabel
                     , dwParent :: Maybe CLabel
                       -- ^ label of DIE belonging to the parent tick
@@ -73,9 +108,23 @@ data DwarfInfo
                , dwLabel :: CLabel
                , dwMarker :: Maybe CLabel
                }
-  | DwarfSrcNote { dwSrcSpan :: RealSrcSpan
+  | DwarfSrcNote { dwSpanFile      :: !DwarfString
+                 , dwSpanStartLine :: !Int
+                 , dwSpanStartCol  :: !Int
+                 , dwSpanEndLine   :: !Int
+                 , dwSpanEndCol    :: !Int
                  }
 
+-- | 'DwarfStrings' mentioned by the given 'DwarfInfo'.
+dwarfInfoStrings :: DwarfInfo -> UniqSet DwarfString
+dwarfInfoStrings dwinfo =
+  case dwinfo of
+    DwarfCompileUnit {..} -> mkUniqSet [dwName, dwProducer, dwCompDir] `unionUniqSets` foldMap dwarfInfoStrings dwChildren
+    DwarfSubprogram {..} -> unitUniqSet dwName `unionUniqSets` foldMap dwarfInfoStrings dwChildren
+    DwarfBlock {..} -> foldMap dwarfInfoStrings dwChildren
+    DwarfSrcNote {..} -> unitUniqSet dwSpanFile
+
+
 -- | Abbreviation codes used for encoding above records in the
 -- @.debug_info@ section.
 data DwarfAbbrev
@@ -104,7 +153,7 @@ pprAbbrevDecls platform haveDebugLine =
       -- These are shared between DwAbbrSubprogram and
       -- DwAbbrSubprogramWithParent
       subprogramAttrs =
-           [ (dW_AT_name, dW_FORM_string)
+           [ (dW_AT_name, dW_FORM_strp)
            , (dW_AT_MIPS_linkage_name, dW_FORM_string)
            , (dW_AT_external, dW_FORM_flag)
            , (dW_AT_low_pc, dW_FORM_addr)
@@ -114,10 +163,10 @@ pprAbbrevDecls platform haveDebugLine =
   in dwarfAbbrevSection platform $$
      ptext dwarfAbbrevLabel <> colon $$
      mkAbbrev DwAbbrCompileUnit dW_TAG_compile_unit dW_CHILDREN_yes
-       ([(dW_AT_name,     dW_FORM_string)
-       , (dW_AT_producer, dW_FORM_string)
+       ([(dW_AT_name,     dW_FORM_strp)
+       , (dW_AT_producer, dW_FORM_strp)
        , (dW_AT_language, dW_FORM_data4)
-       , (dW_AT_comp_dir, dW_FORM_string)
+       , (dW_AT_comp_dir, dW_FORM_strp)
        , (dW_AT_use_UTF8, dW_FORM_flag_present)  -- not represented in body
        , (dW_AT_low_pc,   dW_FORM_addr)
        , (dW_AT_high_pc,  dW_FORM_addr)
@@ -138,7 +187,7 @@ pprAbbrevDecls platform haveDebugLine =
        , (dW_AT_high_pc, dW_FORM_addr)
        ] $$
      mkAbbrev DwAbbrGhcSrcNote dW_TAG_ghc_src_note dW_CHILDREN_no
-       [ (dW_AT_ghc_span_file, dW_FORM_string)
+       [ (dW_AT_ghc_span_file, dW_FORM_strp)
        , (dW_AT_ghc_span_start_line, dW_FORM_data4)
        , (dW_AT_ghc_span_start_col, dW_FORM_data2)
        , (dW_AT_ghc_span_end_line, dW_FORM_data4)
@@ -174,10 +223,10 @@ pprDwarfInfoOpen :: Platform -> Bool -> DwarfInfo -> SDoc
 pprDwarfInfoOpen platform haveSrc (DwarfCompileUnit _ name producer compDir lowLabel
                                            highLabel lineLbl) =
   pprAbbrev DwAbbrCompileUnit
-  $$ pprString name
-  $$ pprString producer
+  $$ pprDwarfString platform name
+  $$ pprDwarfString platform producer
   $$ pprData4 dW_LANG_Haskell
-  $$ pprString compDir
+  $$ pprDwarfString platform compDir
   $$ pprWord platform (pdoc platform lowLabel)
   $$ pprWord platform (pdoc platform highLabel)
   $$ if haveSrc
@@ -186,7 +235,7 @@ pprDwarfInfoOpen platform haveSrc (DwarfCompileUnit _ name producer compDir lowL
 pprDwarfInfoOpen platform _ (DwarfSubprogram _ name label parent) =
   pdoc platform (mkAsmTempDieLabel label) <> colon
   $$ pprAbbrev abbrev
-  $$ pprString name
+  $$ pprDwarfString platform name
   $$ pprLabelString platform label
   $$ pprFlag (externallyVisibleCLabel label)
   $$ pprWord platform (pdoc platform label)
@@ -209,13 +258,13 @@ pprDwarfInfoOpen platform _ (DwarfBlock _ label (Just marker)) =
   $$ pprLabelString platform label
   $$ pprWord platform (pdoc platform marker)
   $$ pprWord platform (pdoc platform $ mkAsmTempEndLabel marker)
-pprDwarfInfoOpen _ _ (DwarfSrcNote ss) =
+pprDwarfInfoOpen platform _ (DwarfSrcNote {..}) =
   pprAbbrev DwAbbrGhcSrcNote
-  $$ pprString' (ftext $ srcSpanFile ss)
-  $$ pprData4 (fromIntegral $ srcSpanStartLine ss)
-  $$ pprHalf (fromIntegral $ srcSpanStartCol ss)
-  $$ pprData4 (fromIntegral $ srcSpanEndLine ss)
-  $$ pprHalf (fromIntegral $ srcSpanEndCol ss)
+  $$ pprDwarfString platform dwSpanFile
+  $$ pprData4 (fromIntegral dwSpanStartLine)
+  $$ pprHalf (fromIntegral dwSpanStartCol)
+  $$ pprData4 (fromIntegral dwSpanEndLine)
+  $$ pprHalf (fromIntegral dwSpanEndCol)
 
 -- | Close a DWARF info record with children
 pprDwarfInfoClose :: SDoc
@@ -584,12 +633,8 @@ pprString' :: SDoc -> SDoc
 pprString' str = text "\t.asciz \"" <> str <> char '"'
 
 -- | Generate a string constant. We take care to escape the string.
-pprString :: String -> SDoc
-pprString str
-  = pprString' $ hcat $ map escapeChar $
-    if str `lengthIs` utf8EncodedLength str
-    then str
-    else map (chr . fromIntegral) $ BS.unpack $ bytesFS $ mkFastString str
+pprFastString :: FastString -> SDoc
+pprFastString = pprString' . hcat . map escapeChar . unpackFS
 
 -- | Escape a single non-unicode character
 escapeChar :: Char -> SDoc


=====================================
compiler/GHC/Types/Unique.hs
=====================================
@@ -68,8 +68,11 @@ import Data.Bits
 *                                                                      *
 ************************************************************************
 
-The @Chars@ are ``tag letters'' that identify the @UniqueSupply at .
-Fast comparison is everything on @Uniques@:
+The @Chars@ are ``tag letters'' that identify the @UniqueSupply at . The
+allocation of these is documented in Note [Unique tag allocation] in
+GHC.Builtin.Uniques.
+
+Fast comparison is everything on @Uniques at .
 -}
 
 -- | Unique identifier.


=====================================
includes/rts/storage/Closures.h
=====================================
@@ -63,6 +63,11 @@ typedef struct {
    -------------------------------------------------------------------------- */
 
 typedef struct {
+    // If TABLES_NEXT_TO_CODE is defined, then `info` is offset by
+    // `sizeof(StgInfoTable)` and so points to the `code` field of the
+    // StgInfoTable! You may want to use `get_itbl` to get the pointer to the
+    // start of the info table. See
+    // https://gitlab.haskell.org/ghc/ghc/-/wikis/commentary/rts/storage/heap-objects#tables_next_to_code.
     const StgInfoTable* info;
 #if defined(PROFILING)
     StgProfHeader         prof;


=====================================
includes/rts/storage/Heap.h
=====================================
@@ -16,3 +16,12 @@ void heap_view_closure_ptrs_in_pap_payload(StgClosure *ptrs[], StgWord *nptrs
                         , StgClosure *fun, StgClosure **payload, StgWord size);
 
 StgWord heap_view_closureSize(StgClosure *closure);
+
+/*
+ * Collect the pointers of a closure into the given array. `size` should be
+ * large enough to hold all collected pointers e.g.
+ * `heap_view_closureSize(closure)`. Returns the number of pointers collected.
+ * The caller must ensure that `closure` is not modified (or moved by the GC)
+ * for the duration of the call to `collect_pointers`.
+ */
+StgWord collect_pointers(StgClosure *closure, StgWord size, StgClosure *ptrs[]);


=====================================
includes/rts/storage/TSO.h
=====================================
@@ -242,10 +242,22 @@ typedef struct StgTSO_ {
 
 typedef struct StgStack_ {
     StgHeader  header;
-    StgWord32  stack_size;     // stack size in *words*
+
+    /* Size of the `stack` field in *words*. This is not affected by how much of
+     * the stack space is used, nor if more stack space is linked to by an
+     * UNDERFLOW_FRAME.
+     */
+    StgWord32  stack_size;
+
     StgWord8   dirty;          // non-zero => dirty
     StgWord8   marking;        // non-zero => someone is currently marking the stack
-    StgPtr     sp;             // current stack pointer
+
+    /* Pointer to the "top" of the stack i.e. the most recently written address.
+     * The stack is filled downwards, so the "top" of the stack starts with `sp
+     * = stack + stack_size` and is decremented as the stack fills with data.
+     * See comment on "Invariants" below.
+     */
+    StgPtr     sp;
     StgWord    stack[];
 } StgStack;
 


=====================================
libraries/base/GHC/Exts.hs
=====================================
@@ -72,7 +72,7 @@ module GHC.Exts
         breakpoint, breakpointCond,
 
         -- * Ids with special behaviour
-        inline, noinline, lazy, oneShot,
+        inline, noinline, lazy, oneShot, SPEC (..),
 
         -- * Running 'RealWorld' state thread
         runRW#,


=====================================
libraries/ghc-heap/GHC/Exts/Heap.hs
=====================================
@@ -7,6 +7,9 @@
 {-# LANGUAGE FlexibleInstances #-}
 {-# LANGUAGE TypeInType #-}
 {-# LANGUAGE TypeFamilies #-}
+{-# LANGUAGE ExplicitForAll #-}
+{-# LANGUAGE RankNTypes #-}
+{-# LANGUAGE UnliftedFFITypes #-}
 
 {-|
 Module      :  GHC.Exts.Heap
@@ -25,6 +28,7 @@ module GHC.Exts.Heap (
     , ClosureType(..)
     , PrimType(..)
     , HasHeapRep(getClosureData)
+    , getClosureDataFromHeapRep
 
     -- * Info Table types
     , StgInfoTable(..)
@@ -58,7 +62,7 @@ import GHC.Exts.Heap.Utils
 
 import Control.Monad
 import Data.Bits
-import GHC.Arr
+import Foreign
 import GHC.Exts
 import GHC.Int
 import GHC.Word
@@ -66,13 +70,19 @@ import GHC.Word
 #include "ghcconfig.h"
 
 class HasHeapRep (a :: TYPE rep) where
-    getClosureData :: a -> IO Closure
+
+    -- | Decode a closure to it's heap representation ('GenClosure').
+    getClosureData
+        :: a
+        -- ^ Closure to decode.
+        -> IO Closure
+        -- ^ Heap representation of the closure.
 
 instance HasHeapRep (a :: TYPE 'LiftedRep) where
-    getClosureData = getClosure
+    getClosureData = getClosureDataFromHeapObject
 
 instance HasHeapRep (a :: TYPE 'UnliftedRep) where
-    getClosureData x = getClosure (unsafeCoerce# x)
+    getClosureData x = getClosureDataFromHeapObject (unsafeCoerce# x)
 
 instance Int# ~ a => HasHeapRep (a :: TYPE 'IntRep) where
     getClosureData x = return $
@@ -102,49 +112,84 @@ instance Double# ~ a => HasHeapRep (a :: TYPE 'DoubleRep) where
     getClosureData x = return $
         DoubleClosure { ptipe = PDouble, doubleVal = D# x }
 
--- | This returns the raw representation of the given argument. The second
--- component of the triple is the raw words of the closure on the heap, and the
--- third component is those words that are actually pointers. Once back in the
--- Haskell world, the raw words that hold pointers may be outdated after a
--- garbage collector run, but the corresponding values in 'Box's will still
--- point to the correct value.
-getClosureRaw :: a -> IO (Ptr StgInfoTable, [Word], [Box])
-getClosureRaw x = do
+-- | Get the heap representation of a closure _at this moment_, even if it is
+-- unevaluated or an indirection or other exotic stuff. Beware when passing
+-- something to this function, the same caveats as for
+-- 'GHC.Exts.Heap.Closures.asBox' apply.
+--
+-- For most use cases 'getClosureData' is an easier to use alternative.
+--
+-- Currently TSO and STACK objects will return `UnsupportedClosure`. This is
+-- because it is not memory safe to extract TSO and STACK objects (done via
+-- `unpackClosure#`). Other threads may be mutating those objects and interleave
+-- with reads in `unpackClosure#`. This is particularly problematic with STACKs
+-- where pointer values may be overwritten by non-pointer values as the
+-- corresponding haskell thread runs.
+getClosureDataFromHeapObject
+    :: a
+    -- ^ Heap object to decode.
+    -> IO Closure
+    -- ^ Heap representation of the closure.
+getClosureDataFromHeapObject x = do
     case unpackClosure# x of
--- This is a hack to cover the bootstrap compiler using the old version of
--- 'unpackClosure'. The new 'unpackClosure' return values are not merely
--- a reordering, so using the old version would not work.
-        (# iptr, dat, pointers #) -> do
-            let nelems = (I# (sizeofByteArray# dat)) `div` wORD_SIZE
-                end = fromIntegral nelems - 1
-                rawWds = [W# (indexWordArray# dat i) | I# i <- [0.. end] ]
-                pelems = I# (sizeofArray# pointers)
-                ptrList = amap' Box $ Array 0 (pelems - 1) pelems pointers
-            pure (Ptr iptr, rawWds, ptrList)
-
--- From GHC.Runtime.Heap.Inspect
-amap' :: (t -> b) -> Array Int t -> [b]
-amap' f (Array i0 i _ arr#) = map g [0 .. i - i0]
-    where g (I# i#) = case indexArray# arr# i# of
-                          (# e #) -> f e
-
--- | This function returns a parsed heap representation of the argument _at
--- this moment_, even if it is unevaluated or an indirection or other exotic
--- stuff.  Beware when passing something to this function, the same caveats as
--- for 'asBox' apply.
-getClosure :: a -> IO Closure
-getClosure x = do
-    (iptr, wds, pts) <- getClosureRaw x
-    itbl <- peekItbl iptr
-    -- The remaining words after the header
-    let rawWds = drop (closureTypeHeaderSize (tipe itbl)) wds
-    -- For data args in a pointers then non-pointers closure
-    -- This is incorrect in non pointers-first setups
-    -- not sure if that happens
-        npts = drop (closureTypeHeaderSize (tipe itbl) + length pts) wds
+#if MIN_VERSION_ghc_prim(0,5,3)
+        (# infoTableAddr, heapRep, pointersArray #) -> do
+#else
+        -- This is a hack to cover the bootstrap compiler using the old version
+        -- of 'unpackClosure'. The new 'unpackClosure' return values are not
+        -- merely a reordering, so using the old version would not work.
+        (# infoTableAddr, pointersArray, heapRep #) -> do
+#endif
+            let infoTablePtr = Ptr infoTableAddr
+                ptrList = [case indexArray# pointersArray i of
+                                (# ptr #) -> Box ptr
+                            | I# i <- [0..(I# (sizeofArray# pointersArray)) - 1]
+                            ]
+
+            infoTable <- peekItbl infoTablePtr
+            case tipe infoTable of
+                TSO   -> pure $ UnsupportedClosure infoTable
+                STACK -> pure $ UnsupportedClosure infoTable
+                _ -> getClosureDataFromHeapRep heapRep infoTablePtr ptrList
+
+-- | Convert an unpacked heap object, to a `GenClosure b`. The inputs to this
+-- function can be generated from a heap object using `unpackClosure#`.
+getClosureDataFromHeapRep
+    :: ByteArray#
+    -- ^ Heap representation of the closure as returned by `unpackClosure#`.
+    -- This includes all of the object including the header, info table
+    -- pointer, pointer data, and non-pointer data. The ByteArray# may be
+    -- pinned or unpinned.
+    -> Ptr StgInfoTable
+    -- ^ Pointer to the `StgInfoTable` of the closure, extracted from the heap
+    -- representation. The info table must not be movable by GC i.e. must be in
+    -- pinned or off-heap memory.
+    -> [b]
+    -- ^ Pointers in the payload of the closure, extracted from the heap
+    -- representation as returned by `collect_pointers()` in `Heap.c`. The type
+    -- `b` is some representation of a pointer e.g. `Any` or `Ptr Any`.
+    -> IO (GenClosure b)
+    -- ^ Heap representation of the closure.
+getClosureDataFromHeapRep heapRep infoTablePtr pts = do
+    itbl <- peekItbl infoTablePtr
+    let -- heapRep as a list of words.
+        rawHeapWords :: [Word]
+        rawHeapWords = [W# (indexWordArray# heapRep i) | I# i <- [0.. end] ]
+            where
+            nelems = (I# (sizeofByteArray# heapRep)) `div` wORD_SIZE
+            end = fromIntegral nelems - 1
+
+        -- Just the payload of rawHeapWords (no header).
+        payloadWords :: [Word]
+        payloadWords = drop (closureTypeHeaderSize (tipe itbl)) rawHeapWords
+
+        -- The non-pointer words in the payload. Only valid for closures with a
+        -- "pointers first" layout. Not valid for bit field layout.
+        npts :: [Word]
+        npts = drop (closureTypeHeaderSize (tipe itbl) + length pts) rawHeapWords
     case tipe itbl of
         t | t >= CONSTR && t <= CONSTR_NOCAF -> do
-            (p, m, n) <- dataConNames iptr
+            (p, m, n) <- dataConNames infoTablePtr
             if m == "GHC.ByteCode.Instr" && n == "BreakInfo"
               then pure $ UnsupportedClosure itbl
               else pure $ ConstrClosure itbl pts npts p m n
@@ -164,9 +209,9 @@ getClosure x = do
             unless (length pts >= 1) $
                 fail "Expected at least 1 ptr argument to AP"
             -- We expect at least the arity, n_args, and fun fields
-            unless (length rawWds >= 2) $
+            unless (length payloadWords >= 2) $
                 fail $ "Expected at least 2 raw words to AP"
-            let splitWord = rawWds !! 0
+            let splitWord = payloadWords !! 0
             pure $ APClosure itbl
 #if defined(WORDS_BIGENDIAN)
                 (fromIntegral $ shiftR splitWord (wORD_SIZE_IN_BITS `div` 2))
@@ -181,9 +226,9 @@ getClosure x = do
             unless (length pts >= 1) $
                 fail "Expected at least 1 ptr argument to PAP"
             -- We expect at least the arity, n_args, and fun fields
-            unless (length rawWds >= 2) $
+            unless (length payloadWords >= 2) $
                 fail "Expected at least 2 raw words to PAP"
-            let splitWord = rawWds !! 0
+            let splitWord = payloadWords !! 0
             pure $ PAPClosure itbl
 #if defined(WORDS_BIGENDIAN)
                 (fromIntegral $ shiftR splitWord (wORD_SIZE_IN_BITS `div` 2))
@@ -218,10 +263,10 @@ getClosure x = do
             unless (length pts >= 3) $
                 fail $ "Expected at least 3 ptr argument to BCO, found "
                         ++ show (length pts)
-            unless (length rawWds >= 4) $
+            unless (length payloadWords >= 4) $
                 fail $ "Expected at least 4 words to BCO, found "
-                        ++ show (length rawWds)
-            let splitWord = rawWds !! 3
+                        ++ show (length payloadWords)
+            let splitWord = payloadWords !! 3
             pure $ BCOClosure itbl (pts !! 0) (pts !! 1) (pts !! 2)
 #if defined(WORDS_BIGENDIAN)
                 (fromIntegral $ shiftR splitWord (wORD_SIZE_IN_BITS `div` 2))
@@ -230,27 +275,30 @@ getClosure x = do
                 (fromIntegral splitWord)
                 (fromIntegral $ shiftR splitWord (wORD_SIZE_IN_BITS `div` 2))
 #endif
-                (drop 4 rawWds)
+                (drop 4 payloadWords)
 
         ARR_WORDS -> do
-            unless (length rawWds >= 1) $
+            unless (length payloadWords >= 1) $
                 fail $ "Expected at least 1 words to ARR_WORDS, found "
-                        ++ show (length rawWds)
-            pure $ ArrWordsClosure itbl (head rawWds) (tail rawWds)
+                        ++ show (length payloadWords)
+            pure $ ArrWordsClosure itbl (head payloadWords) (tail payloadWords)
 
         t | t >= MUT_ARR_PTRS_CLEAN && t <= MUT_ARR_PTRS_FROZEN_CLEAN -> do
-            unless (length rawWds >= 2) $
+            unless (length payloadWords >= 2) $
                 fail $ "Expected at least 2 words to MUT_ARR_PTRS_* "
-                        ++ "found " ++ show (length rawWds)
-            pure $ MutArrClosure itbl (rawWds !! 0) (rawWds !! 1) pts
+                        ++ "found " ++ show (length payloadWords)
+            pure $ MutArrClosure itbl (payloadWords !! 0) (payloadWords !! 1) pts
 
         t | t >= SMALL_MUT_ARR_PTRS_CLEAN && t <= SMALL_MUT_ARR_PTRS_FROZEN_CLEAN -> do
-            unless (length rawWds >= 1) $
+            unless (length payloadWords >= 1) $
                 fail $ "Expected at least 1 word to SMALL_MUT_ARR_PTRS_* "
-                        ++ "found " ++ show (length rawWds)
-            pure $ SmallMutArrClosure itbl (rawWds !! 0) pts
+                        ++ "found " ++ show (length payloadWords)
+            pure $ SmallMutArrClosure itbl (payloadWords !! 0) pts
 
-        t | t == MUT_VAR_CLEAN || t == MUT_VAR_DIRTY ->
+        t | t == MUT_VAR_CLEAN || t == MUT_VAR_DIRTY -> do
+            unless (length pts >= 1) $
+                fail $ "Expected at least 1 words to MUT_VAR, found "
+                        ++ show (length pts)
             pure $ MutVarClosure itbl (head pts)
 
         t | t == MVAR_CLEAN || t == MVAR_DIRTY -> do
@@ -260,13 +308,12 @@ getClosure x = do
             pure $ MVarClosure itbl (pts !! 0) (pts !! 1) (pts !! 2)
 
         BLOCKING_QUEUE ->
-            pure $ OtherClosure itbl pts wds
+            pure $ OtherClosure itbl pts rawHeapWords
         --    pure $ BlockingQueueClosure itbl
         --        (pts !! 0) (pts !! 1) (pts !! 2) (pts !! 3)
 
-        --  pure $ OtherClosure itbl pts wds
+        --  pure $ OtherClosure itbl pts rawHeapWords
         --
-
         WEAK ->
             pure $ WeakClosure
                 { info = itbl


=====================================
rts/Heap.c
=====================================
@@ -76,23 +76,12 @@ void heap_view_closure_ptrs_in_pap_payload(StgClosure *ptrs[], StgWord *nptrs
     }
 }
 
-StgMutArrPtrs *heap_view_closurePtrs(Capability *cap, StgClosure *closure) {
-    ASSERT(LOOKS_LIKE_CLOSURE_PTR(closure));
-
-    StgWord size = heap_view_closureSize(closure);
-    StgWord nptrs = 0;
-    StgWord i;
-
-    // First collect all pointers here, with the comfortable memory bound
-    // of the whole closure. Afterwards we know how many pointers are in
-    // the closure and then we can allocate space on the heap and copy them
-    // there
-    StgClosure *ptrs[size];
-
+// See Heap.h
+StgWord collect_pointers(StgClosure *closure, StgWord size, StgClosure *ptrs[size]) {
     StgClosure **end;
-    StgClosure **ptr;
-
     const StgInfoTable *info = get_itbl(closure);
+    StgWord nptrs = 0;
+    StgWord i;
 
     switch (info->type) {
         case INVALID_OBJECT:
@@ -101,6 +90,7 @@ StgMutArrPtrs *heap_view_closurePtrs(Capability *cap, StgClosure *closure) {
 
         // No pointers
         case ARR_WORDS:
+        case STACK:
             break;
 
         // Default layout
@@ -123,7 +113,7 @@ StgMutArrPtrs *heap_view_closurePtrs(Capability *cap, StgClosure *closure) {
         case FUN_0_2:
         case FUN_STATIC:
             end = closure->payload + info->layout.payload.ptrs;
-            for (ptr = closure->payload; ptr < end; ptr++) {
+            for (StgClosure **ptr = closure->payload; ptr < end; ptr++) {
                 ptrs[nptrs++] = *ptr;
             }
             break;
@@ -136,7 +126,7 @@ StgMutArrPtrs *heap_view_closurePtrs(Capability *cap, StgClosure *closure) {
         case THUNK_0_2:
         case THUNK_STATIC:
             end = ((StgThunk *)closure)->payload + info->layout.payload.ptrs;
-            for (ptr = ((StgThunk *)closure)->payload; ptr < end; ptr++) {
+            for (StgClosure **ptr = ((StgThunk *)closure)->payload; ptr < end; ptr++) {
                 ptrs[nptrs++] = *ptr;
             }
             break;
@@ -228,6 +218,21 @@ StgMutArrPtrs *heap_view_closurePtrs(Capability *cap, StgClosure *closure) {
             break;
     }
 
+    return nptrs;
+}
+
+StgMutArrPtrs *heap_view_closurePtrs(Capability *cap, StgClosure *closure) {
+    ASSERT(LOOKS_LIKE_CLOSURE_PTR(closure));
+
+    StgWord size = heap_view_closureSize(closure);
+
+    // First collect all pointers here, with the comfortable memory bound
+    // of the whole closure. Afterwards we know how many pointers are in
+    // the closure and then we can allocate space on the heap and copy them
+    // there
+    StgClosure *ptrs[size];
+    StgWord nptrs = collect_pointers(closure, size, ptrs);
+
     size = nptrs + mutArrPtrsCardTableSize(nptrs);
     StgMutArrPtrs *arr =
         (StgMutArrPtrs *)allocate(cap, sizeofW(StgMutArrPtrs) + size);
@@ -236,7 +241,7 @@ StgMutArrPtrs *heap_view_closurePtrs(Capability *cap, StgClosure *closure) {
     arr->ptrs = nptrs;
     arr->size = size;
 
-    for (i = 0; i<nptrs; i++) {
+    for (StgWord i = 0; i<nptrs; i++) {
         arr->payload[i] = ptrs[i];
     }
 


=====================================
rts/linker/PEi386.c
=====================================
@@ -1952,13 +1952,15 @@ ocResolve_PEi386 ( ObjectCode* oc )
                {
                    uint64_t v;
                    v = S + A;
-                   if (v >> 32) {
+                   // N.B. in the case of the sign-extended relocations we must ensure that v
+                   // fits in a signed 32-bit value. See #15808.
+                   if (((int64_t) v > (int64_t) INT32_MAX) || ((int64_t) v < (int64_t) INT32_MIN)) {
                        copyName (getSymShortName (info, sym), oc,
                                  symbol, sizeof(symbol)-1);
                        S = makeSymbolExtra_PEi386(oc, symIndex, S, (char *)symbol);
                        /* And retry */
                        v = S + A;
-                       if (v >> 32) {
+                       if (((int64_t) v > (int64_t) INT32_MAX) || ((int64_t) v < (int64_t) INT32_MIN)) {
                            barf("IMAGE_REL_AMD64_ADDR32[NB]: High bits are set in %zx for %s",
                                 v, (char *)symbol);
                        }
@@ -1970,14 +1972,14 @@ ocResolve_PEi386 ( ObjectCode* oc )
                {
                    intptr_t v;
                    v = S + (int32_t)A - ((intptr_t)pP) - 4;
-                   if ((v > (intptr_t) INT32_MAX) || (v < (intptr_t) INT32_MIN)) {
+                   if ((v > (int64_t) INT32_MAX) || (v < (int64_t) INT32_MIN)) {
                        /* Make the trampoline then */
                        copyName (getSymShortName (info, sym),
                                  oc, symbol, sizeof(symbol)-1);
                        S = makeSymbolExtra_PEi386(oc, symIndex, S, (char *)symbol);
                        /* And retry */
                        v = S + (int32_t)A - ((intptr_t)pP) - 4;
-                       if ((v > (intptr_t) INT32_MAX) || (v < (intptr_t) INT32_MIN)) {
+                       if ((v > (int64_t) INT32_MAX) || (v < (int64_t) INT32_MIN)) {
                            barf("IMAGE_REL_AMD64_REL32: High bits are set in %zx for %s",
                                 v, (char *)symbol);
                        }


=====================================
testsuite/tests/typecheck/should_compile/T17186.hs
=====================================
@@ -0,0 +1,17 @@
+{-# LANGUAGE DataKinds              #-}
+{-# LANGUAGE FlexibleContexts       #-}
+{-# LANGUAGE PolyKinds              #-}
+{-# LANGUAGE TypeFamilyDependencies #-}
+{-# LANGUAGE TypeOperators, AllowAmbiguousTypes          #-}
+
+module T17186 where
+
+-- This test is significantly abbreviated from what was posted; see
+-- #16512 for more context.
+
+type family Dim v
+
+type family v `OfDim` (n :: Dim v) = r | r -> n
+
+(!*^) :: Dim m `OfDim` j -> Dim m `OfDim` i
+(!*^) = undefined


=====================================
testsuite/tests/typecheck/should_compile/all.T
=====================================
@@ -728,4 +728,4 @@ test('T18831', normal, compile, [''])
 test('T18920', normal, compile, [''])
 test('T15942', normal, compile, [''])
 test('ClassDefaultInHsBoot', [extra_files(['ClassDefaultInHsBootA1.hs','ClassDefaultInHsBootA2.hs','ClassDefaultInHsBootA2.hs-boot','ClassDefaultInHsBootA3.hs'])], multimod_compile, ['ClassDefaultInHsBoot', '-v0'])
-
+test('T17186', normal, compile, [''])



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/ab96a4ad5eec76235d1512d8e64ee5a013862a6a...541815b6393f15ecad603d608eb888dffb30bf8f

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/ab96a4ad5eec76235d1512d8e64ee5a013862a6a...541815b6393f15ecad603d608eb888dffb30bf8f
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20201110/8aa132b6/attachment-0001.html>


More information about the ghc-commits mailing list