[Git][ghc/ghc][wip/marge_bot_batch_merge_job] 7 commits: Fix bounds-checking buglet in Data.Array.Byte

Marge Bot (@marge-bot) gitlab at gitlab.haskell.org
Wed Dec 7 18:20:04 UTC 2022



Marge Bot pushed to branch wip/marge_bot_batch_merge_job at Glasgow Haskell Compiler / GHC


Commits:
1d5b31af by Matthew Craven at 2022-12-07T13:19:45-05:00
Fix bounds-checking buglet in Data.Array.Byte

...another manifestation of #20851 which
I unfortunately missed in my first pass.

- - - - -
2e3aea73 by Gergő Érdi at 2022-12-07T13:19:50-05:00
Remove copy-pasted definitions of `graphFromEdgedVertices*`

- - - - -
b73a9799 by Gergő Érdi at 2022-12-07T13:19:50-05:00
Add version of `reachableGraph` that avoids loop for cyclic inputs
by building its result connected component by component

Fixes #22512

- - - - -
f75ff67b by Krzysztof Gogolewski at 2022-12-07T13:19:51-05:00
Mark Type.Reflection.Unsafe as Unsafe

This module can be used to construct ill-formed TypeReps, so it should
be Unsafe.

- - - - -
69d54941 by Ian-Woo Kim at 2022-12-07T13:19:53-05:00
Truncate eventlog event for large payload (#20221)

RTS eventlog events for postCapsetVecEvent are truncated if payload
is larger than EVENT_PAYLOAD_SIZE_MAX
Previously, postCapsetVecEvent records eventlog event with payload
of variable size larger than EVENT_PAYLOAD_SIZE_MAX (2^16) without
any validation, resulting in corrupted data.
For example, this happens when a Haskell binary is invoked with very
long command line arguments exceeding 2^16 bytes (see #20221).
Now we check the size of accumulated payload messages incrementally,
and truncate the message just before the payload size exceeds
EVENT_PAYLOAD_SIZE_MAX. RTS will warn the user with a message showing
how many arguments are truncated.

- - - - -
3c48fad8 by Cheng Shao at 2022-12-07T13:19:55-05:00
hadrian: don't add debug info to non-debug ways of rts

Hadrian used to pass -g when building all ways of rts. It makes output
binaries larger (especially so for wasm backend), and isn't needed by
most users out there, so this patch removes that flag. In case the
debug info is desired, we still pass -g3 when building the debug way,
and there's also the debug_info flavour transformer which ensures -g3
is passed for all rts ways.

- - - - -
a0a06026 by Krzysztof Gogolewski at 2022-12-07T13:19:55-05:00
Restore show (typeRep @[]) == "[]"

The Show instance for TypeRep [] has changed in 9.5 to output "List"
because the name of the type constructor changed.
This seems to be accidental and is inconsistent with TypeReps of saturated
lists, which are printed as e.g. "[Int]".
For now, I'm restoring the old behavior; in the future,
maybe we should show TypeReps without puns (List, Tuple, Type).

- - - - -


9 changed files:

- compiler/GHC/Data/Graph/Directed.hs
- hadrian/src/Settings/Packages.hs
- libraries/base/Data/Array/Byte.hs
- libraries/base/Data/Typeable/Internal.hs
- libraries/base/Type/Reflection/Unsafe.hs
- libraries/base/changelog.md
- rts/eventlog/EventLog.c
- testsuite/tests/typecheck/should_run/TestTypeableBinary.hs
- testsuite/tests/typecheck/should_run/TestTypeableBinary.stdout


Changes:

=====================================
compiler/GHC/Data/Graph/Directed.hs
=====================================
@@ -9,11 +9,11 @@ module GHC.Data.Graph.Directed (
         Graph, graphFromEdgedVerticesOrd, graphFromEdgedVerticesUniq,
         graphFromVerticesAndAdjacency,
 
-        SCC(..), Node(..), flattenSCC, flattenSCCs,
+        SCC(..), Node(..), G.flattenSCC, G.flattenSCCs,
         stronglyConnCompG,
         topologicalSortG,
         verticesG, edgesG, hasVertexG,
-        reachableG, reachablesG, transposeG, allReachable, outgoingG,
+        reachableG, reachablesG, transposeG, allReachable, allReachableCyclic, outgoingG,
         emptyG,
 
         findCycle,
@@ -58,7 +58,7 @@ import qualified Data.Map as Map
 import qualified Data.Set as Set
 
 import qualified Data.Graph as G
-import Data.Graph hiding (Graph, Edge, transposeG, reachable)
+import Data.Graph ( Vertex, Bounds, SCC(..) ) -- Used in the underlying representation
 import Data.Tree
 import GHC.Types.Unique
 import GHC.Types.Unique.FM
@@ -291,19 +291,11 @@ We use the order of nodes to normalize the order of edges.
 -}
 
 stronglyConnCompG :: Graph node -> [SCC node]
-stronglyConnCompG graph = decodeSccs graph forest
-  where forest = {-# SCC "Digraph.scc" #-} scc (gr_int_graph graph)
-
-decodeSccs :: Graph node -> Forest Vertex -> [SCC node]
-decodeSccs Graph { gr_int_graph = graph, gr_vertex_to_node = vertex_fn } forest
-  = map decode forest
-  where
-    decode (Node v []) | mentions_itself v = CyclicSCC [vertex_fn v]
-                       | otherwise         = AcyclicSCC (vertex_fn v)
-    decode other = CyclicSCC (dec other [])
-      where dec (Node v ts) vs = vertex_fn v : foldr dec vs ts
-    mentions_itself v = v `elem` (graph ! v)
+stronglyConnCompG graph = decodeSccs graph $ scc (gr_int_graph graph)
 
+decodeSccs :: Graph node -> [SCC Vertex] -> [SCC node]
+decodeSccs Graph { gr_vertex_to_node = vertex_fn }
+  = map (fmap vertex_fn)
 
 -- The following two versions are provided for backwards compatibility:
 -- See Note [Deterministic SCC]
@@ -334,7 +326,7 @@ stronglyConnCompFromEdgedVerticesOrdR
         => [Node key payload]
         -> [SCC (Node key payload)]
 stronglyConnCompFromEdgedVerticesOrdR =
-  stronglyConnCompG . graphFromEdgedVertices reduceNodesIntoVerticesOrd
+  stronglyConnCompG . graphFromEdgedVerticesOrd
 
 -- The "R" interface is used when you expect to apply SCC to
 -- (some of) the result of SCC, so you don't want to lose the dependency info
@@ -345,7 +337,7 @@ stronglyConnCompFromEdgedVerticesUniqR
         => [Node key payload]
         -> [SCC (Node key payload)]
 stronglyConnCompFromEdgedVerticesUniqR =
-  stronglyConnCompG . graphFromEdgedVertices reduceNodesIntoVerticesUniq
+  stronglyConnCompG . graphFromEdgedVerticesUniq
 
 {-
 ************************************************************************
@@ -357,7 +349,7 @@ stronglyConnCompFromEdgedVerticesUniqR =
 
 topologicalSortG :: Graph node -> [node]
 topologicalSortG graph = map (gr_vertex_to_node graph) result
-  where result = {-# SCC "Digraph.topSort" #-} topSort (gr_int_graph graph)
+  where result = {-# SCC "Digraph.topSort" #-} G.topSort (gr_int_graph graph)
 
 reachableG :: Graph node -> node -> [node]
 reachableG graph from = map (gr_vertex_to_node graph) result
@@ -377,22 +369,31 @@ reachablesG graph froms = map (gr_vertex_to_node graph) result
         vs = [ v | Just v <- map (gr_node_to_vertex graph) froms ]
 
 -- | Efficiently construct a map which maps each key to it's set of transitive
--- dependencies.
+-- dependencies. Only works on acyclic input.
 allReachable :: Ord key => Graph node -> (node -> key) -> M.Map key (S.Set key)
-allReachable (Graph g from _) conv =
-  M.fromList [(conv (from v), IS.foldr (\k vs -> conv (from k) `S.insert` vs) S.empty vs)
-             | (v, vs) <- IM.toList int_graph]
+allReachable = all_reachable reachableGraph
+
+-- | Efficiently construct a map which maps each key to it's set of transitive
+-- dependencies. Less efficient than @allReachable@, but works on cyclic input as well.
+allReachableCyclic :: Ord key => Graph node -> (node -> key) -> M.Map key (S.Set key)
+allReachableCyclic = all_reachable reachableGraphCyclic
+
+all_reachable :: Ord key => (IntGraph -> IM.IntMap IS.IntSet) -> Graph node -> (node -> key) -> M.Map key (S.Set key)
+all_reachable int_reachables (Graph g from _) keyOf =
+  M.fromList [(k, IS.foldr (\v' vs -> keyOf (from v') `S.insert` vs) S.empty vs)
+             | (v, vs) <- IM.toList int_graph
+             , let k = keyOf (from v)]
   where
-    int_graph = reachableGraph g
+    int_graph = int_reachables g
 
 hasVertexG :: Graph node -> node -> Bool
 hasVertexG graph node = isJust $ gr_node_to_vertex graph node
 
 verticesG :: Graph node -> [node]
-verticesG graph = map (gr_vertex_to_node graph) $ vertices (gr_int_graph graph)
+verticesG graph = map (gr_vertex_to_node graph) $ G.vertices (gr_int_graph graph)
 
 edgesG :: Graph node -> [Edge node]
-edgesG graph = map (\(v1, v2) -> Edge (v2n v1) (v2n v2)) $ edges (gr_int_graph graph)
+edgesG graph = map (\(v1, v2) -> Edge (v2n v1) (v2n v2)) $ G.edges (gr_int_graph graph)
   where v2n = gr_vertex_to_node graph
 
 transposeG :: Graph node -> Graph node
@@ -452,13 +453,63 @@ preorderF ts         = concatMap flatten ts
 
 -- This generalizes reachable which was found in Data.Graph
 reachable    :: IntGraph -> [Vertex] -> [Vertex]
-reachable g vs = preorderF (dfs g vs)
+reachable g vs = preorderF (G.dfs g vs)
 
 reachableGraph :: IntGraph -> IM.IntMap IS.IntSet
 reachableGraph g = res
   where
     do_one v = IS.unions (IS.fromList (g ! v) : mapMaybe (flip IM.lookup res) (g ! v))
-    res = IM.fromList [(v, do_one v) | v <- vertices g]
+    res = IM.fromList [(v, do_one v) | v <- G.vertices g]
+
+scc :: IntGraph -> [SCC Vertex]
+scc graph = map decode forest
+  where
+    forest = {-# SCC "Digraph.scc" #-} G.scc graph
+
+    decode (Node v []) | mentions_itself v = CyclicSCC [v]
+                       | otherwise         = AcyclicSCC v
+    decode other = CyclicSCC (dec other [])
+      where dec (Node v ts) vs = v : foldr dec vs ts
+    mentions_itself v = v `elem` (graph ! v)
+
+reachableGraphCyclic :: IntGraph -> IM.IntMap IS.IntSet
+reachableGraphCyclic g = foldl' add_one_comp mempty comps
+  where
+    neighboursOf v = g!v
+
+    comps = scc g
+
+    -- To avoid divergence on cyclic input, we build the result
+    -- strongly connected component by component, in topological
+    -- order. For each SCC, we know that:
+    --
+    --   * All vertices in the component can reach all other vertices
+    --     in the component ("local" reachables)
+    --
+    --   * Other reachable vertices ("remote" reachables) must come
+    --     from earlier components, either via direct neighbourhood, or
+    --     transitively from earlier reachability map
+    --
+    -- This allows us to build the extension of the reachability map
+    -- directly, without any self-reference, thereby avoiding a loop.
+    add_one_comp :: IM.IntMap IS.IntSet -> SCC Vertex -> IM.IntMap IS.IntSet
+    add_one_comp earlier (AcyclicSCC v) = IM.insert v all_remotes earlier
+      where
+        earlier_neighbours = neighboursOf v
+        earlier_further = mapMaybe (flip IM.lookup earlier) earlier_neighbours
+        all_remotes = IS.unions (IS.fromList earlier_neighbours : earlier_further)
+    add_one_comp earlier (CyclicSCC vs) = IM.union (IM.fromList [(v, local v `IS.union` all_remotes) | v <- vs]) earlier
+      where
+        all_locals = IS.fromList vs
+        local v = IS.delete v all_locals
+            -- Arguably, for a cyclic SCC we should include each
+            -- vertex in its own reachable set. However, this could
+            -- lead to a lot of extra pain in client code to avoid
+            -- looping when traversing the reachability map.
+        all_neighbours = IS.fromList (concatMap neighboursOf vs)
+        earlier_neighbours = all_neighbours IS.\\ all_locals
+        earlier_further = mapMaybe (flip IM.lookup earlier) (IS.toList earlier_neighbours)
+        all_remotes = IS.unions (earlier_neighbours : earlier_further)
 
 {-
 ************************************************************************
@@ -565,4 +616,4 @@ graphFromVerticesAndAdjacency vertices edges = Graph graph vertex_node (key_vert
         key_vertex_pair (a, b) = (expectJust "graphFromVerticesAndAdjacency" $ key_vertex a,
                                   expectJust "graphFromVerticesAndAdjacency" $ key_vertex b)
         reduced_edges = map key_vertex_pair edges
-        graph = buildG bounds reduced_edges
+        graph = G.buildG bounds reduced_edges


=====================================
hadrian/src/Settings/Packages.hs
=====================================
@@ -315,7 +315,6 @@ rtsPackageArgs = package rts ? do
           -- provide non-inlined alternatives and hence needs the function to
           -- be inlined. See https://github.com/snowleopard/hadrian/issues/90.
           , arg "-O2"
-          , arg "-g"
 
           , arg "-Irts"
           , arg $ "-I" ++ path


=====================================
libraries/base/Data/Array/Byte.hs
=====================================
@@ -101,18 +101,20 @@ byteArrayToList arr = go 0
 -- | Create a 'ByteArray' from a list of a known length. If the length
 --   of the list does not match the given length, this throws an exception.
 byteArrayFromListN :: Int -> [Word8] -> ByteArray
-byteArrayFromListN n ys = runST $ do
+byteArrayFromListN n ys
+  | n >= 0 = runST $ do
     marr <- newByteArray n
     let go !ix [] = if ix == n
           then return ()
-          else error $ "Data.Array.Byte.byteArrayFromListN: list length less than specified size"
+          else errorWithoutStackTrace $ "Data.Array.Byte.byteArrayFromListN: list length less than specified size"
         go !ix (x : xs) = if ix < n
           then do
             writeByteArray marr ix x
             go (ix + 1) xs
-          else error $ "Data.Array.Byte.byteArrayFromListN: list length greater than specified size"
+          else errorWithoutStackTrace $ "Data.Array.Byte.byteArrayFromListN: list length greater than specified size"
     go 0 ys
     unsafeFreezeByteArray marr
+  | otherwise = errorWithoutStackTrace "Data.Array.Byte.ByteArrayFromListN: specified size is negative"
 
 -- | Copy a slice of an immutable byte array to a mutable byte array.
 --


=====================================
libraries/base/Data/Typeable/Internal.hs
=====================================
@@ -836,6 +836,8 @@ instance Show (TypeRep (a :: k)) where
 showTypeable :: Int -> TypeRep (a :: k) -> ShowS
 showTypeable _ TrType = showChar '*'
 showTypeable _ rep
+  | isListTyCon tc, [] <- tys =
+    showString "[]"
   | isListTyCon tc, [ty] <- tys =
     showChar '[' . shows ty . showChar ']'
 


=====================================
libraries/base/Type/Reflection/Unsafe.hs
=====================================
@@ -12,7 +12,7 @@
 -- type representations.
 --
 -----------------------------------------------------------------------------
-{-# LANGUAGE PolyKinds, DataKinds, ScopedTypeVariables #-}
+{-# LANGUAGE PolyKinds, DataKinds, ScopedTypeVariables, Unsafe #-}
 
 module Type.Reflection.Unsafe (
       -- * Type representations


=====================================
libraries/base/changelog.md
=====================================
@@ -56,6 +56,7 @@
     `malloc` for allocation. It avoids the O(n) overhead of maintaining a list
     of individually allocated pointers as well as freeing each one of them when
     freeing a `Pool`. (#14762) (#18338)
+  * `Type.Reflection.Unsafe` is now marked as unsafe.
 
 ## 4.17.0.0 *August 2022*
 


=====================================
rts/eventlog/EventLog.c
=====================================
@@ -754,7 +754,17 @@ void postCapsetVecEvent (EventTypeNum tag,
 
     for (int i = 0; i < argc; i++) {
         // 1 + strlen to account for the trailing \0, used as separator
-        size += 1 + strlen(argv[i]);
+        int increment = 1 + strlen(argv[i]);
+        if (size + increment > EVENT_PAYLOAD_SIZE_MAX) {
+            errorBelch("Event size exceeds EVENT_PAYLOAD_SIZE_MAX, record only %"
+                       FMT_Int " out of %" FMT_Int " args",
+                       (long long) i,
+                       (long long) argc);
+            argc = i;
+            break;
+        } else {
+            size += increment;
+        }
     }
 
     ACQUIRE_LOCK(&eventBufMutex);


=====================================
testsuite/tests/typecheck/should_run/TestTypeableBinary.hs
=====================================
@@ -35,3 +35,4 @@ main = do
     testRoundtrip (typeRep :: TypeRep 5)
     testRoundtrip (typeRep :: TypeRep "hello world")
     testRoundtrip (typeRep :: TypeRep ('Just 5))
+    testRoundtrip (typeRep :: TypeRep [])


=====================================
testsuite/tests/typecheck/should_run/TestTypeableBinary.stdout
=====================================
@@ -13,3 +13,4 @@ good: Int -> Int
 good: 5
 good: "hello world"
 good: 'Just Natural 5
+good: []



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/5025fe44cfa7eac0f9f466bb723b3a7cb5e323e8...a0a060266312f6b5f725efed8b389c7e63c6b469

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/5025fe44cfa7eac0f9f466bb723b3a7cb5e323e8...a0a060266312f6b5f725efed8b389c7e63c6b469
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20221207/6c0c9174/attachment-0001.html>


More information about the ghc-commits mailing list