[Git][ghc/ghc][wip/marge_bot_batch_merge_job] 6 commits: rts/EventLog: Place eliminate duplicate strlens

Marge Bot (@marge-bot) gitlab at gitlab.haskell.org
Wed Feb 14 21:00:09 UTC 2024



Marge Bot pushed to branch wip/marge_bot_batch_merge_job at Glasgow Haskell Compiler / GHC


Commits:
325b7613 by Ben Gamari at 2024-02-14T14:27:45-05:00
rts/EventLog: Place eliminate duplicate strlens

Previously many of the `post*` implementations would first compute the
length of the event's strings in order to determine the event length.
Later we would then end up computing the length yet again in
`postString`. Now we instead pass the string length to `postStringLen`,
avoiding the repeated work.

- - - - -
8aafa51c by Ben Gamari at 2024-02-14T14:27:46-05:00
rts/eventlog: Place upper bound on IPE string field lengths

The strings in IPE events may be of unbounded length. Limit the lengths
of these fields to 64k characters to ensure that we don't exceed the
maximum event length.

- - - - -
0e60d52c by Zubin Duggal at 2024-02-14T14:27:46-05:00
rts: drop unused postString function

- - - - -
d8d1333a by Cheng Shao at 2024-02-14T14:28:23-05:00
compiler/rts: fix wasm unreg regression

This commit fixes two wasm unreg regressions caught by a nightly
pipeline:

- Unknown stg_scheduler_loopzh symbol when compiling scheduler.cmm
- Invalid _hs_constructor(101) function name when handling ctor

- - - - -
b401ada7 by Owen Shepherd at 2024-02-14T15:59:55-05:00
feat: Add sortOn to Data.List.NonEmpty

Adds `sortOn` to `Data.List.NonEmpty`, and adds
comments describing when to use it, compared to
`sortWith` or `sortBy . comparing`.

The aim is to smooth out the API between
`Data.List`, and `Data.List.NonEmpty`.

This change has been discussed in the
[clc issue](https://github.com/haskell/core-libraries-committee/issues/227).

- - - - -
1cb6e84d by Fendor at 2024-02-14T16:00:01-05:00
Prefer RdrName over OccName for looking up locations in doc renaming step

Looking up by OccName only does not take into account when functions are
only imported in a qualified way.

Fixes issue #24294

Bump haddock submodule to include regression test

- - - - -


12 changed files:

- compiler/GHC/CmmToC.hs
- compiler/GHC/Rename/Doc.hs
- libraries/base/changelog.md
- libraries/base/src/Data/List/NonEmpty.hs
- libraries/ghc-internal/src/Data/OldList.hs
- rts/eventlog/EventLog.c
- rts/wasm/scheduler.cmm
- testsuite/tests/interface-stability/base-exports.stdout
- testsuite/tests/interface-stability/base-exports.stdout-javascript-unknown-ghcjs
- testsuite/tests/interface-stability/base-exports.stdout-mingw32
- testsuite/tests/interface-stability/base-exports.stdout-ws-32
- utils/haddock


Changes:

=====================================
compiler/GHC/CmmToC.hs
=====================================
@@ -1526,15 +1526,15 @@ pprHexVal platform w rep = parens ctype <> rawlit
 pprCtorArray :: Platform -> InitOrFini -> [CLabel] -> SDoc
 pprCtorArray platform initOrFini lbls =
        decls
-    <> text "static __attribute__((" <> attribute <> text "))"
-    <> text "void _hs_" <> attribute <> text "()"
+    <> text "static __attribute__((" <> text attribute <> text "))"
+    <> text "void _hs_" <> text suffix <> text "()"
     <> braces body
   where
     body = vcat [ pprCLabel platform lbl <> text " ();" | lbl <- lbls ]
     decls = vcat [ text "void" <+> pprCLabel platform lbl <> text " (void);" | lbl <- lbls ]
-    attribute = case initOrFini of
+    (attribute, suffix) = case initOrFini of
                   IsInitArray
                     -- See Note [JSFFI initialization] for details
-                    | ArchWasm32 <- platformArch platform -> text "constructor(101)"
-                    | otherwise -> text "constructor"
-                  IsFiniArray -> text "destructor"
+                    | ArchWasm32 <- platformArch platform -> ("constructor(101)", "constructor")
+                    | otherwise -> ("constructor", "constructor")
+                  IsFiniArray -> ("destructor", "destructor")


=====================================
compiler/GHC/Rename/Doc.hs
=====================================
@@ -40,5 +40,5 @@ rnHsDocIdentifiers :: GlobalRdrEnv
 rnHsDocIdentifiers gre_env ns =
   [ L l $ greName gre
   | L l rdr_name <- ns
-  , gre <- lookupGRE gre_env (LookupOccName (rdrNameOcc rdr_name) AllRelevantGREs)
+  , gre <- lookupGRE gre_env (LookupRdrName rdr_name AllRelevantGREs)
   ]


=====================================
libraries/base/changelog.md
=====================================
@@ -33,6 +33,8 @@
     then the constraint `DataToTag t` can always be solved.
 
     ([CLC proposal #104](https://github.com/haskell/core-libraries-committee/issues/104))
+  * Add `sortOn` to `Data.List.NonEmpty`
+    ([CLC proposal #227](https://github.com/haskell/core-libraries-committee/issues/227))
 
   * Add more instances for `Compose`: `Fractional`, `RealFrac`, `Floating`, `RealFloat` ([CLC proposal #226](https://github.com/haskell/core-libraries-committee/issues/226))
 


=====================================
libraries/base/src/Data/List/NonEmpty.hs
=====================================
@@ -45,6 +45,7 @@ module Data.List.NonEmpty (
    , uncons      -- :: NonEmpty a -> (a, Maybe (NonEmpty a))
    , unfoldr     -- :: (a -> (b, Maybe a)) -> a -> NonEmpty b
    , sort        -- :: Ord a => NonEmpty a -> NonEmpty a
+   , sortOn      -- :: Ord b => (a -> b) -> NonEmpty a -> NonEmpty a
    , reverse     -- :: NonEmpty a -> NonEmpty a
    , inits       -- :: Foldable f => f a -> NonEmpty [a]
    , inits1      -- :: NonEmpty a -> NonEmpty (NonEmpty a)
@@ -196,6 +197,39 @@ cons = (<|)
 sort :: Ord a => NonEmpty a -> NonEmpty a
 sort = lift List.sort
 
+-- | Sort a 'NonEmpty' on a user-supplied projection of its elements.
+-- See 'List.sortOn' for more detailed information.
+--
+-- ==== __Examples__
+--
+-- >>> sortOn fst $ (2, "world") :| [(4, "!"), (1, "Hello")]
+-- (1,"Hello") :| [(2,"world"),(4,"!")]
+--
+-- >>> sortOn length $ "jim" :| ["creed", "pam", "michael", "dwight", "kevin"]
+-- "jim" :| ["pam","creed","kevin","dwight","michael"]
+--
+-- ==== __Performance notes__
+--
+-- This function minimises the projections performed, by materialising
+-- the projections in an intermediate list.
+--
+-- For trivial projections, you should prefer using 'sortBy' with
+-- 'comparing', for example:
+--
+-- >>> sortBy (comparing fst) $ (3, 1) :| [(2, 2), (1, 3)]
+-- (1,3) :| [(2,2),(3,1)]
+--
+-- Or, for the exact same API as 'sortOn', you can use `sortBy . comparing`:
+--
+-- >>> (sortBy . comparing) fst $ (3, 1) :| [(2, 2), (1, 3)]
+-- (1,3) :| [(2,2),(3,1)]
+--
+-- 'sortWith' is an alias for `sortBy . comparing`.
+--
+-- @since 4.20.0.0
+sortOn :: Ord b => (a -> b) -> NonEmpty a -> NonEmpty a
+sortOn f = lift (List.sortOn f)
+
 -- | Converts a normal list to a 'NonEmpty' stream.
 --
 -- Raises an error if given an empty list.


=====================================
libraries/ghc-internal/src/Data/OldList.hs
=====================================
@@ -1798,6 +1798,22 @@ rqpart cmp x (y:ys) rle rgt r =
 -- >>> sortOn length ["jim", "creed", "pam", "michael", "dwight", "kevin"]
 -- ["jim","pam","creed","kevin","dwight","michael"]
 --
+-- ==== __Performance notes__
+--
+-- This function minimises the projections performed, by materialising
+-- the projections in an intermediate list.
+--
+-- For trivial projections, you should prefer using 'sortBy' with
+-- 'comparing', for example:
+--
+-- >>> sortBy (comparing fst) [(3, 1), (2, 2), (1, 3)]
+-- [(1,3),(2,2),(3,1)]
+--
+-- Or, for the exact same API as 'sortOn', you can use `sortBy . comparing`:
+--
+-- >>> (sortBy . comparing) fst [(3, 1), (2, 2), (1, 3)]
+-- [(1,3),(2,2),(3,1)]
+--
 -- @since 4.8.0.0
 sortOn :: Ord b => (a -> b) -> [a] -> [a]
 sortOn f =


=====================================
rts/eventlog/EventLog.c
=====================================
@@ -27,6 +27,8 @@
 #include <unistd.h>
 #endif
 
+#define MIN(x,y) ((x) < (y) ? (x) : (y))
+
 Mutex state_change_mutex;
 bool eventlog_enabled; // protected by state_change_mutex to ensure
                        // serialisation of calls to
@@ -85,6 +87,14 @@ bool eventlog_enabled; // protected by state_change_mutex to ensure
  * case is that we must ensure that the buffers of any disabled capabilities are
  * flushed, lest their events are stuck in limbo. This is achieved with a call to
  * flushLocalEventsBuf in traceCapDisable.
+ *
+ *
+ * Note [Maximum event length]
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ * The maximum length of an eventlog event is determined by the maximum event
+ * buffer size, EVENT_LOG_SIZE. We must ensure that no variable-length event
+ * exceeds this limit. For this reason we impose maximum length limits on
+ * fields which may have unbounded values.
  */
 
 static const EventLogWriter *event_log_writer = NULL;
@@ -93,6 +103,7 @@ static const EventLogWriter *event_log_writer = NULL;
 // eventlog is restarted
 static eventlog_init_func_t *eventlog_header_funcs = NULL;
 
+// See Note [Maximum event length]
 #define EVENT_LOG_SIZE 2 * (1024 * 1024) // 2MB
 
 static int flushCount = 0;
@@ -172,14 +183,13 @@ static inline void postBuf(EventsBuf *eb, const StgWord8 *buf, uint32_t size)
     eb->pos += size;
 }
 
-/* Post a null-terminated string to the event log.
- * It is the caller's responsibility to ensure that there is
- * enough room for strlen(buf)+1 bytes.
+/* Post a null-terminated string up to a given length to the event log. It is
+ * the caller's responsibility to ensure that there is enough room for
+ * len+1 bytes.
  */
-static inline void postString(EventsBuf *eb, const char *buf)
+static inline void postStringLen(EventsBuf *eb, const char *buf, StgWord len)
 {
     if (buf) {
-        const int len = strlen(buf);
         ASSERT(eb->begin + eb->size > eb->pos + len + 1);
         memcpy(eb->pos, buf, len);
         eb->pos += len;
@@ -1228,13 +1238,13 @@ void postHeapProfBegin(StgWord8 profile_id)
     postWord8(&eventBuf, profile_id);
     postWord64(&eventBuf, TimeToNS(flags->heapProfileInterval));
     postWord32(&eventBuf, getHeapProfBreakdown());
-    postString(&eventBuf, flags->modSelector);
-    postString(&eventBuf, flags->descrSelector);
-    postString(&eventBuf, flags->typeSelector);
-    postString(&eventBuf, flags->ccSelector);
-    postString(&eventBuf, flags->ccsSelector);
-    postString(&eventBuf, flags->retainerSelector);
-    postString(&eventBuf, flags->bioSelector);
+    postStringLen(&eventBuf, flags->modSelector, modSelector_len);
+    postStringLen(&eventBuf, flags->descrSelector, descrSelector_len);
+    postStringLen(&eventBuf, flags->typeSelector, typeSelector_len);
+    postStringLen(&eventBuf, flags->ccSelector, ccSelector_len);
+    postStringLen(&eventBuf, flags->ccsSelector, ccsSelector_len);
+    postStringLen(&eventBuf, flags->retainerSelector, retainerSelector_len);
+    postStringLen(&eventBuf, flags->bioSelector, bioSelector_len);
     RELEASE_LOCK(&eventBufMutex);
 }
 
@@ -1279,7 +1289,7 @@ void postHeapProfSampleString(StgWord8 profile_id,
     postPayloadSize(&eventBuf, len);
     postWord8(&eventBuf, profile_id);
     postWord64(&eventBuf, residency);
-    postString(&eventBuf, label);
+    postStringLen(&eventBuf, label, label_len);
     RELEASE_LOCK(&eventBufMutex);
 }
 
@@ -1299,9 +1309,9 @@ void postHeapProfCostCentre(StgWord32 ccID,
     postEventHeader(&eventBuf, EVENT_HEAP_PROF_COST_CENTRE);
     postPayloadSize(&eventBuf, len);
     postWord32(&eventBuf, ccID);
-    postString(&eventBuf, label);
-    postString(&eventBuf, module);
-    postString(&eventBuf, srcloc);
+    postStringLen(&eventBuf, label, label_len);
+    postStringLen(&eventBuf, module, module_len);
+    postStringLen(&eventBuf, srcloc, srcloc_len);
     postWord8(&eventBuf, is_caf);
     RELEASE_LOCK(&eventBufMutex);
 }
@@ -1373,17 +1383,20 @@ void postProfBegin(void)
 #if defined(TICKY_TICKY)
 static void postTickyCounterDef(EventsBuf *eb, StgEntCounter *p)
 {
-    StgWord len = 8 + 2 + strlen(p->arg_kinds)+1 + strlen(p->str)+1 + 8 + strlen(p->ticky_json)+1;
+    StgWord arg_kinds_len = strlen(p->arg_kinds);
+    StgWord str_len = strlen(p->str);
+    StgWord ticky_json_len = strlen(p->ticky_json);
+    StgWord len = 8 + 2 + arg_kinds_len+1 + str_len+1 + 8 + ticky_json_len+1;
     CHECK(!ensureRoomForVariableEvent(eb, len));
     postEventHeader(eb, EVENT_TICKY_COUNTER_DEF);
     postPayloadSize(eb, len);
 
     postWord64(eb, (uint64_t)((uintptr_t) p));
     postWord16(eb, (uint16_t) p->arity);
-    postString(eb, p->arg_kinds);
-    postString(eb, p->str);
+    postStringLen(eb, p->arg_kinds, arg_kinds_len);
+    postStringLen(eb, p->str, str_len);
     postWord64(eb, (W_) (INFO_PTR_TO_STRUCT(p->info)));
-    postString(eb, p->ticky_json);
+    postStringLen(eb, p->ticky_json, ticky_json_len);
 
 }
 
@@ -1428,14 +1441,16 @@ void postTickyCounterSamples(StgEntCounter *counters)
 #endif /* TICKY_TICKY */
 void postIPE(const InfoProvEnt *ipe)
 {
+    // See Note [Maximum event length].
+    const StgWord MAX_IPE_STRING_LEN = 65535;
     ACQUIRE_LOCK(&eventBufMutex);
-    StgWord table_name_len = strlen(ipe->prov.table_name);
-    StgWord closure_desc_len = strlen(ipe->prov.closure_desc);
-    StgWord ty_desc_len = strlen(ipe->prov.ty_desc);
-    StgWord label_len = strlen(ipe->prov.label);
-    StgWord module_len = strlen(ipe->prov.module);
-    StgWord src_file_len = strlen(ipe->prov.src_file);
-    StgWord src_span_len = strlen(ipe->prov.src_span);
+    StgWord table_name_len = MIN(strlen(ipe->prov.table_name), MAX_IPE_STRING_LEN);
+    StgWord closure_desc_len = MIN(strlen(ipe->prov.closure_desc), MAX_IPE_STRING_LEN);
+    StgWord ty_desc_len = MIN(strlen(ipe->prov.ty_desc), MAX_IPE_STRING_LEN);
+    StgWord label_len = MIN(strlen(ipe->prov.label), MAX_IPE_STRING_LEN);
+    StgWord module_len = MIN(strlen(ipe->prov.module), MAX_IPE_STRING_LEN);
+    StgWord src_file_len = MIN(strlen(ipe->prov.src_file), MAX_IPE_STRING_LEN);
+    StgWord src_span_len = MIN(strlen(ipe->prov.src_span), MAX_IPE_STRING_LEN);
 
     // 8 for the info word
     // 1 null after each string
@@ -1446,17 +1461,17 @@ void postIPE(const InfoProvEnt *ipe)
     postEventHeader(&eventBuf, EVENT_IPE);
     postPayloadSize(&eventBuf, len);
     postWord64(&eventBuf, (StgWord) INFO_PTR_TO_STRUCT(ipe->info));
-    postString(&eventBuf, ipe->prov.table_name);
-    postString(&eventBuf, ipe->prov.closure_desc);
-    postString(&eventBuf, ipe->prov.ty_desc);
-    postString(&eventBuf, ipe->prov.label);
-    postString(&eventBuf, ipe->prov.module);
+    postStringLen(&eventBuf, ipe->prov.table_name, table_name_len);
+    postStringLen(&eventBuf, ipe->prov.closure_desc, closure_desc_len);
+    postStringLen(&eventBuf, ipe->prov.ty_desc, ty_desc_len);
+    postStringLen(&eventBuf, ipe->prov.label, label_len);
+    postStringLen(&eventBuf, ipe->prov.module, module_len);
 
     // Manually construct the location field: "<file>:<span>\0"
     postBuf(&eventBuf, (const StgWord8*) ipe->prov.src_file, src_file_len);
     StgWord8 colon = ':';
     postBuf(&eventBuf, &colon, 1);
-    postString(&eventBuf, ipe->prov.src_span);
+    postStringLen(&eventBuf, ipe->prov.src_span, src_span_len);
 
     RELEASE_LOCK(&eventBufMutex);
 }


=====================================
rts/wasm/scheduler.cmm
=====================================
@@ -127,17 +127,6 @@ section "data" {
   stg_scheduler_loop_tid: I64 0 :: I64;
 }
 
-// After creating a new thread with only a stop frame on the stack,
-// push a stg_scheduler_loop frame to make it a scheduler thread. We
-// could omit this and use C FFI to export a Haskell function that
-// invokes the scheduler loop via a foreign import prim, but that is
-// of course less efficient.
-INFO_TABLE_RET (stg_scheduler_loop, RET_SMALL, W_ info_ptr)
-  return ()
-{
-  jump stg_scheduler_loopzh ();
-}
-
 // This always returns () in R1 at the end. If only run via a foreign
 // import prim, it's fine to not return anything, but when run via a
 // stg_scheduler_loop stack frame, then the stop frame expects a valid
@@ -193,3 +182,14 @@ cleanup:
   I64[stg_scheduler_loop_tid] = 0 :: I64;
   return (ghczmprim_GHCziTupleziPrim_Z0T_closure);
 }
+
+// After creating a new thread with only a stop frame on the stack,
+// push a stg_scheduler_loop frame to make it a scheduler thread. We
+// could omit this and use C FFI to export a Haskell function that
+// invokes the scheduler loop via a foreign import prim, but that is
+// of course less efficient.
+INFO_TABLE_RET (stg_scheduler_loop, RET_SMALL, W_ info_ptr)
+  return ()
+{
+  jump stg_scheduler_loopzh ();
+}


=====================================
testsuite/tests/interface-stability/base-exports.stdout
=====================================
@@ -1423,6 +1423,7 @@ module Data.List.NonEmpty where
   some1 :: forall (f :: * -> *) a. GHC.Base.Alternative f => f a -> f (NonEmpty a)
   sort :: forall a. GHC.Classes.Ord a => NonEmpty a -> NonEmpty a
   sortBy :: forall a. (a -> a -> GHC.Types.Ordering) -> NonEmpty a -> NonEmpty a
+  sortOn :: forall b a. GHC.Classes.Ord b => (a -> b) -> NonEmpty a -> NonEmpty a
   sortWith :: forall o a. GHC.Classes.Ord o => (a -> o) -> NonEmpty a -> NonEmpty a
   span :: forall a. (a -> GHC.Types.Bool) -> NonEmpty a -> ([a], [a])
   splitAt :: forall a. GHC.Types.Int -> NonEmpty a -> ([a], [a])


=====================================
testsuite/tests/interface-stability/base-exports.stdout-javascript-unknown-ghcjs
=====================================
@@ -1423,6 +1423,7 @@ module Data.List.NonEmpty where
   some1 :: forall (f :: * -> *) a. GHC.Base.Alternative f => f a -> f (NonEmpty a)
   sort :: forall a. GHC.Classes.Ord a => NonEmpty a -> NonEmpty a
   sortBy :: forall a. (a -> a -> GHC.Types.Ordering) -> NonEmpty a -> NonEmpty a
+  sortOn :: forall b a. GHC.Classes.Ord b => (a -> b) -> NonEmpty a -> NonEmpty a
   sortWith :: forall o a. GHC.Classes.Ord o => (a -> o) -> NonEmpty a -> NonEmpty a
   span :: forall a. (a -> GHC.Types.Bool) -> NonEmpty a -> ([a], [a])
   splitAt :: forall a. GHC.Types.Int -> NonEmpty a -> ([a], [a])


=====================================
testsuite/tests/interface-stability/base-exports.stdout-mingw32
=====================================
@@ -1423,6 +1423,7 @@ module Data.List.NonEmpty where
   some1 :: forall (f :: * -> *) a. GHC.Base.Alternative f => f a -> f (NonEmpty a)
   sort :: forall a. GHC.Classes.Ord a => NonEmpty a -> NonEmpty a
   sortBy :: forall a. (a -> a -> GHC.Types.Ordering) -> NonEmpty a -> NonEmpty a
+  sortOn :: forall b a. GHC.Classes.Ord b => (a -> b) -> NonEmpty a -> NonEmpty a
   sortWith :: forall o a. GHC.Classes.Ord o => (a -> o) -> NonEmpty a -> NonEmpty a
   span :: forall a. (a -> GHC.Types.Bool) -> NonEmpty a -> ([a], [a])
   splitAt :: forall a. GHC.Types.Int -> NonEmpty a -> ([a], [a])


=====================================
testsuite/tests/interface-stability/base-exports.stdout-ws-32
=====================================
@@ -1423,6 +1423,7 @@ module Data.List.NonEmpty where
   some1 :: forall (f :: * -> *) a. GHC.Base.Alternative f => f a -> f (NonEmpty a)
   sort :: forall a. GHC.Classes.Ord a => NonEmpty a -> NonEmpty a
   sortBy :: forall a. (a -> a -> GHC.Types.Ordering) -> NonEmpty a -> NonEmpty a
+  sortOn :: forall b a. GHC.Classes.Ord b => (a -> b) -> NonEmpty a -> NonEmpty a
   sortWith :: forall o a. GHC.Classes.Ord o => (a -> o) -> NonEmpty a -> NonEmpty a
   span :: forall a. (a -> GHC.Types.Bool) -> NonEmpty a -> ([a], [a])
   splitAt :: forall a. GHC.Types.Int -> NonEmpty a -> ([a], [a])


=====================================
utils/haddock
=====================================
@@ -1 +1 @@
-Subproject commit 65453a58185726aab95289c2da0d9fb27b7ce0af
+Subproject commit 9fcf5cf499102baf9e00986bb8b54b80ec5ffc81



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/25439a86eb11fbb3189d10d25b1080aa9ab32ca2...1cb6e84d92e11e8841eee85d5a58b9e68f24e303

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/25439a86eb11fbb3189d10d25b1080aa9ab32ca2...1cb6e84d92e11e8841eee85d5a58b9e68f24e303
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20240214/43e528c9/attachment-0001.html>


More information about the ghc-commits mailing list