[Git][ghc/ghc][wip/backports-9.6] 5 commits: rts: Introduce stgMallocAlignedBytes

Ben Gamari (@bgamari) gitlab at gitlab.haskell.org
Thu Mar 2 17:31:30 UTC 2023



Ben Gamari pushed to branch wip/backports-9.6 at Glasgow Haskell Compiler / GHC


Commits:
72087b1d by Ben Gamari at 2023-03-02T12:31:12-05:00
rts: Introduce stgMallocAlignedBytes

(cherry picked from commit eeb5bd560942a4968980fb341d9ebca33ad3302b)
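
For readers who want the idea without the RTS context, here is a minimal
sketch of a portable aligned allocate/free pair. The names `alloc_aligned`,
`free_aligned` and `is_aligned` are hypothetical and are not the RTS
functions; unlike the RTS version, this sketch returns NULL on failure rather
than calling a failure hook and exiting.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* request posix_memalign from <stdlib.h> */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#if defined(_WIN32)
#include <malloc.h>               /* _aligned_malloc, _aligned_free */
#endif

/* Hypothetical helper: allocate n bytes aligned to `align`, or return NULL. */
void *alloc_aligned(size_t n, size_t align)
{
    /* Both back ends expect a power-of-two alignment. */
    assert(align != 0 && (align & (align - 1)) == 0);
#if defined(_WIN32)
    return _aligned_malloc(n, align);
#else
    void *p = NULL;
    /* posix_memalign also requires align to be a multiple of sizeof(void *).
     * It returns 0 on success and an error number otherwise. */
    if (posix_memalign(&p, align, n) != 0) {
        return NULL;
    }
    return p;
#endif
}

/* Hypothetical helper: release memory obtained from alloc_aligned. */
void free_aligned(void *p)
{
#if defined(_WIN32)
    _aligned_free(p);   /* memory from _aligned_malloc must not go to free() */
#else
    free(p);
#endif
}

/* Returns 1 if the address p is a multiple of align, 0 otherwise. */
int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p % align) == 0;
}
```

A caller would pair the two helpers, for example `void *p =
alloc_aligned(1000, 64);` followed later by `free_aligned(p);`. Keeping a
dedicated free function is what lets the Windows branch use `_aligned_free`.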

- - - - -
ac7bbf64 by Ben Gamari at 2023-03-02T12:31:12-05:00
rts: Correctly align Capability allocations

Previously we failed to tell the C allocator that `Capability`s needed
to be aligned, resulting in #22965.

Fixes #22965.
Fixes #22975.

(cherry picked from commit 2cca72cd3e4de25fa81dc6fcc9979e613697a838)
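
The mismatch described above can be illustrated with a small sketch. It
assumes a structure that declares a 64-byte alignment requirement; `DemoCap`
and the `demo_*` functions are hypothetical stand-ins, not the RTS
definitions. The point is that plain `malloc` only promises alignment
suitable for `max_align_t`, which may be smaller than what the structure's
type asks for.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_ALIGNMENT 64   /* same value as the non-s390x CAPABILITY_ALIGNMENT */

/* Hypothetical stand-in for a structure that wants cache-line alignment. */
typedef struct {
    alignas(DEMO_ALIGNMENT) uint64_t counter;
    char payload[40];
} DemoCap;

/* The size is padded up to a multiple of the alignment. */
static_assert(sizeof(DemoCap) % DEMO_ALIGNMENT == 0,
              "DemoCap size must be a multiple of its alignment");

/* Alignment the compiler is entitled to assume for any DemoCap object. */
size_t demo_required_alignment(void)
{
    return alignof(DemoCap);
}

/* Alignment that plain malloc is required to provide. */
size_t demo_malloc_alignment(void)
{
    return alignof(max_align_t);
}

/* Returns 1 if p is a valid address for a DemoCap, 0 otherwise. */
int demo_address_ok(const void *p)
{
    return ((uintptr_t)p % alignof(DemoCap)) == 0;
}
```

When `demo_malloc_alignment()` is smaller than `demo_required_alignment()`,
a pointer returned by `malloc` may fail `demo_address_ok`, and code compiled
under the assumption of 64-byte alignment is then operating on a misaligned
object.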

- - - - -
4bda8c6c by Ben Gamari at 2023-03-02T12:31:12-05:00
rts: Drop no-alignment special case for Windows

For reasons that aren't clear, we were previously not giving Capability
the same favorable alignment on Windows that we provided on other
platforms. Fix this.

(cherry picked from commit 05c5b14c5e28c279de0d84472526eccb7f05d00a)

- - - - -
cbdc5d51 by Ben Gamari at 2023-03-02T12:31:12-05:00
nativeGen: Disable asm-shortcutting on Darwin

Asm-shortcutting may produce relative references to symbols defined in
other compilation units. This is not something that MachO relocations
support (see #21972). For this reason we disable the optimisation on
Darwin. We do so without a warning since this flag is enabled by `-O2`.

Another way to address this issue would be to implement a PLT-relocatable
jump-table strategy instead. However, this would only benefit Darwin and
does not seem worth the effort.

Closes #21972.

(cherry picked from commit 8bed166bb79445f90015757fd5baac69a7b835df)
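
For context, the optimisation being disabled here redirects jumps that
target a block consisting only of an unconditional jump. The following is a
minimal sketch of that idea on a toy representation, not GHC's
implementation (which is written in Haskell and works on native-code basic
blocks). The array encoding and the names `shortcut_target` and
`shortcut_all` are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* trivial_succ[i] is the successor of block i when block i consists only of
 * an unconditional jump, and -1 otherwise (hypothetical encoding). */
int shortcut_target(const int *trivial_succ, size_t n_blocks, int target)
{
    assert(target >= 0 && (size_t)target < n_blocks);
    /* Follow at most n_blocks hops so that a cycle of jump-only blocks
     * cannot loop forever. */
    for (size_t hops = 0; hops < n_blocks; hops++) {
        int next = trivial_succ[target];
        if (next < 0) {
            break;          /* target does real work: stop here */
        }
        target = next;      /* skip over the jump-only block */
    }
    return target;
}

/* Rewrite every jump destination in `jumps` to its shortcut target. */
void shortcut_all(const int *trivial_succ, size_t n_blocks,
                  int *jumps, size_t n_jumps)
{
    for (size_t i = 0; i < n_jumps; i++) {
        jumps[i] = shortcut_target(trivial_succ, n_blocks, jumps[i]);
    }
}
```

With `trivial_succ = {-1, 2, 3, -1}`, a jump to block 1 is rewritten to
block 3, since blocks 1 and 2 only forward control. In this toy model every
destination is a local index; the commit message's concern is the case where
the rewritten destination is a symbol in another compilation unit.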

- - - - -
fbc98e66 by Ben Gamari at 2023-03-02T12:31:12-05:00
docs/relnotes: Mention -fprefer-byte-code

Closes #23027.

- - - - -


7 changed files:

- compiler/GHC/CmmToAsm.hs
- docs/users_guide/9.6.1-notes.rst
- docs/users_guide/using-optimisation.rst
- rts/Capability.c
- rts/Capability.h
- rts/RtsUtils.c
- rts/RtsUtils.h


Changes:

=====================================
compiler/GHC/CmmToAsm.hs
=====================================
@@ -812,6 +812,19 @@ generateJumpTables ncgImpl xs = concatMap f xs
 -- -----------------------------------------------------------------------------
 -- Shortcut branches
 
+-- Note [No asm-shortcutting on Darwin]
+-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+-- Asm-shortcutting may produce relative references to symbols defined in
+-- other compilation units. This is not something that MachO relocations
+-- support (see #21972). For this reason we disable the optimisation on Darwin.
+-- We do so in the backend without a warning since this flag is enabled by
+-- `-O2`.
+--
+-- Another way to address this issue would be to rather implement a
+-- PLT-relocatable jump-table strategy. However, this would only benefit Darwin
+-- and does not seem worth the effort as this optimisation generally doesn't
+-- offer terribly great benefits.
+
 shortcutBranches
         :: forall statics instr jumpDest. (Outputable jumpDest)
         => NCGConfig
@@ -822,6 +835,8 @@ shortcutBranches
 
 shortcutBranches config ncgImpl tops weights
   | ncgEnableShortcutting config
+    -- See Note [No asm-shortcutting on Darwin]
+  , not $ osMachOTarget $ platformOS $ ncgPlatform config
   = ( map (apply_mapping ncgImpl mapping) tops'
     , shortcutWeightMap mappingBid <$!> weights )
   | otherwise


=====================================
docs/users_guide/9.6.1-notes.rst
=====================================
@@ -150,6 +150,12 @@ Compiler
   on the GHC wiki for the current status, project roadmap, build instructions
   and demos.
 
+- GHC now offers a new flag, :ghc-flag:`-fprefer-byte-code`, which instructs
+  the compiler to use byte-code when available when loading home package
+  modules for execution (e.g. when evaluating TH splices). This avoids the
+  considerable code generation and linking costs of native code, which is often
+  unnecessary for one-off Template Haskell splices.
+
 - The :extension:`TypeInType` is now marked as deprecated. Its meaning has been included
   in :extension:`PolyKinds` and :extension:`DataKinds`.
 


=====================================
docs/users_guide/using-optimisation.rst
=====================================
@@ -262,8 +262,10 @@ by saying ``-fno-wombat``.
     of an unconditional jump, we replace all jumps to A by jumps to the successor
     of A.
 
-    This is mostly done during Cmm passes. However this can miss corner cases. So at -O2
-    we run the pass again at the asm stage to catch these.
+    This is mostly done during Cmm passes. However this can miss corner cases.
+    So at ``-O2`` this flag runs the pass again at the assembly stage to catch
+    these. Note that due to platform limitations (:ghc-ticket:`21972`) this flag
+    does nothing on macOS.
 
 .. ghc-flag:: -fblock-layout-cfg
     :shortdesc: Use the new cfg based block layout algorithm.


=====================================
rts/Capability.c
=====================================
@@ -438,8 +438,9 @@ moreCapabilities (uint32_t from USED_IF_THREADS, uint32_t to USED_IF_THREADS)
     {
         for (uint32_t i = 0; i < to; i++) {
             if (i >= from) {
-                capabilities[i] = stgMallocBytes(sizeof(Capability),
-                                                     "moreCapabilities");
+                capabilities[i] = stgMallocAlignedBytes(sizeof(Capability),
+                                                        CAPABILITY_ALIGNMENT,
+                                                        "moreCapabilities");
                 initCapability(capabilities[i], i);
             }
         }
@@ -1274,7 +1275,7 @@ freeCapabilities (void)
         Capability *cap = getCapability(i);
         freeCapability(cap);
         if (cap != &MainCapability) {
-            stgFree(cap);
+            stgFreeAligned(cap);
         }
     }
 #else


=====================================
rts/Capability.h
=====================================
@@ -32,10 +32,8 @@
 // anything else, so round it up to a cache line size:
 #if defined(s390x_HOST_ARCH)
 #define CAPABILITY_ALIGNMENT 256
-#elif !defined(mingw32_HOST_OS)
-#define CAPABILITY_ALIGNMENT 64
 #else
-#define CAPABILITY_ALIGNMENT 1
+#define CAPABILITY_ALIGNMENT 64
 #endif
 
 /* N.B. This must be consistent with CapabilityPublic in RtsAPI.h */


=====================================
rts/RtsUtils.c
=====================================
@@ -57,9 +57,9 @@ extern char *ctime_r(const time_t *, char *);
 void *
 stgMallocBytes (size_t n, char *msg)
 {
-    void *space;
+    void *space = malloc(n);
 
-    if ((space = malloc(n)) == NULL) {
+    if (space == NULL) {
       /* Quoting POSIX.1-2008 (which says more or less the same as ISO C99):
        *
        *   "Upon successful completion with size not equal to 0, malloc() shall
@@ -128,6 +128,53 @@ stgFree(void* p)
   free(p);
 }
 
+// N.B. Allocations resulting from this function must be freed by
+// `stgFreeAligned`, not `stgFree`. This is necessary because memory obtained
+// from Windows' `_aligned_malloc` must be released with `_aligned_free`.
+void *
+stgMallocAlignedBytes (size_t n, size_t align, char *msg)
+{
+    void *space;
+
+#if defined(mingw32_HOST_OS)
+    space = _aligned_malloc(n, align);
+#else
+    if (posix_memalign(&space, align, n)) {
+        space = NULL; // Allocation failed
+    }
+#endif
+
+    if (space == NULL) {
+      /* Quoting POSIX.1-2008 (which says more or less the same as ISO C99):
+       *
+       *   "Upon successful completion with size not equal to 0, malloc() shall
+       *   return a pointer to the allocated space. If size is 0, either a null
+       *   pointer or a unique pointer that can be successfully passed to free()
+       *   shall be returned. Otherwise, it shall return a null pointer and set
+       *   errno to indicate the error."
+       *
+       * Consequently, a NULL pointer being returned by `malloc()` for a 0-size
+       * allocation is *not* to be considered an error.
+       */
+      if (n == 0) return NULL;
+
+      /* don't fflush(stdout); WORKAROUND bug in Linux glibc */
+      rtsConfig.mallocFailHook((W_) n, msg);
+      stg_exit(EXIT_INTERNAL_ERROR);
+    }
+    IF_DEBUG(zero_on_gc, memset(space, 0xbb, n));
+    return space;
+}
+
+void
+stgFreeAligned (void *p)
+{
+#if defined(mingw32_HOST_OS)
+    _aligned_free(p);
+#else
+    free(p);
+#endif
+}
+
 /* -----------------------------------------------------------------------------
    Stack/heap overflow
    -------------------------------------------------------------------------- */


=====================================
rts/RtsUtils.h
=====================================
@@ -48,6 +48,10 @@ void *stgCallocBytes(size_t count, size_t size, char *msg)
 char *stgStrndup(const char *s, size_t n)
     STG_MALLOC STG_MALLOC1(stgFree);
 
+void *stgMallocAlignedBytes(size_t n, size_t align, char *msg);
+
+void stgFreeAligned(void *p);
+
 /* -----------------------------------------------------------------------------
  * Misc other utilities
  * -------------------------------------------------------------------------- */



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/c7b95eb56719904042c02a3ef84184cea84a3890...fbc98e66077b933b634bf86a8d4a739ef10ea232
