[Git][ghc/ghc][ghc-9.2] 14 commits: rts/m32: Fix bounds check

Zubin (@wz1000) gitlab at gitlab.haskell.org
Thu May 25 11:16:25 UTC 2023



Zubin pushed to branch ghc-9.2 at Glasgow Haskell Compiler / GHC


Commits:
22714407 by Ben Gamari at 2023-05-15T17:43:26+02:00
rts/m32: Fix bounds check

Previously we checked only that the *start* of the mapping was in the
bottom 32 bits of the address space. However, we need the *entire*
mapping to be in low memory. Fix this.

Noticed by @Phyx.

(cherry picked from commit 72c1812feecd2aff2a96b629063ba90a2f4cdb7b)
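
A minimal sketch of the corrected check, assuming a 64-bit address space
(the helper name is hypothetical; the combined effect of this and the
following commits shows up as `is_okay_address` in the rts/linker/M32Alloc.c
hunk further down):

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Does the whole mapping [start, start+len) lie below the 4 GB mark?
     * Both ends are bounded, not just the start. */
    static bool fits_in_low_mem(void *start, size_t len)
    {
        uintptr_t lo = (uintptr_t) start;
        uintptr_t hi = lo + (uintptr_t) len;   /* one past the last byte */
        return hi <= (uintptr_t) 0x100000000ULL;
    }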

- - - - -
12989f38 by Ben Gamari at 2023-05-15T17:43:27+02:00
rts/m32: Accept any address within 4GB of program text

Previously m32 assumed that the program image was located near the
start of the address space and therefore requested pages in the bottom
4GB of the address space. Instead we now check whether pages are within
4GB of wherever the program is loaded.

This is necessary on Windows, which now tends to place the image in high
memory. The eventual goal is to use m32 to allocate memory for linker
sections on Windows.

(cherry picked from commit 2e9248b7f7f645851ceb49931d10b9c5e58d2bbb)
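
A sketch of the relocated criterion: accept an address iff it is reachable
from the program image via a signed 32-bit displacement, rather than
requiring it to be low in absolute terms. `image_anchor` is a hypothetical
stand-in for a symbol in program text (the patch uses `LINKER_LOAD_BASE`,
which defaults to `&stg_upd_frame_info` on the affected platforms):

    #include <stdbool.h>
    #include <stdint.h>
    #include <sys/types.h>

    extern char image_anchor;   /* hypothetical: any symbol in program text */

    static bool within_reach(const void *p)
    {
        ssize_t displacement = (const char *) p - &image_anchor;
        return displacement > -0x7fffffffL && displacement < 0x7fffffffL;
    }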

- - - - -
b15da5a9 by GHC GitLab CI at 2023-05-15T17:43:27+02:00
rts: Generalize mmapForLinkerMarkExecutable

Renamed to mprotectForLinker and allowed setting of arbitrary protection
modes.

(cherry picked from commit 86589b893c092ae900723e76848525f20f6cafbf)
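
A rough sketch of what the generalised interface looks like on POSIX
(simplified names; the real `MemoryAccess` enum and `mprotectForLinker`
appear in the rts/linker/MMap.h and MMap.c hunks below):

    #include <stddef.h>
    #include <stdio.h>
    #include <sys/mman.h>

    typedef enum { ACC_READ_WRITE, ACC_READ_EXECUTE } Access;  /* simplified */

    static int access_to_prot(Access a)
    {
        switch (a) {
        case ACC_READ_WRITE:   return PROT_READ | PROT_WRITE;
        case ACC_READ_EXECUTE: return PROT_READ | PROT_EXEC;
        }
        return PROT_NONE;
    }

    /* One protection-changing entry point instead of a hard-wired
     * "mark executable" operation. */
    static void protect_region(void *start, size_t len, Access a)
    {
        if (len != 0 && mprotect(start, len, access_to_prot(a)) == -1) {
            perror("mprotect");
        }
    }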

- - - - -
aa3e6822 by GHC GitLab CI at 2023-05-15T17:43:27+02:00
rts/m32: Add consistency-checking infrastructure

This adds logic, enabled in the `-debug` RTS, for checking the internal
consistency of the m32 allocator. This area has always made me a bit
nervous, so this should help me sleep better at night in exchange for
very little overhead.

(cherry picked from commit 88ef270aa0cecf2463396f93a273656de9df9433)
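
A simplified sketch of the kind of check this adds: every m32 page records
its role, and debug-only helpers verify it before the page is used
(illustrative names; the real `SET_PAGE_TYPE`/`ASSERT_PAGE_TYPE` macros are
in the M32Alloc.c hunk below):

    #include <stdio.h>
    #include <stdlib.h>

    enum page_type { FREE_PAGE, NURSERY_PAGE, FILLED_PAGE };

    struct page {
        enum page_type type;   /* only tracked in the debug RTS */
        /* ... payload ... */
    };

    #if defined(DEBUG)
    static void assert_page_type(struct page *p, enum page_type ty)
    {
        if (p->type != ty) {
            fprintf(stderr, "m32: unexpected page type\n");
            abort();
        }
    }
    #else
    #define assert_page_type(p, ty) ((void) 0)
    #endif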

- - - - -
4671c818 by Ben Gamari at 2023-05-15T17:43:27+02:00
rts/m32: Free large objects back to the free page pool

Not entirely convinced that this is worth doing.

(cherry picked from commit 2d6f0b17e3ce9326abd43e187910db0a5e519efa)

- - - - -
13e7ebd8 by GHC GitLab CI at 2023-05-15T17:43:27+02:00
rts/m32: Increase size of free page pool to 256 pages

(cherry picked from commit e96f50beec172f5ff95769842cb9be724363311c)

- - - - -
5c31cd4c by Ben Gamari at 2023-05-15T18:08:50+02:00
rts: Dump memory map on memory mapping failures

Fixes #20992.

(cherry picked from commit fc083b480adedf26d47f880402f111680ec34183)

- - - - -
268fbed3 by Ben Gamari at 2023-05-15T18:08:50+02:00
rts/m32: Fix assertion failure

This fixes an assertion failure in the m32 allocator due to the
imprecisely specified preconditions of `m32_allocator_push_filled_list`.
Specifically, the caller must ensure that the page type is set to filled
prior to calling `m32_allocator_push_filled_list`.

While this did result in an assertion failure in the debug RTS, the
issue is in fact benign.

(cherry picked from commit 37825ce283b6dbcb532f51fade090a69afc2d078)
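
In other words, the caller establishes the page state before handing the
page over. A small sketch of the protocol, with illustrative names only:

    #include <stddef.h>

    enum page_type { FREE_PAGE, NURSERY_PAGE, FILLED_PAGE };
    struct page { enum page_type type; struct page *next; size_t size; };

    /* Precondition (asserted in the debug RTS): p->type == FILLED_PAGE. */
    static void push_filled(struct page **head, struct page *p)
    {
        p->next = *head;
        *head = p;
    }

    static void retire_page(struct page **filled, struct page *p)
    {
        p->type = FILLED_PAGE;   /* establish the precondition first */
        push_filled(filled, p);
    }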

- - - - -
c8733945 by Ben Gamari at 2023-05-16T11:02:29+02:00
rts: Rename MemoryMap.[ch] -> ReportMemoryMap.[ch]

(cherry picked from commit 3df06922f03191310ebee0547de1782eeb6bda67)

- - - - -
49e546b7 by Ben Gamari at 2023-05-16T11:10:23+02:00
rts: Move mmapForLinker and friends to linker/MMap.c

They are not particularly related to linking.

(cherry picked from commit e219ac826b05db833531028e0663f62f12eff010)

- - - - -
6deb4d0d by Ben Gamari at 2023-05-16T11:10:38+02:00
rts/linker/MMap: Use MemoryAccess in mmapForLinker

(cherry picked from commit 4d3a306dce59649b303ac7aba56758aff3dee077)

- - - - -
7bdb5766 by Ben Gamari at 2023-05-16T11:11:08+02:00
rts/linker: Catch archives masquerading as object files

Check the file's header to catch static archives bearing the `.o`
extension, as may happen on Windows after the Clang refactoring.

See #21068
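
A sketch of the header sniff, along the lines of the `isArchive` helper
added in rts/linker/LoadArchive.c below (simplified to a plain `char *`
path):

    #include <stdbool.h>
    #include <stdio.h>
    #include <string.h>

    static bool looks_like_archive(const char *path)
    {
        static const char MAGIC[] = "!<arch>\n";   /* ar(1) magic */
        char buf[sizeof(MAGIC) - 1];
        FILE *f = fopen(path, "rb");
        if (f == NULL) return false;
        size_t n = fread(buf, 1, sizeof(buf), f);
        fclose(f);
        return n == sizeof(buf) && memcmp(MAGIC, buf, sizeof(buf)) == 0;
    }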

- - - - -
69c02cbf by Ben Gamari at 2023-05-16T11:12:43+02:00
linker: Don't use MAP_FIXED

As noted in #21057, we really shouldn't be using MAP_FIXED. I would much
rather have the process crash with a "failed to map" error than randomly
overwrite existing mappings.

Closes #21057.

(cherry picked from commit 1db4f1fe7603c338ead0ac7e1ecfd0d8354d37bf)
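
The safer pattern, sketched with hypothetical names: pass the preferred
address only as a hint, then inspect where the mapping actually landed
instead of forcing it with MAP_FIXED:

    #include <stdio.h>
    #include <sys/mman.h>

    /* MAP_ANONYMOUS is spelled MAP_ANON on some systems. */
    static void *map_near(void *hint, size_t len)
    {
        void *p = mmap(hint, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);  /* no MAP_FIXED */
        if (p == MAP_FAILED) {
            perror("mmap");
            return NULL;
        }
        if (p != hint) {
            /* The kernel chose a different spot; decide whether it is
             * still acceptable rather than overwrite existing mappings. */
        }
        return p;
    }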

- - - - -
b361bcb0 by Zubin Duggal at 2023-05-24T19:31:59+05:30
Prepare release 9.2.8

Allow metric changes for 9.2.8, as the baseline is from a release pipeline

Metric Decrease:
    haddock.base
    haddock.Cabal
    haddock.compiler
Metric Increase:
    ManyAlternatives
    ManyConstructors
    T10421
    T10858
    T12227
    T12425
    T12707
    T13035
    T13253
    T13719
    T15164
    T16577
    T18304
    T18698a
    T18698b
    T3294
    T5321FD
    T5642
    T9203
    T9233
    T9630
    T9872a
    T9872b
    T9872c
    T9872d
    T14697
    T12545
    T1969
    parsing001

- - - - -


17 changed files:

- configure.ac
- + docs/users_guide/9.2.8-notes.rst
- docs/users_guide/release-notes.rst
- rts/ExecPage.c
- rts/Linker.c
- rts/LinkerInternals.h
- + rts/ReportMemoryMap.c
- + rts/ReportMemoryMap.h
- rts/linker/Elf.c
- rts/linker/LoadArchive.c
- rts/linker/M32Alloc.c
- + rts/linker/MMap.c
- + rts/linker/MMap.h
- rts/linker/MachO.c
- rts/linker/SymbolExtras.c
- rts/linker/elf_got.c
- rts/rts.cabal.in


Changes:

=====================================
configure.ac
=====================================
@@ -13,7 +13,7 @@ dnl
 # see what flags are available. (Better yet, read the documentation!)
 #
 
-AC_INIT([The Glorious Glasgow Haskell Compilation System], [9.2.7], [glasgow-haskell-bugs at haskell.org], [ghc-AC_PACKAGE_VERSION])
+AC_INIT([The Glorious Glasgow Haskell Compilation System], [9.2.8], [glasgow-haskell-bugs at haskell.org], [ghc-AC_PACKAGE_VERSION])
     # Version on HEAD must be X.Y (not X.Y.Z) for ProjectVersionMunged variable
     # to be useful (cf #19058)
 


=====================================
docs/users_guide/9.2.8-notes.rst
=====================================
@@ -0,0 +1,62 @@
+.. _release-9-2-8:
+
+Version 9.2.8
+==============
+
+The significant changes to the various parts of the compiler are listed in the
+following sections.
+
+The :ghc-flag:`LLVM backend <-fllvm>` of this release is to be used with LLVM
+9, 10, 11, or 12.
+
+Runtime system
+--------------
+
+- Fix a bug with RTS linker failing with 'internal error: m32_allocator_init:
+  Failed to map' on newer Linux kernels (:ghc-ticket:`19421`).
+
+Included libraries
+------------------
+
+The package database provided with this distribution also contains a number of
+packages other than GHC itself. See the changelogs provided with these packages
+for further change information.
+
+.. ghc-package-list::
+
+    libraries/array/array.cabal:             Dependency of ``ghc`` library
+    libraries/base/base.cabal:               Core library
+    libraries/binary/binary.cabal:           Dependency of ``ghc`` library
+    libraries/bytestring/bytestring.cabal:   Dependency of ``ghc`` library
+    libraries/Cabal/Cabal/Cabal.cabal:       Dependency of ``ghc-pkg`` utility
+    libraries/containers/containers/containers.cabal:   Dependency of ``ghc`` library
+    libraries/deepseq/deepseq.cabal:         Dependency of ``ghc`` library
+    libraries/directory/directory.cabal:     Dependency of ``ghc`` library
+    libraries/exceptions/exceptions.cabal:   Dependency of ``ghc`` and ``haskeline`` library
+    libraries/filepath/filepath.cabal:       Dependency of ``ghc`` library
+    compiler/ghc.cabal:                      The compiler itself
+    libraries/ghci/ghci.cabal:               The REPL interface
+    libraries/ghc-boot/ghc-boot.cabal:       Internal compiler library
+    libraries/ghc-boot-th/ghc-boot-th.cabal: Internal compiler library
+    libraries/ghc-compact/ghc-compact.cabal: Core library
+    libraries/ghc-heap/ghc-heap.cabal:       GHC heap-walking library
+    libraries/ghc-prim/ghc-prim.cabal:       Core library
+    libraries/haskeline/haskeline.cabal:     Dependency of ``ghci`` executable
+    libraries/hpc/hpc.cabal:                 Dependency of ``hpc`` executable
+    libraries/integer-gmp/integer-gmp.cabal: Core library
+    libraries/libiserv/libiserv.cabal:       Internal compiler library
+    libraries/mtl/mtl.cabal:                 Dependency of ``Cabal`` library
+    libraries/parsec/parsec.cabal:           Dependency of ``Cabal`` library
+    libraries/pretty/pretty.cabal:           Dependency of ``ghc`` library
+    libraries/process/process.cabal:         Dependency of ``ghc`` library
+    libraries/stm/stm.cabal:                 Dependency of ``haskeline`` library
+    libraries/template-haskell/template-haskell.cabal:     Core library
+    libraries/terminfo/terminfo.cabal:       Dependency of ``haskeline`` library
+    libraries/text/text.cabal:               Dependency of ``Cabal`` library
+    libraries/time/time.cabal:               Dependency of ``ghc`` library
+    libraries/transformers/transformers.cabal: Dependency of ``ghc`` library
+    libraries/unix/unix.cabal:               Dependency of ``ghc`` library
+    libraries/Win32/Win32.cabal:             Dependency of ``ghc`` library
+    libraries/xhtml/xhtml.cabal:             Dependency of ``haddock`` executable
+
+


=====================================
docs/users_guide/release-notes.rst
=====================================
@@ -11,3 +11,4 @@ Release notes
    9.2.5-notes
    9.2.6-notes
    9.2.7-notes
+   9.2.8-notes


=====================================
rts/ExecPage.c
=====================================
@@ -6,8 +6,8 @@
  */
 
 #include "Rts.h"
-#include "LinkerInternals.h"
 #include "sm/OSMem.h"
+#include "linker/MMap.h"
 
 ExecPage *allocateExecPage() {
     ExecPage *page = (ExecPage *) mmapAnonForLinker(getPageSize());
@@ -15,7 +15,7 @@ ExecPage *allocateExecPage() {
 }
 
 void freezeExecPage(ExecPage *page) {
-    mmapForLinkerMarkExecutable(page, getPageSize());
+    mprotectForLinker(page, getPageSize(), MEM_READ_EXECUTE);
     flushExec(getPageSize(), page);
 }
 


=====================================
rts/Linker.c
=====================================
@@ -31,8 +31,10 @@
 #include "linker/M32Alloc.h"
 #include "linker/CacheFlush.h"
 #include "linker/SymbolExtras.h"
+#include "linker/MMap.h"
 #include "PathUtils.h"
 #include "CheckUnload.h" // createOCSectionIndices
+#include "ReportMemoryMap.h"
 
 #if !defined(mingw32_HOST_OS)
 #include "posix/Signals.h"
@@ -198,63 +200,6 @@ Mutex linker_mutex;
 /* Generic wrapper function to try and Resolve and RunInit oc files */
 int ocTryLoad( ObjectCode* oc );
 
-/* Link objects into the lower 2Gb on x86_64 and AArch64.  GHC assumes the
- * small memory model on this architecture (see gcc docs,
- * -mcmodel=small).
- *
- * MAP_32BIT not available on OpenBSD/amd64
- */
-#if defined(MAP_32BIT) && (defined(x86_64_HOST_ARCH) || (defined(aarch64_TARGET_ARCH) || defined(aarch64_HOST_ARCH)))
-#define MAP_LOW_MEM
-#define TRY_MAP_32BIT MAP_32BIT
-#else
-#define TRY_MAP_32BIT 0
-#endif
-
-#if defined(aarch64_HOST_ARCH)
-// On AArch64 MAP_32BIT is not available but we are still bound by the small
-// memory model. Consequently we still try using the MAP_LOW_MEM allocation
-// strategy.
-#define MAP_LOW_MEM
-#endif
-
-/*
- * Note [MAP_LOW_MEM]
- * ~~~~~~~~~~~~~~~~~~
- * Due to the small memory model (see above), on x86_64 and AArch64 we have to
- * map all our non-PIC object files into the low 2Gb of the address space (why
- * 2Gb and not 4Gb?  Because all addresses must be reachable using a 32-bit
- * signed PC-relative offset). On x86_64 Linux we can do this using the
- * MAP_32BIT flag to mmap(), however on other OSs (e.g. *BSD, see #2063, and
- * also on Linux inside Xen, see #2512), we can't do this.  So on these
- * systems, we have to pick a base address in the low 2Gb of the address space
- * and try to allocate memory from there.
- *
- * The same holds for aarch64, where the default, even with PIC, model
- * is 4GB. The linker is free to emit AARCH64_ADR_PREL_PG_HI21
- * relocations.
- *
- * We pick a default address based on the OS, but also make this
- * configurable via an RTS flag (+RTS -xm)
- */
-
-#if (defined(aarch64_TARGET_ARCH) || defined(aarch64_HOST_ARCH))
-// Try to use stg_upd_frame_info as the base. We need to be within +-4GB of that
-// address, otherwise we violate the aarch64 memory model. Any object we load
-// can potentially reference any of the ones we bake into the binary (and list)
-// in RtsSymbols. Thus we'll need to be within +-4GB of those,
-// stg_upd_frame_info is a good candidate as it's referenced often.
-#define MMAP_32BIT_BASE_DEFAULT (void*)&stg_upd_frame_info;
-#elif defined(MAP_32BIT) || DEFAULT_LINKER_ALWAYS_PIC
-// Try to use MAP_32BIT
-#define MMAP_32BIT_BASE_DEFAULT 0
-#else
-// A guess: 1Gb.
-#define MMAP_32BIT_BASE_DEFAULT 0x40000000
-#endif
-
-static void *mmap_32bit_base = (void *)MMAP_32BIT_BASE_DEFAULT;
-
 static void ghciRemoveSymbolTable(StrHashTable *table, const SymbolName* key,
     ObjectCode *owner)
 {
@@ -1103,217 +1048,6 @@ resolveSymbolAddr (pathchar* buffer, int size,
 #endif /* OBJFORMAT_PEi386 */
 }
 
-#if defined(mingw32_HOST_OS)
-
-//
-// Returns NULL on failure.
-//
-void *
-mmapAnonForLinker (size_t bytes)
-{
-  return VirtualAlloc(NULL, bytes, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
-}
-
-void
-munmapForLinker (void *addr, size_t bytes, const char *caller)
-{
-  if (VirtualFree(addr, 0, MEM_RELEASE) == 0) {
-    sysErrorBelch("munmapForLinker: %s: Failed to unmap %zd bytes at %p",
-                  caller, bytes, addr);
-  }
-}
-
-void
-mmapForLinkerMarkExecutable(void *start, size_t len)
-{
-  DWORD old;
-  if (len == 0) {
-    return;
-  }
-  if (VirtualProtect(start, len, PAGE_EXECUTE_READ, &old) == 0) {
-    sysErrorBelch("mmapForLinkerMarkExecutable: failed to protect %zd bytes at %p",
-                  len, start);
-    ASSERT(false);
-  }
-}
-
-#elif RTS_LINKER_USE_MMAP
-//
-// Returns NULL on failure.
-//
-void *
-mmapForLinker (size_t bytes, uint32_t prot, uint32_t flags, int fd, int offset)
-{
-   void *map_addr = NULL;
-   void *result;
-   size_t size;
-   uint32_t tryMap32Bit = RtsFlags.MiscFlags.linkerAlwaysPic
-     ? 0
-     : TRY_MAP_32BIT;
-   static uint32_t fixed = 0;
-
-   IF_DEBUG(linker, debugBelch("mmapForLinker: start\n"));
-   size = roundUpToPage(bytes);
-
-#if defined(MAP_LOW_MEM)
-mmap_again:
-#endif
-
-   if (mmap_32bit_base != NULL) {
-       map_addr = mmap_32bit_base;
-   }
-
-   IF_DEBUG(linker,
-            debugBelch("mmapForLinker: \tprotection %#0x\n", prot));
-   IF_DEBUG(linker,
-            debugBelch("mmapForLinker: \tflags      %#0x\n",
-                       MAP_PRIVATE | tryMap32Bit | fixed | flags));
-   IF_DEBUG(linker,
-            debugBelch("mmapForLinker: \tsize       %#0zx\n", bytes));
-   IF_DEBUG(linker,
-            debugBelch("mmapForLinker: \tmap_addr   %p\n", map_addr));
-
-   result = mmap(map_addr, size, prot,
-                 MAP_PRIVATE|tryMap32Bit|fixed|flags, fd, offset);
-
-   if (result == MAP_FAILED) {
-       sysErrorBelch("mmap %" FMT_Word " bytes at %p",(W_)size,map_addr);
-       errorBelch("Try specifying an address with +RTS -xm<addr> -RTS");
-       return NULL;
-   }
-
-#if defined(MAP_LOW_MEM)
-   if (RtsFlags.MiscFlags.linkerAlwaysPic) {
-       /* make no attempt at mapping low memory if we are assuming PIC */
-   } else if (mmap_32bit_base != NULL) {
-       if (result != map_addr) {
-           if ((W_)result > 0x80000000) {
-               // oops, we were given memory over 2Gb
-               munmap(result,size);
-#if defined(freebsd_HOST_OS)  || \
-    defined(kfreebsdgnu_HOST_OS) || \
-    defined(dragonfly_HOST_OS)
-               // Some platforms require MAP_FIXED.  This is normally
-               // a bad idea, because MAP_FIXED will overwrite
-               // existing mappings.
-               fixed = MAP_FIXED;
-               goto mmap_again;
-#else
-               errorBelch("mmapForLinker: failed to mmap() memory below 2Gb; "
-                          "asked for %lu bytes at %p. "
-                          "Try specifying an address with +RTS -xm<addr> -RTS",
-                          size, map_addr);
-               return NULL;
-#endif
-           } else {
-               // hmm, we were given memory somewhere else, but it's
-               // still under 2Gb so we can use it.
-           }
-       }
-   } else {
-       if ((W_)result > 0x80000000) {
-           // oops, we were given memory over 2Gb
-           // ... try allocating memory somewhere else?;
-           debugTrace(DEBUG_linker,
-                      "MAP_32BIT didn't work; gave us %lu bytes at 0x%p",
-                      bytes, result);
-           munmap(result, size);
-
-           // Set a base address and try again... (guess: 1Gb)
-           mmap_32bit_base = (void*)0x40000000;
-           goto mmap_again;
-       }
-   }
-#elif (defined(aarch64_TARGET_ARCH) || defined(aarch64_HOST_ARCH))
-    // for aarch64 we need to make sure we stay within 4GB of the
-    // mmap_32bit_base, and we also do not want to update it.
-    if (result != map_addr) {
-        // upper limit 4GB - size of the object file - 1mb wiggle room.
-        if(llabs((uintptr_t)result - (uintptr_t)&stg_upd_frame_info) > (2<<32) - size - (2<<20)) {
-            // not within range :(
-            debugTrace(DEBUG_linker,
-                        "MAP_32BIT didn't work; gave us %lu bytes at 0x%p",
-                        bytes, result);
-            munmap(result, size);
-            // TODO: some abort/mmap_32bit_base recomputation based on
-            //       if mmap_32bit_base is changed, or still at stg_upd_frame_info
-            goto mmap_again;
-        }
-    }
-#endif
-
-    if (mmap_32bit_base != NULL) {
-       // Next time, ask for memory right after our new mapping to maximize the
-       // chance that we get low memory.
-        mmap_32bit_base = (void*) ((uintptr_t)result + size);
-    }
-
-    IF_DEBUG(linker,
-            debugBelch("mmapForLinker: mapped %" FMT_Word
-                        " bytes starting at %p\n", (W_)size, result));
-    IF_DEBUG(linker,
-             debugBelch("mmapForLinker: done\n"));
-
-    return result;
-}
-
-/*
- * Map read/write pages in low memory. Returns NULL on failure.
- */
-void *
-mmapAnonForLinker (size_t bytes)
-{
-  return mmapForLinker (bytes, PROT_READ|PROT_WRITE, MAP_ANONYMOUS, -1, 0);
-}
-
-void munmapForLinker (void *addr, size_t bytes, const char *caller)
-{
-  int r = munmap(addr, bytes);
-  if (r == -1) {
-    // Should we abort here?
-    sysErrorBelch("munmap: %s", caller);
-  }
-}
-
-/* Note [Memory protection in the linker]
- * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- * For many years the linker would simply map all of its memory
- * with PROT_READ|PROT_WRITE|PROT_EXEC. However operating systems have been
- * becoming increasingly reluctant to accept this practice (e.g. #17353,
- * #12657) and for good reason: writable code is ripe for exploitation.
- *
- * Consequently mmapForLinker now maps its memory with PROT_READ|PROT_WRITE.
- * After the linker has finished filling/relocating the mapping it must then
- * call mmapForLinkerMarkExecutable on the sections of the mapping which
- * contain executable code.
- *
- * Note that the m32 allocator handles protection of its allocations. For this
- * reason the caller to m32_alloc() must tell the allocator whether the
- * allocation needs to be executable. The caller must then ensure that they
- * call m32_allocator_flush() after they are finished filling the region, which
- * will cause the allocator to change the protection bits to
- * PROT_READ|PROT_EXEC.
- *
- */
-
-/*
- * Mark an portion of a mapping previously reserved by mmapForLinker
- * as executable (but not writable).
- */
-void mmapForLinkerMarkExecutable(void *start, size_t len)
-{
-    if (len == 0) {
-      return;
-    }
-    IF_DEBUG(linker,
-             debugBelch("mmapForLinkerMarkExecutable: protecting %" FMT_Word
-                        " bytes starting at %p\n", (W_)len, start));
-    if (mprotect(start, len, PROT_READ|PROT_EXEC) == -1) {
-       barf("mmapForLinkerMarkExecutable: mprotect: %s\n", strerror(errno));
-    }
-}
-#endif
-
 /*
  * Remove symbols from the symbol table, and free oc->symbols.
  * This operation is idempotent.
@@ -1619,10 +1353,9 @@ preloadObjectFile (pathchar *path)
     * See also the misalignment logic for darwin below.
     */
 #if defined(darwin_HOST_OS) || defined(openbsd_HOST_OS)
-   image = mmapForLinker(fileSize, PROT_READ|PROT_WRITE, MAP_PRIVATE, fd, 0);
+   image = mmapForLinker(fileSize, MEM_READ_WRITE, MAP_PRIVATE, fd, 0);
 #else
-   image = mmapForLinker(fileSize, PROT_READ|PROT_WRITE|PROT_EXEC,
-                MAP_PRIVATE, fd, 0);
+   image = mmapForLinker(fileSize, MEM_READ_WRITE_EXECUTE, MAP_PRIVATE, fd, 0);
 #endif
 
    if (image == MAP_FAILED) {
@@ -1661,7 +1394,7 @@ preloadObjectFile (pathchar *path)
 
    image = stgMallocBytes(fileSize, "loadObj(image)");
 
-#endif
+#endif /* !defined(darwin_HOST_OS) */
 
    int n;
    n = fread ( image, 1, fileSize, f );
@@ -1706,6 +1439,15 @@ static HsInt loadObj_ (pathchar *path)
        return 1; // success
    }
 
+   if (isArchive(path)) {
+       if (loadArchive_(path)) {
+            return 1; // success
+       } else {
+            IF_DEBUG(linker,
+                        debugBelch("tried and failed to load %" PATH_FMT " as an archive\n", path));
+       }
+   }
+
    ObjectCode *oc = preloadObjectFile(path);
    if (oc == NULL) return 0;
 


=====================================
rts/LinkerInternals.h
=====================================
@@ -374,11 +374,6 @@ void exitLinker( void );
 void freeObjectCode (ObjectCode *oc);
 SymbolAddr* loadSymbol(SymbolName *lbl, RtsSymbolInfo *pinfo);
 
-void *mmapAnonForLinker (size_t bytes);
-void *mmapForLinker (size_t bytes, uint32_t prot, uint32_t flags, int fd, int offset);
-void mmapForLinkerMarkExecutable (void *start, size_t len);
-void munmapForLinker (void *addr, size_t bytes, const char *caller);
-
 void addProddableBlock ( ObjectCode* oc, void* start, int size );
 void checkProddableBlock (ObjectCode *oc, void *addr, size_t size );
 void freeProddableBlocks (ObjectCode *oc);
@@ -412,6 +407,10 @@ pathchar*
 resolveSymbolAddr (pathchar* buffer, int size,
                    SymbolAddr* symbol, uintptr_t* top);
 
+/* defined in LoadArchive.c */
+bool isArchive (pathchar *path);
+HsInt loadArchive_ (pathchar *path);
+
 /*************************************************
  * Various bits of configuration
  *************************************************/
@@ -433,6 +432,7 @@ resolveSymbolAddr (pathchar* buffer, int size,
 #define USE_CONTIGUOUS_MMAP 0
 #endif
 
+
 HsInt isAlreadyLoaded( pathchar *path );
 OStatus getObjectLoadStatus_ (pathchar *path);
 HsInt loadOc( ObjectCode* oc );
@@ -444,20 +444,4 @@ ObjectCode* mkOc( ObjectType type, pathchar *path, char *image, int imageSize,
 void initSegment(Segment *s, void *start, size_t size, SegmentProt prot, int n_sections);
 void freeSegments(ObjectCode *oc);
 
-/* MAP_ANONYMOUS is MAP_ANON on some systems,
-   e.g. OS X (before Sierra), OpenBSD etc */
-#if !defined(MAP_ANONYMOUS) && defined(MAP_ANON)
-#define MAP_ANONYMOUS MAP_ANON
-#endif
-
-/* In order to simplify control flow a bit, some references to mmap-related
-   definitions are blocked off by a C-level if statement rather than a CPP-level
-   #if statement. Since those are dead branches when !RTS_LINKER_USE_MMAP, we
-   just stub out the relevant symbols here
-*/
-#if !RTS_LINKER_USE_MMAP
-#define munmap(x,y) /* nothing */
-#define MAP_ANONYMOUS 0
-#endif
-
 #include "EndPrivate.h"


=====================================
rts/ReportMemoryMap.c
=====================================
@@ -0,0 +1,138 @@
+/* -----------------------------------------------------------------------------
+ *
+ * (c) The GHC Team, 1998-2004
+ *
+ * Memory-map dumping.
+ *
+ * This is intended to be used for reporting the process memory-map
+ * in diagnostics when the RTS fails to map a block of memory.
+ *
+ * ---------------------------------------------------------------------------*/
+
+#include "PosixSource.h"
+#include "Rts.h"
+
+#include <string.h>
+
+#if defined(darwin_HOST_OS)
+#include <mach/mach.h>
+#include <mach/mach_vm.h>
+#include <mach/vm_region.h>
+#include <mach/vm_statistics.h>
+#endif
+
+#include "ReportMemoryMap.h"
+
+#if defined(mingw32_HOST_OS)
+
+void reportMemoryMap() {
+    debugBelch("\nMemory map:\n");
+    uint8_t *addr = NULL;
+    while (true) {
+        MEMORY_BASIC_INFORMATION info;
+        int res = VirtualQuery(addr, &info, sizeof(info));
+        if (!res && GetLastError() == ERROR_INVALID_PARAMETER) {
+            return;
+        } else if (!res) {
+            sysErrorBelch("VirtualQuery failed");
+            return;
+        }
+
+        if (info.State & MEM_FREE) {
+            // free range
+        } else {
+            const char *protection;
+            switch (info.Protect) {
+            case PAGE_EXECUTE:           protection = "--x"; break;
+            case PAGE_EXECUTE_READ:      protection = "r-x"; break;
+            case PAGE_EXECUTE_READWRITE: protection = "rwx"; break;
+            case PAGE_EXECUTE_WRITECOPY: protection = "rcx"; break;
+            case PAGE_NOACCESS:          protection = "---"; break;
+            case PAGE_READONLY:          protection = "r--"; break;
+            case PAGE_READWRITE:         protection = "rw-"; break;
+            case PAGE_WRITECOPY:         protection = "rc-"; break;
+            default:                     protection = "???"; break;
+            }
+
+            const char *type;
+            switch (info.Type) {
+            case MEM_IMAGE:   type = "image"; break;
+            case MEM_MAPPED:  type = "mapped"; break;
+            case MEM_PRIVATE: type = "private"; break;
+            default:          type = "unknown"; break;
+            }
+
+            debugBelch("%08llx-%08llx %8zuK %3s (%s)\n",
+                       (uintptr_t) info.BaseAddress,
+                       (uintptr_t) info.BaseAddress + info.RegionSize,
+                       (size_t) info.RegionSize,
+                       protection, type);
+        }
+        addr = (uint8_t *) info.BaseAddress + info.RegionSize;
+    }
+}
+
+#elif defined(darwin_HOST_OS)
+
+void reportMemoryMap() {
+    // Inspired by MacFUSE /proc implementation
+    debugBelch("\nMemory map:\n");
+    while (true) {
+        vm_size_t vmsize;
+        vm_address_t address;
+        vm_region_basic_info_data_t info;
+        vm_region_flavor_t flavor = VM_REGION_BASIC_INFO;
+        memory_object_name_t object;
+        mach_msg_type_number_t info_count = VM_REGION_BASIC_INFO_COUNT;
+        kern_return_t kr =
+            mach_vm_region(mach_task_self(), &address, &vmsize, flavor,
+                           (vm_region_info_t)&info, &info_count, &object);
+        if (kr == KERN_SUCCESS) {
+            debugBelch("%08lx-%08lx %8zuK %c%c%c/%c%c%c\n",
+                       address, (address + vmsize), (vmsize >> 10),
+                       (info.protection & VM_PROT_READ)        ? 'r' : '-',
+                       (info.protection & VM_PROT_WRITE)       ? 'w' : '-',
+                       (info.protection & VM_PROT_EXECUTE)     ? 'x' : '-',
+                       (info.max_protection & VM_PROT_READ)    ? 'r' : '-',
+                       (info.max_protection & VM_PROT_WRITE)   ? 'w' : '-',
+                       (info.max_protection & VM_PROT_EXECUTE) ? 'x' : '-');
+            address += vmsize;
+        } else if (kr == KERN_INVALID_ADDRESS) {
+            // We presumably reached the end of address space
+            break;
+        } else {
+            debugBelch("  Error: %s\n", mach_error_string(kr));
+            break;
+        }
+    }
+}
+
+#else
+
+// Linux et al.
+void reportMemoryMap() {
+    debugBelch("\nMemory map:\n");
+    FILE *f = fopen("/proc/self/maps", "r");
+    if (f == NULL) {
+        debugBelch("  Could not open /proc/self/maps\n");
+        return;
+    }
+
+    while (true) {
+        char buf[256];
+        size_t n = fread(buf, 1, sizeof(buf)-1, f);
+        if (n <= 0) {
+            debugBelch("  Error: %s\n", strerror(errno));
+            break;
+        }
+        buf[n] = '\0';
+        debugBelch("%s", buf);
+        if (n < sizeof(buf)-1) {
+            break;
+        }
+    }
+    debugBelch("\n");
+    fclose(f);
+}
+
+#endif


=====================================
rts/ReportMemoryMap.h
=====================================
@@ -0,0 +1,13 @@
+/* -----------------------------------------------------------------------------
+ *
+ * (c) The GHC Team, 1998-2004
+ *
+ * Memory-map dumping.
+ *
+ * This is intended to be used for reporting the process memory-map
+ * in diagnostics when the RTS fails to map a block of memory.
+ *
+ * ---------------------------------------------------------------------------*/
+
+void reportMemoryMap(void);
+


=====================================
rts/linker/Elf.c
=====================================
@@ -17,6 +17,7 @@
 #include "RtsSymbolInfo.h"
 #include "CheckUnload.h"
 #include "LinkerInternals.h"
+#include "linker/MMap.h"
 #include "linker/Elf.h"
 #include "linker/CacheFlush.h"
 #include "linker/M32Alloc.h"
@@ -652,7 +653,7 @@ mapObjectFileSection (int fd, Elf_Word offset, Elf_Word size,
 
     pageOffset = roundDownToPage(offset);
     pageSize = roundUpToPage(offset-pageOffset+size);
-    p = mmapForLinker(pageSize, PROT_READ | PROT_WRITE, 0, fd, pageOffset);
+    p = mmapForLinker(pageSize, MEM_READ_WRITE, 0, fd, pageOffset);
     if (p == NULL) return NULL;
     *mapped_size = pageSize;
     *mapped_offset = pageOffset;
@@ -1877,7 +1878,7 @@ ocMprotect_Elf( ObjectCode *oc )
             if (section->alloc != SECTION_M32) {
                 // N.B. m32 handles protection of its allocations during
                 // flushing.
-                mmapForLinkerMarkExecutable(section->mapped_start, section->mapped_size);
+                mprotectForLinker(section->mapped_start, section->mapped_size, MEM_READ_EXECUTE);
             }
             break;
         default:


=====================================
rts/linker/LoadArchive.c
=====================================
@@ -7,6 +7,7 @@
 #include "LinkerInternals.h"
 #include "CheckUnload.h" // loaded_objects, insertOCSectionIndices
 #include "linker/M32Alloc.h"
+#include "linker/MMap.h"
 
 /* Platform specific headers */
 #if defined(OBJFORMAT_PEi386)
@@ -240,7 +241,7 @@ lookupGNUArchiveIndex(int gnuFileIndexSize, char **fileName_,
     return true;
 }
 
-static HsInt loadArchive_ (pathchar *path)
+HsInt loadArchive_ (pathchar *path)
 {
     char *image = NULL;
     HsInt retcode = 0;
@@ -630,3 +631,21 @@ HsInt loadArchive (pathchar *path)
    RELEASE_LOCK(&linker_mutex);
    return r;
 }
+
+bool isArchive (pathchar *path)
+{
+    static const char ARCHIVE_HEADER[] = "!<arch>\n";
+    char buffer[10];
+    FILE *f = pathopen(path, WSTR("rb"));
+    if (f == NULL) {
+        return false;
+    }
+
+    size_t ret = fread(buffer, 1, sizeof(buffer), f);
+    if (ret < sizeof(buffer)) {
+        return false;
+    }
+    fclose(f);
+    return strncmp(ARCHIVE_HEADER, buffer, sizeof(ARCHIVE_HEADER)-1) == 0;
+}
+


=====================================
rts/linker/M32Alloc.c
=====================================
@@ -10,7 +10,8 @@
 #include "sm/OSMem.h"
 #include "RtsUtils.h"
 #include "linker/M32Alloc.h"
-#include "LinkerInternals.h"
+#include "linker/MMap.h"
+#include "ReportMemoryMap.h"
 
 #include <inttypes.h>
 #include <stdlib.h>
@@ -135,6 +136,11 @@ The allocator is *not* thread-safe.
 
 */
 
+// Enable internal consistency checking
+#if defined(DEBUG)
+#define M32_DEBUG
+#endif
+
 #define ROUND_UP(x,size) ((x + size - 1) & ~(size - 1))
 #define ROUND_DOWN(x,size) (x & ~(size - 1))
 
@@ -147,7 +153,21 @@ The allocator is *not* thread-safe.
 /* How many pages should we map at once when re-filling the free page pool? */
 #define M32_MAP_PAGES 32
 /* Upper bound on the number of pages to keep in the free page pool */
-#define M32_MAX_FREE_PAGE_POOL_SIZE 64
+#define M32_MAX_FREE_PAGE_POOL_SIZE 256
+
+/* A utility to verify that a given address is "acceptable" for use by m32. */
+static bool
+is_okay_address(void *p) {
+  int8_t *here = LINKER_LOAD_BASE;
+  ssize_t displacement = (int8_t *) p - here;
+  return (displacement > -0x7fffffff) && (displacement < 0x7fffffff);
+}
+
+enum m32_page_type {
+  FREE_PAGE,    // a page in the free page pool
+  NURSERY_PAGE, // a nursery page
+  FILLED_PAGE,  // a page on the filled list
+};
 
 /**
  * Page header
@@ -161,8 +181,7 @@ struct m32_page_t {
     // unprotected_list or protected_list are linked together with this field.
     struct {
       uint32_t size;
-      uint32_t next; // this is a m32_page_t*, truncated to 32-bits. This is safe
-                     // as we are only allocating in the bottom 32-bits
+      struct m32_page_t *next;
     } filled_page;
 
     // Pages in the small-allocation nursery encode their current allocation
@@ -174,21 +193,64 @@ struct m32_page_t {
       struct m32_page_t *next;
     } free_page;
   };
+#if defined(M32_DEBUG)
+  enum m32_page_type type;
+#endif
+  uint8_t contents[];
 };
 
+/* Consistency-checking infrastructure */
+#if defined(M32_DEBUG)
+static void ASSERT_PAGE_ALIGNED(void *page) {
+  const size_t pgsz = getPageSize();
+  if ((((uintptr_t) page) & (pgsz-1)) != 0) {
+    barf("m32: invalid page alignment");
+  }
+}
+static void ASSERT_VALID_PAGE(struct m32_page_t *page) {
+  ASSERT_PAGE_ALIGNED(page);
+  switch (page->type) {
+  case FREE_PAGE:
+  case NURSERY_PAGE:
+  case FILLED_PAGE:
+    break;
+  default:
+    barf("m32: invalid page state\n");
+  }
+}
+static void ASSERT_PAGE_TYPE(struct m32_page_t *page, enum m32_page_type ty) {
+  if (page->type != ty) { barf("m32: unexpected page type"); }
+}
+static void ASSERT_PAGE_NOT_FREE(struct m32_page_t *page) {
+  if (page->type == FREE_PAGE) { barf("m32: unexpected free page"); }
+}
+static void SET_PAGE_TYPE(struct m32_page_t *page, enum m32_page_type ty) {
+  page->type = ty;
+}
+#else
+#define ASSERT_PAGE_ALIGNED(page)
+#define ASSERT_VALID_PAGE(page)
+#define ASSERT_PAGE_NOT_FREE(page)
+#define ASSERT_PAGE_TYPE(page, ty)
+#define SET_PAGE_TYPE(page, ty)
+#endif
+
+/* Accessors */
 static void
 m32_filled_page_set_next(struct m32_page_t *page, struct m32_page_t *next)
 {
-  if (next > (struct m32_page_t *) 0xffffffff) {
-    barf("m32_filled_page_set_next: Page not in lower 32-bits");
+  ASSERT_PAGE_TYPE(page, FILLED_PAGE);
+  if (next != NULL && ! is_okay_address(next)) {
+    barf("m32_filled_page_set_next: Page %p not within 4GB of program text", next);
   }
-  page->filled_page.next = (uint32_t) (uintptr_t) next;
+  page->filled_page.next = next;
 }
 
 static struct m32_page_t *
 m32_filled_page_get_next(struct m32_page_t *page)
 {
-    return (struct m32_page_t *) (uintptr_t) page->filled_page.next;
+  ASSERT_PAGE_TYPE(page, FILLED_PAGE);
+  return (struct m32_page_t *) (uintptr_t) page->filled_page.next;
 }
 
 /**
@@ -213,21 +275,42 @@ struct m32_allocator_t {
  * We keep a small pool of free pages around to avoid fragmentation.
  */
 struct m32_page_t *m32_free_page_pool = NULL;
+/** Number of pages in free page pool */
 unsigned int m32_free_page_pool_size = 0;
-// TODO
 
 /**
- * Free a page or, if possible, place it in the free page pool.
+ * Free a filled page or, if possible, place it in the free page pool.
  */
 static void
 m32_release_page(struct m32_page_t *page)
 {
-  if (m32_free_page_pool_size < M32_MAX_FREE_PAGE_POOL_SIZE) {
-    page->free_page.next = m32_free_page_pool;
-    m32_free_page_pool = page;
-    m32_free_page_pool_size ++;
-  } else {
-    munmapForLinker((void *) page, getPageSize(), "m32_release_page");
+  // Some sanity-checking
+  ASSERT_VALID_PAGE(page);
+  ASSERT_PAGE_NOT_FREE(page);
+
+  const size_t pgsz = getPageSize();
+  ssize_t sz = page->filled_page.size;
+  IF_DEBUG(sanity, memset(page, 0xaa, sz));
+
+  // Break the page, which may be a large multi-page allocation, into
+  // individual pages for the page pool
+  while (sz > 0) {
+    if (m32_free_page_pool_size < M32_MAX_FREE_PAGE_POOL_SIZE) {
+      mprotectForLinker(page, pgsz, MEM_READ_WRITE);
+      SET_PAGE_TYPE(page, FREE_PAGE);
+      page->free_page.next = m32_free_page_pool;
+      m32_free_page_pool = page;
+      m32_free_page_pool_size ++;
+    } else {
+      break;
+    }
+    page = (struct m32_page_t *) ((uint8_t *) page + pgsz);
+    sz -= pgsz;
+  }
+
+  // The free page pool is full, release the rest back to the system
+  if (sz > 0) {
+    munmapForLinker((void *) page, ROUND_UP(sz, pgsz), "m32_release_page");
   }
 }
 
@@ -244,14 +327,18 @@ m32_alloc_page(void)
      * pages.
      */
     const size_t pgsz = getPageSize();
-    uint8_t *chunk = mmapAnonForLinker(pgsz * M32_MAP_PAGES);
-    if (chunk > (uint8_t *) 0xffffffff) {
-      barf("m32_alloc_page: failed to get allocation in lower 32-bits");
+    const size_t map_sz = pgsz * M32_MAP_PAGES;
+    uint8_t *chunk = mmapAnonForLinker(map_sz);
+    if (! is_okay_address(chunk + map_sz)) {
+      reportMemoryMap();
+      barf("m32_alloc_page: failed to allocate pages within 4GB of program text (got %p)", chunk);
     }
+    IF_DEBUG(sanity, memset(chunk, 0xaa, map_sz));
 
 #define GET_PAGE(i) ((struct m32_page_t *) (chunk + (i) * pgsz))
     for (int i=0; i < M32_MAP_PAGES; i++) {
       struct m32_page_t *page = GET_PAGE(i);
+      SET_PAGE_TYPE(page, FREE_PAGE);
       page->free_page.next = GET_PAGE(i+1);
     }
 
@@ -264,6 +351,7 @@ m32_alloc_page(void)
   struct m32_page_t *page = m32_free_page_pool;
   m32_free_page_pool = page->free_page.next;
   m32_free_page_pool_size --;
+  ASSERT_PAGE_TYPE(page, FREE_PAGE);
   return page;
 }
 
@@ -289,8 +377,9 @@ static void
 m32_allocator_unmap_list(struct m32_page_t *head)
 {
   while (head != NULL) {
+    ASSERT_VALID_PAGE(head);
     struct m32_page_t *next = m32_filled_page_get_next(head);
-    munmapForLinker((void *) head, head->filled_page.size, "m32_allocator_unmap_list");
+    m32_release_page(head);
     head = next;
   }
 }
@@ -305,10 +394,9 @@ void m32_allocator_free(m32_allocator *alloc)
   m32_allocator_unmap_list(alloc->protected_list);
 
   /* free partially-filled pages */
-  const size_t pgsz = getPageSize();
   for (int i=0; i < M32_MAX_PAGES; i++) {
     if (alloc->pages[i]) {
-      munmapForLinker(alloc->pages[i], pgsz, "m32_allocator_free");
+      m32_release_page(alloc->pages[i]);
     }
   }
 
@@ -321,6 +409,8 @@ void m32_allocator_free(m32_allocator *alloc)
 static void
 m32_allocator_push_filled_list(struct m32_page_t **head, struct m32_page_t *page)
 {
+  ASSERT_PAGE_TYPE(page, FILLED_PAGE);
+    // N.B. it's the caller's responsibility to set the pagetype to FILLED_PAGE
   m32_filled_page_set_next(page, *head);
   *head = page;
 }
@@ -347,6 +437,7 @@ m32_allocator_flush(m32_allocator *alloc) {
        m32_release_page(alloc->pages[i]);
      } else {
        // the page contains data, move it to the unprotected list
+       SET_PAGE_TYPE(alloc->pages[i], FILLED_PAGE);
        m32_allocator_push_filled_list(&alloc->unprotected_list, alloc->pages[i]);
      }
      alloc->pages[i] = NULL;
@@ -356,9 +447,10 @@ m32_allocator_flush(m32_allocator *alloc) {
    if (alloc->executable) {
      struct m32_page_t *page = alloc->unprotected_list;
      while (page != NULL) {
+       ASSERT_PAGE_TYPE(page, FILLED_PAGE);
        struct m32_page_t *next = m32_filled_page_get_next(page);
        m32_allocator_push_filled_list(&alloc->protected_list, page);
-       mmapForLinkerMarkExecutable(page, page->filled_page.size);
+       mprotectForLinker(page, page->filled_page.size, MEM_READ_EXECUTE);
        page = next;
      }
      alloc->unprotected_list = NULL;
@@ -392,10 +484,12 @@ m32_alloc(struct m32_allocator_t *alloc, size_t size, size_t alignment)
       if (page == NULL) {
           sysErrorBelch("m32_alloc: Failed to map pages for %zd bytes", size);
           return NULL;
-      } else if (page > (struct m32_page_t *) 0xffffffff) {
-          debugBelch("m32_alloc: warning: Allocation of %zd bytes resulted in pages above 4GB (%p)",
-                     size, page);
+      } else if (! is_okay_address(page)) {
+          reportMemoryMap();
+          barf("m32_alloc: warning: Allocation of %zd bytes resulted in pages above 4GB (%p)",
+               size, page);
       }
+      SET_PAGE_TYPE(page, FILLED_PAGE);
       page->filled_page.size = alsize + size;
       m32_allocator_push_filled_list(&alloc->unprotected_list, (struct m32_page_t *) page);
       return (char*) page + alsize;
@@ -414,6 +508,8 @@ m32_alloc(struct m32_allocator_t *alloc, size_t size, size_t alignment)
       }
 
       // page can contain the buffer?
+      ASSERT_VALID_PAGE(alloc->pages[i]);
+      ASSERT_PAGE_TYPE(alloc->pages[i], NURSERY_PAGE);
       size_t alsize = ROUND_UP(alloc->pages[i]->current_size, alignment);
       if (size <= pgsz - alsize) {
          void * addr = (char*)alloc->pages[i] + alsize;
@@ -431,6 +527,7 @@ m32_alloc(struct m32_allocator_t *alloc, size_t size, size_t alignment)
 
    // If we haven't found an empty page, flush the most filled one
    if (empty == -1) {
+      SET_PAGE_TYPE(alloc->pages[most_filled], FILLED_PAGE);
       m32_allocator_push_filled_list(&alloc->unprotected_list, alloc->pages[most_filled]);
       alloc->pages[most_filled] = NULL;
       empty = most_filled;
@@ -441,6 +538,7 @@ m32_alloc(struct m32_allocator_t *alloc, size_t size, size_t alignment)
    if (page == NULL) {
       return NULL;
    }
+   SET_PAGE_TYPE(page, NURSERY_PAGE);
    alloc->pages[empty]               = page;
    // Add header size and padding
    alloc->pages[empty]->current_size =


=====================================
rts/linker/MMap.c
=====================================
@@ -0,0 +1,305 @@
+#include "Rts.h"
+
+#include "sm/OSMem.h"
+#include "linker/MMap.h"
+#include "Trace.h"
+#include "ReportMemoryMap.h"
+
+#if RTS_LINKER_USE_MMAP
+#include <sys/mman.h>
+#endif
+
+/* Link objects into the lower 2Gb on x86_64 and AArch64.  GHC assumes the
+ * small memory model on this architecture (see gcc docs,
+ * -mcmodel=small).
+ *
+ * MAP_32BIT not available on OpenBSD/amd64
+ */
+#if defined(MAP_32BIT) && (defined(x86_64_HOST_ARCH) || (defined(aarch64_TARGET_ARCH) || defined(aarch64_HOST_ARCH)))
+#define MAP_LOW_MEM
+#define TRY_MAP_32BIT MAP_32BIT
+#else
+#define TRY_MAP_32BIT 0
+#endif
+
+/* MAP_ANONYMOUS is MAP_ANON on some systems,
+   e.g. OS X (before Sierra), OpenBSD etc */
+#if !defined(MAP_ANONYMOUS) && defined(MAP_ANON)
+#define MAP_ANONYMOUS MAP_ANON
+#endif
+
+/* In order to simplify control flow a bit, some references to mmap-related
+   definitions are blocked off by a C-level if statement rather than a CPP-level
+   #if statement. Since those are dead branches when !RTS_LINKER_USE_MMAP, we
+   just stub out the relevant symbols here
+*/
+#if !RTS_LINKER_USE_MMAP
+#define munmap(x,y) /* nothing */
+#define MAP_ANONYMOUS 0
+#endif
+
+void *mmap_32bit_base = LINKER_LOAD_BASE;
+
+static const char *memoryAccessDescription(MemoryAccess mode)
+{
+  switch (mode) {
+  case MEM_NO_ACCESS:    return "no-access";
+  case MEM_READ_ONLY:    return "read-only";
+  case MEM_READ_WRITE:   return "read-write";
+  case MEM_READ_EXECUTE: return "read-execute";
+  case MEM_READ_WRITE_EXECUTE:
+                         return "read-write-execute";
+  default: barf("invalid MemoryAccess");
+  }
+}
+
+#if defined(mingw32_HOST_OS)
+
+static DWORD
+memoryAccessToProt(MemoryAccess access)
+{
+  switch (access) {
+  case MEM_NO_ACCESS:    return PAGE_NOACCESS;
+  case MEM_READ_ONLY:    return PAGE_READONLY;
+  case MEM_READ_WRITE:   return PAGE_READWRITE;
+  case MEM_READ_EXECUTE: return PAGE_EXECUTE_READ;
+  case MEM_READ_WRITE_EXECUTE:
+                         return PAGE_EXECUTE_READWRITE;
+  default: barf("invalid MemoryAccess");
+  }
+}
+
+//
+// Returns NULL on failure.
+//
+void *
+mmapAnonForLinker (size_t bytes)
+{
+  return VirtualAlloc(NULL, bytes, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
+}
+
+void
+munmapForLinker (void *addr, size_t bytes, const char *caller)
+{
+  if (VirtualFree(addr, 0, MEM_RELEASE) == 0) {
+    sysErrorBelch("munmapForLinker: %s: Failed to unmap %zd bytes at %p",
+                  caller, bytes, addr);
+  }
+}
+
+/**
+ * Change the allowed access modes of a region of memory previously allocated
+ * with mmapAnonForLinker.
+ */
+void
+mprotectForLinker(void *start, size_t len, MemoryAccess mode)
+{
+  DWORD old;
+  if (len == 0) {
+    return;
+  }
+  DWORD prot = memoryAccessToProt(mode);
+
+  if (VirtualProtect(start, len, prot, &old) == 0) {
+    sysErrorBelch("mprotectForLinker: failed to protect %zd bytes at %p as %s",
+                  len, start, memoryAccessDescription(mode));
+    ASSERT(false);
+  }
+}
+
+#elif RTS_LINKER_USE_MMAP
+
+static int
+memoryAccessToProt(MemoryAccess access)
+{
+    switch (access) {
+    case MEM_NO_ACCESS:    return 0;
+    case MEM_READ_ONLY:    return PROT_READ;
+    case MEM_READ_WRITE:   return PROT_READ | PROT_WRITE;
+    case MEM_READ_EXECUTE: return PROT_READ | PROT_EXEC;
+    case MEM_READ_WRITE_EXECUTE:
+                           return PROT_READ | PROT_WRITE | PROT_EXEC;
+    default: barf("invalid MemoryAccess");
+    }
+}
+
+//
+// Returns NULL on failure.
+//
+void *
+mmapForLinker (size_t bytes, MemoryAccess access, uint32_t flags, int fd, int offset)
+{
+   void *map_addr = NULL;
+   void *result;
+   size_t size;
+   uint32_t tryMap32Bit = RtsFlags.MiscFlags.linkerAlwaysPic
+     ? 0
+     : TRY_MAP_32BIT;
+   static uint32_t fixed = 0;
+   int prot = memoryAccessToProt(access);
+
+   IF_DEBUG(linker, debugBelch("mmapForLinker: start\n"));
+   size = roundUpToPage(bytes);
+
+#if defined(MAP_LOW_MEM)
+mmap_again:
+#endif
+
+   if (mmap_32bit_base != NULL) {
+       map_addr = mmap_32bit_base;
+   }
+
+   IF_DEBUG(linker,
+            debugBelch("mmapForLinker: \tprotection %#0x\n", prot));
+   IF_DEBUG(linker,
+            debugBelch("mmapForLinker: \tflags      %#0x\n",
+                       MAP_PRIVATE | tryMap32Bit | fixed | flags));
+   IF_DEBUG(linker,
+            debugBelch("mmapForLinker: \tsize       %#0zx\n", bytes));
+   IF_DEBUG(linker,
+            debugBelch("mmapForLinker: \tmap_addr   %p\n", map_addr));
+
+   result = mmap(map_addr, size, prot,
+                 MAP_PRIVATE|tryMap32Bit|fixed|flags, fd, offset);
+
+   if (result == MAP_FAILED) {
+       reportMemoryMap();
+       sysErrorBelch("mmap %" FMT_Word " bytes at %p",(W_)size,map_addr);
+       errorBelch("Try specifying an address with +RTS -xm<addr> -RTS");
+       return NULL;
+   }
+
+#if defined(MAP_LOW_MEM)
+   if (RtsFlags.MiscFlags.linkerAlwaysPic) {
+       /* make no attempt at mapping low memory if we are assuming PIC */
+   } else if (mmap_32bit_base != NULL) {
+       if (result != map_addr) {
+           if ((W_)result > 0x80000000) {
+               // oops, we were given memory over 2Gb
+               munmap(result,size);
+#if defined(MAP_TRYFIXED)
+               // Some platforms require MAP_FIXED. We use MAP_TRYFIXED since
+               // MAP_FIXED will overwrite existing mappings.
+               fixed = MAP_TRYFIXED;
+               goto mmap_again;
+#else
+               reportMemoryMap();
+               errorBelch("mmapForLinker: failed to mmap() memory below 2Gb; "
+                          "asked for %lu bytes at %p. "
+                          "Try specifying an address with +RTS -xm<addr> -RTS",
+                          size, map_addr);
+               return NULL;
+#endif
+           } else {
+               // hmm, we were given memory somewhere else, but it's
+               // still under 2Gb so we can use it.
+           }
+       }
+   } else {
+       if ((W_)result > 0x80000000) {
+           // oops, we were given memory over 2Gb
+           // ... try allocating memory somewhere else?;
+           debugTrace(DEBUG_linker,
+                      "MAP_32BIT didn't work; gave us %lu bytes at 0x%p",
+                      bytes, result);
+           munmap(result, size);
+
+           // Set a base address and try again... (guess: 1Gb)
+           mmap_32bit_base = (void*)0x40000000;
+           goto mmap_again;
+       }
+   }
+#elif (defined(aarch64_TARGET_ARCH) || defined(aarch64_HOST_ARCH))
+    // for aarch64 we need to make sure we stay within 4GB of the
+    // mmap_32bit_base, and we also do not want to update it.
+    if (result != map_addr) {
+        // upper limit 4GB - size of the object file - 1mb wiggle room.
+        if(llabs((uintptr_t)result - (uintptr_t)&stg_upd_frame_info) > (2<<32) - size - (2<<20)) {
+            // not within range :(
+            debugTrace(DEBUG_linker,
+                        "MAP_32BIT didn't work; gave us %lu bytes at 0x%p",
+                        bytes, result);
+            munmap(result, size);
+            // TODO: some abort/mmap_32bit_base recomputation based on
+            //       if mmap_32bit_base is changed, or still at stg_upd_frame_info
+            goto mmap_again;
+        }
+    }
+#endif
+
+    if (mmap_32bit_base != NULL) {
+       // Next time, ask for memory right after our new mapping to maximize the
+       // chance that we get low memory.
+        mmap_32bit_base = (void*) ((uintptr_t)result + size);
+    }
+
+    IF_DEBUG(linker,
+             debugBelch("mmapForLinker: mapped %" FMT_Word
+                        " bytes starting at %p\n", (W_)size, result));
+    IF_DEBUG(linker,
+             debugBelch("mmapForLinker: done\n"));
+
+    return result;
+}
+
+/*
+ * Map read/write pages in low memory. Returns NULL on failure.
+ */
+void *
+mmapAnonForLinker (size_t bytes)
+{
+  return mmapForLinker (bytes, MEM_READ_WRITE, MAP_ANONYMOUS, -1, 0);
+}
+
+void munmapForLinker (void *addr, size_t bytes, const char *caller)
+{
+  int r = munmap(addr, bytes);
+  if (r == -1) {
+    // Should we abort here?
+    sysErrorBelch("munmap: %s", caller);
+  }
+}
+
+/* Note [Memory protection in the linker]
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ * For many years the linker would simply map all of its memory
+ * with PROT_READ|PROT_WRITE|PROT_EXEC. However operating systems have been
+ * becoming increasingly reluctant to accept this practice (e.g. #17353,
+ * #12657) and for good reason: writable code is ripe for exploitation.
+ *
+ * Consequently mmapForLinker now maps its memory with PROT_READ|PROT_WRITE.
+ * After the linker has finished filling/relocating the mapping it must then
+ * call mprotectForLinker on the sections of the mapping which
+ * contain executable code.
+ *
+ * Note that the m32 allocator handles protection of its allocations. For this
+ * reason the caller to m32_alloc() must tell the allocator whether the
+ * allocation needs to be executable. The caller must then ensure that they
+ * call m32_allocator_flush() after they are finished filling the region, which
+ * will cause the allocator to change the protection bits to
+ * PROT_READ|PROT_EXEC.
+ *
+ */
+
+/*
+ * Mark an portion of a mapping previously reserved by mmapForLinker
+ * as executable (but not writable).
+ */
+void mprotectForLinker(void *start, size_t len, MemoryAccess mode)
+{
+    if (len == 0) {
+      return;
+    }
+    IF_DEBUG(linker,
+             debugBelch("mprotectForLinker: protecting %" FMT_Word
+                        " bytes starting at %p as %s\n",
+                        (W_)len, start, memoryAccessDescription(mode)));
+
+    int prot = memoryAccessToProt(mode);
+
+    if (mprotect(start, len, prot) == -1) {
+        sysErrorBelch("mprotectForLinker: failed to protect %zd bytes at %p as %s",
+                      len, start, memoryAccessDescription(mode));
+    }
+}
+#endif


=====================================
rts/linker/MMap.h
=====================================
@@ -0,0 +1,80 @@
+#pragma once
+
+#include "BeginPrivate.h"
+
+#if defined(aarch64_HOST_ARCH)
+// On AArch64 MAP_32BIT is not available but we are still bound by the small
+// memory model. Consequently we still try using the MAP_LOW_MEM allocation
+// strategy.
+#define MAP_LOW_MEM
+#endif
+
+/*
+ * Note [MAP_LOW_MEM]
+ * ~~~~~~~~~~~~~~~~~~
+ * Due to the small memory model (see above), on x86_64 and AArch64 we have to
+ * map all our non-PIC object files into the low 2Gb of the address space (why
+ * 2Gb and not 4Gb?  Because all addresses must be reachable using a 32-bit
+ * signed PC-relative offset). On x86_64 Linux we can do this using the
+ * MAP_32BIT flag to mmap(), however on other OSs (e.g. *BSD, see #2063, and
+ * also on Linux inside Xen, see #2512), we can't do this.  So on these
+ * systems, we have to pick a base address in the low 2Gb of the address space
+ * and try to allocate memory from there.
+ *
+ * The same holds for aarch64, where the default, even with PIC, model
+ * is 4GB. The linker is free to emit AARCH64_ADR_PREL_PG_HI21
+ * relocations.
+ *
+ * We pick a default address based on the OS, but also make this
+ * configurable via an RTS flag (+RTS -xm)
+ */
+
+#if defined(aarch64_TARGET_ARCH) || defined(aarch64_HOST_ARCH)
+// Try to use stg_upd_frame_info as the base. We need to be within +-4GB of that
+// address, otherwise we violate the aarch64 memory model. Any object we load
+// can potentially reference any of the ones we bake into the binary (and list)
+// in RtsSymbols. Thus we'll need to be within +-4GB of those,
+// stg_upd_frame_info is a good candidate as it's referenced often.
+#define LINKER_LOAD_BASE ((void *) &stg_upd_frame_info)
+#elif defined(x86_64_HOST_ARCH) && defined(mingw32_HOST_OS)
+// On Windows (which now uses high-entropy ASLR by default) we need to ensure
+// that we map code near the executable image. We use stg_upd_frame_info as a
+// proxy for the image location.
+#define LINKER_LOAD_BASE ((void *) &stg_upd_frame_info)
+#elif defined(MAP_32BIT) || DEFAULT_LINKER_ALWAYS_PIC
+// Try to use MAP_32BIT
+#define LINKER_LOAD_BASE ((void *) 0x0)
+#else
+// A guess: 1 GB.
+#define LINKER_LOAD_BASE ((void *) 0x40000000)
+#endif
+
+/** Access modes for mprotectForLinker */
+typedef enum {
+    MEM_NO_ACCESS,
+    MEM_READ_ONLY,
+    MEM_READ_WRITE,
+    MEM_READ_EXECUTE,
+    MEM_READ_WRITE_EXECUTE,
+} MemoryAccess;
+
+extern void *mmap_32bit_base;
+
+// Map read/write anonymous memory.
+void *mmapAnonForLinker (size_t bytes);
+
+// Change protection of previous mapping memory.
+void mprotectForLinker(void *start, size_t len, MemoryAccess mode);
+
+// Release a mapping.
+void munmapForLinker (void *addr, size_t bytes, const char *caller);
+
+#if !defined(mingw32_HOST_OS)
+// Map a file.
+//
+// Note that this not available on Windows since file mapping on Windows is
+// sufficiently different to warrant its own interface.
+void *mmapForLinker (size_t bytes, MemoryAccess prot, uint32_t flags, int fd, int offset);
+#endif
+
+#include "EndPrivate.h"


=====================================
rts/linker/MachO.c
=====================================
@@ -1210,7 +1210,7 @@ ocGetNames_MachO(ObjectCode* oc)
                 unsigned nstubs = numberOfStubsForSection(oc, sec_idx);
                 unsigned stub_space = STUB_SIZE * nstubs;
 
-                void * mem = mmapForLinker(section->size+stub_space, PROT_READ | PROT_WRITE, MAP_ANON, -1, 0);
+                void * mem = mmapForLinker(section->size+stub_space, MEM_READ_WRITE, MAP_ANON, -1, 0);
 
                 if( mem == MAP_FAILED ) {
                     sysErrorBelch("failed to mmap allocated memory to load section %d. "
@@ -1428,7 +1428,7 @@ ocMprotect_MachO( ObjectCode *oc )
         if(segment->size == 0) continue;
 
         if(segment->prot == SEGMENT_PROT_RX) {
-            mmapForLinkerMarkExecutable(segment->start, segment->size);
+            mprotectForLinker(segment->start, segment->size, MEM_READ_EXECUTE);
         }
     }
 
@@ -1443,7 +1443,7 @@ ocMprotect_MachO( ObjectCode *oc )
         if(section->alloc == SECTION_M32) continue;
         switch (section->kind) {
         case SECTIONKIND_CODE_OR_RODATA: {
-            mmapForLinkerMarkExecutable(section->mapped_start, section->mapped_size);
+            mprotectForLinker(section->mapped_start, section->mapped_size, MEM_READ_EXECUTE);
             break;
         }
         default:


=====================================
rts/linker/SymbolExtras.c
=====================================
@@ -10,6 +10,7 @@
  */
 
 #include "LinkerInternals.h"
+#include "linker/MMap.h"
 
 #if defined(NEED_SYMBOL_EXTRAS)
 #if !defined(x86_64_HOST_ARCH) || !defined(mingw32_HOST_OS)
@@ -142,7 +143,7 @@ void ocProtectExtras(ObjectCode* oc)
      * non-executable.
      */
   } else if (USE_CONTIGUOUS_MMAP || RtsFlags.MiscFlags.linkerAlwaysPic) {
-    mmapForLinkerMarkExecutable(oc->symbol_extras, sizeof(SymbolExtra) * oc->n_symbol_extras);
+    mprotectForLinker(oc->symbol_extras, sizeof(SymbolExtra) * oc->n_symbol_extras, MEM_READ_EXECUTE);
   } else {
     /*
      * The symbol extras were allocated via m32. They will be protected when


=====================================
rts/linker/elf_got.c
=====================================
@@ -1,5 +1,6 @@
 #include "Rts.h"
 #include "elf_got.h"
+#include "linker/MMap.h"
 
 #include <string.h>
 


=====================================
rts/rts.cabal.in
=====================================
@@ -475,6 +475,7 @@ library
                Libdw.c
                LibdwPool.c
                Linker.c
+               ReportMemoryMap.c
                Messages.c
                OldARMAtomic.c
                PathUtils.c
@@ -532,6 +533,7 @@ library
                linker/Elf.c
                linker/LoadArchive.c
                linker/M32Alloc.c
+               linker/MMap.c
                linker/MachO.c
                linker/macho/plt.c
                linker/macho/plt_aarch64.c



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/b81cd709df8054b8b98ac05d3b9affcee9a8b840...b361bcb04ff46aaac0e4534b695b715b7c39be98


