[Git][ghc/ghc][wip/romes/rts-linker-direct-symbol-lookup] 3 commits: Refactor loadNativeObj_ELF to _POSIX and delete dl_mutex

Rodrigo Mesquita (@alt-romes) gitlab at gitlab.haskell.org
Mon Apr 1 10:33:56 UTC 2024



Rodrigo Mesquita pushed to branch wip/romes/rts-linker-direct-symbol-lookup at Glasgow Haskell Compiler / GHC


Commits:
4381c150 by Rodrigo Mesquita at 2024-04-01T11:33:40+01:00
Refactor loadNativeObj_ELF to _POSIX and delete dl_mutex

CPP Support loadNativeObj in MachO

- - - - -
7fa524ab by Rodrigo Mesquita at 2024-04-01T11:33:40+01:00
Use symbol cache in internal interpreter too

This commit makes the symbol cache that was used by the external
interpreter available for the internal interpreter too.

This follows from the analysis in #23415 that suggests the internal
interpreter could benefit from this cache too, and that there is no good
reason not to have the cache for it too. It also makes it a bit more
uniform to have the symbol cache range over both the internal and
external interpreter.

This commit also refactors the cache into a function which is used by
both `lookupSymbol` and also by `lookupSymbolInDLL`, extending the
caching logic to `lookupSymbolInDLL` too.

- - - - -
ab309fc8 by Rodrigo Mesquita at 2024-04-01T11:33:40+01:00
Implement lookupSymbolInDLL for ExternalInterp

- - - - -


12 changed files:

- compiler/GHC.hs
- compiler/GHC/Driver/Main.hs
- compiler/GHC/Runtime/Interpreter.hs
- compiler/GHC/Runtime/Interpreter/JS.hs
- compiler/GHC/Runtime/Interpreter/Types.hs
- rts/Linker.c
- rts/LinkerInternals.h
- rts/linker/Elf.c
- rts/linker/Elf.h
- + rts/linker/LoadNativeObjPosix.c
- + rts/linker/LoadNativeObjPosix.h
- rts/rts.cabal


Changes:

=====================================
compiler/GHC.hs
=====================================
@@ -394,6 +394,7 @@ import GHC.Types.Name.Ppr
 import GHC.Types.TypeEnv
 import GHC.Types.BreakInfo
 import GHC.Types.PkgQual
+import GHC.Types.Unique.FM
 
 import GHC.Unit
 import GHC.Unit.Env
@@ -673,6 +674,7 @@ setTopSessionDynFlags :: GhcMonad m => DynFlags -> m ()
 setTopSessionDynFlags dflags = do
   hsc_env <- getSession
   logger  <- getLogger
+  lookup_cache  <- liftIO $ newMVar emptyUFM
 
   -- Interpreter
   interp <- if
@@ -702,7 +704,7 @@ setTopSessionDynFlags dflags = do
             }
          s <- liftIO $ newMVar InterpPending
          loader <- liftIO Loader.uninitializedLoader
-         return (Just (Interp (ExternalInterp (ExtIServ (ExtInterpState conf s))) loader))
+         return (Just (Interp (ExternalInterp (ExtIServ (ExtInterpState conf s))) loader lookup_cache))
 
     -- JavaScript interpreter
     | ArchJavaScript <- platformArch (targetPlatform dflags)
@@ -720,7 +722,7 @@ setTopSessionDynFlags dflags = do
               , jsInterpFinderOpts  = initFinderOpts dflags
               , jsInterpFinderCache = hsc_FC hsc_env
               }
-         return (Just (Interp (ExternalInterp (ExtJS (ExtInterpState cfg s))) loader))
+         return (Just (Interp (ExternalInterp (ExtJS (ExtInterpState cfg s))) loader lookup_cache))
 
     -- Internal interpreter
     | otherwise
@@ -728,7 +730,7 @@ setTopSessionDynFlags dflags = do
 #if defined(HAVE_INTERNAL_INTERPRETER)
      do
       loader <- liftIO Loader.uninitializedLoader
-      return (Just (Interp InternalInterp loader))
+      return (Just (Interp InternalInterp loader lookup_cache))
 #else
       return Nothing
 #endif


=====================================
compiler/GHC/Driver/Main.hs
=====================================
@@ -2665,7 +2665,7 @@ hscCompileCoreExpr' hsc_env srcspan ds_expr = do
 
   case interp of
     -- always generate JS code for the JS interpreter (no bytecode!)
-    Interp (ExternalInterp (ExtJS i)) _ ->
+    Interp (ExternalInterp (ExtJS i)) _ _ ->
       jsCodeGen hsc_env srcspan i this_mod stg_binds_with_deps binding_id
 
     _ -> do


=====================================
compiler/GHC/Runtime/Interpreter.hs
=====================================
@@ -152,22 +152,22 @@ The main pieces are:
   - implementation of Template Haskell (GHCi.TH)
   - a few other things needed to run interpreted code
 
-- top-level iserv directory, containing the codefor the external
-  server.  This is a fairly simple wrapper, most of the functionality
+- top-level iserv directory, containing the code for the external
+  server. This is a fairly simple wrapper, most of the functionality
   is provided by modules in libraries/ghci.
 
 - This module which provides the interface to the server used
   by the rest of GHC.
 
-GHC works with and without -fexternal-interpreter.  With the flag, all
-interpreted code is run by the iserv binary.  Without the flag,
+GHC works with and without -fexternal-interpreter. With the flag, all
+interpreted code is run by the iserv binary. Without the flag,
 interpreted code is run in the same process as GHC.
 
 Things that do not work with -fexternal-interpreter
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 dynCompileExpr cannot work, because we have no way to run code of an
-unknown type in the remote process.  This API fails with an error
+unknown type in the remote process. This API fails with an error
 message if it is used with -fexternal-interpreter.
 
 Other Notes on Remote GHCi
@@ -441,52 +441,71 @@ initObjLinker :: Interp -> IO ()
 initObjLinker interp = interpCmd interp InitLinker
 
 lookupSymbol :: Interp -> FastString -> IO (Maybe (Ptr ()))
-lookupSymbol interp str = case interpInstance interp of
+lookupSymbol interp str = withSymbolCache interp str $
+  case interpInstance interp of
 #if defined(HAVE_INTERNAL_INTERPRETER)
-  InternalInterp -> fmap fromRemotePtr <$> run (LookupSymbol (unpackFS str))
+    InternalInterp -> fmap fromRemotePtr <$> run (LookupSymbol (unpackFS str))
 #endif
-
-  ExternalInterp ext -> case ext of
-    ExtIServ i -> withIServ i $ \inst -> do
-      -- Profiling of GHCi showed a lot of time and allocation spent
-      -- making cross-process LookupSymbol calls, so I added a GHC-side
-      -- cache which sped things up quite a lot.  We have to be careful
-      -- to purge this cache when unloading code though.
-      cache <- readMVar (instLookupSymbolCache inst)
-      case lookupUFM cache str of
-        Just p -> return (Just p)
-        Nothing -> do
-          m <- uninterruptibleMask_ $
-                   sendMessage inst (LookupSymbol (unpackFS str))
-          case m of
-            Nothing -> return Nothing
-            Just r -> do
-              let p        = fromRemotePtr r
-                  cache'   = addToUFM cache str p
-              modifyMVar_ (instLookupSymbolCache inst) (const (pure cache'))
-              return (Just p)
-
-    ExtJS {} -> pprPanic "lookupSymbol not supported by the JS interpreter" (ppr str)
+    ExternalInterp ext -> case ext of
+      ExtIServ i -> withIServ i $ \inst -> fmap fromRemotePtr <$> do
+        uninterruptibleMask_ $
+          sendMessage inst (LookupSymbol (unpackFS str))
+      ExtJS {} -> pprPanic "lookupSymbol not supported by the JS interpreter" (ppr str)
 
 lookupSymbolInDLL :: Interp -> RemotePtr LoadedDLL -> FastString -> IO (Maybe (Ptr ()))
-lookupSymbolInDLL interp dll str = case interpInstance interp of
+lookupSymbolInDLL interp dll str = withSymbolCache interp str $
+  case interpInstance interp of
 #if defined(HAVE_INTERNAL_INTERPRETER)
-  InternalInterp -> fmap fromRemotePtr <$> run (LookupSymbolInDLL dll (unpackFS str))
+    InternalInterp -> fmap fromRemotePtr <$> run (LookupSymbolInDLL dll (unpackFS str))
 #endif
-  ExternalInterp _ -> panic "lookupSymbolInDLL: not implemented for external interpreter" -- FIXME
+    ExternalInterp ext -> case ext of
+      ExtIServ i -> withIServ i $ \inst -> fmap fromRemotePtr <$> do
+        uninterruptibleMask_ $
+          sendMessage inst (LookupSymbolInDLL dll (unpackFS str))
+      ExtJS {} -> pprPanic "lookupSymbol not supported by the JS interpreter" (ppr str)
 
 lookupClosure :: Interp -> String -> IO (Maybe HValueRef)
 lookupClosure interp str =
   interpCmd interp (LookupClosure str)
 
+-- | 'withSymbolCache' tries to find a symbol in the 'interpLookupSymbolCache'
+-- which maps symbols to the address where they are loaded.
+-- When there's a cache hit we simply return the cached address, when there is
+-- a miss we run the action which determines the symbol's address and populate
+-- the cache with the answer.
+withSymbolCache :: Interp
+                -> FastString
+                -- ^ The symbol we are looking up in the cache
+                -> IO (Maybe (Ptr ()))
+                -- ^ An action which determines the address of the symbol we
+                -- are looking up in the cache, which is run if there is a
+                -- cache miss. The result will be cached.
+                -> IO (Maybe (Ptr ()))
+withSymbolCache interp str determine_addr = do
+
+  -- Profiling of GHCi showed a lot of time and allocation spent
+  -- making cross-process LookupSymbol calls, so I added a GHC-side
+  -- cache which sped things up quite a lot. We have to be careful
+  -- to purge this cache when unloading code though.
+  --
+  -- The analysis in #23415 further showed this cache should also benefit the
+  -- internal interpreter's loading times, and needn't be used by the external
+  -- interpreter only.
+  cache <- readMVar (interpLookupSymbolCache interp)
+  case lookupUFM cache str of
+    Just p -> return (Just p)
+    Nothing -> do
+
+      maddr <- determine_addr
+      case maddr of
+        Nothing -> return Nothing
+        Just p -> do
+          let upd_cache cache' = addToUFM cache' str p
+          modifyMVar_ (interpLookupSymbolCache interp) (pure . upd_cache)
+          return (Just p)
+
 purgeLookupSymbolCache :: Interp -> IO ()
-purgeLookupSymbolCache interp = case interpInstance interp of
-#if defined(HAVE_INTERNAL_INTERPRETER)
-  InternalInterp -> pure ()
-#endif
-  ExternalInterp ext -> withExtInterpMaybe ext $ \case
-    Nothing   -> pure () -- interpreter stopped, nothing to do
-    Just inst -> modifyMVar_ (instLookupSymbolCache inst) (const (pure emptyUFM))
+purgeLookupSymbolCache interp = putMVar (interpLookupSymbolCache interp) emptyUFM
 
 -- | loadDLL loads a dynamic library using the OS's native linker
 -- (i.e. dlopen() on Unix, LoadLibrary() on Windows).  It takes either
@@ -552,11 +571,9 @@ spawnIServ conf = do
                   }
 
   pending_frees <- newMVar []
-  lookup_cache  <- newMVar emptyUFM
   let inst = ExtInterpInstance
         { instProcess           = process
         , instPendingFrees      = pending_frees
-        , instLookupSymbolCache = lookup_cache
         , instExtra             = ()
         }
   pure inst


=====================================
compiler/GHC/Runtime/Interpreter/JS.hs
=====================================
@@ -41,7 +41,6 @@ import GHC.Utils.Panic
 import GHC.Utils.Error (logInfo)
 import GHC.Utils.Outputable (text)
 import GHC.Data.FastString
-import GHC.Types.Unique.FM
 
 import Control.Concurrent
 import Control.Monad
@@ -178,11 +177,9 @@ spawnJSInterp cfg = do
         }
 
   pending_frees <- newMVar []
-  lookup_cache  <- newMVar emptyUFM
   let inst = ExtInterpInstance
         { instProcess           = proc
         , instPendingFrees      = pending_frees
-        , instLookupSymbolCache = lookup_cache
         , instExtra             = extra
         }
 


=====================================
compiler/GHC/Runtime/Interpreter/Types.hs
=====================================
@@ -51,6 +51,9 @@ data Interp = Interp
 
   , interpLoader   :: !Loader
       -- ^ Interpreter loader
+
+  , interpLookupSymbolCache :: !(MVar (UniqFM FastString (Ptr ())))
+      -- ^ LookupSymbol cache
   }
 
 data InterpInstance
@@ -108,9 +111,6 @@ data ExtInterpInstance c = ExtInterpInstance
       -- Finalizers for ForeignRefs can append values to this list
       -- asynchronously.
 
-  , instLookupSymbolCache :: !(MVar (UniqFM FastString (Ptr ())))
-      -- ^ LookupSymbol cache
-
   , instExtra             :: !c
       -- ^ Instance specific extra fields
   }


=====================================
rts/Linker.c
=====================================
@@ -77,6 +77,10 @@
 #  include <mach-o/fat.h>
 #endif
 
+#if defined(OBJFORMAT_ELF) || defined(OBJFORMAT_MACHO)
+#  include "linker/LoadNativeObjPosix.h"
+#endif
+
 #if defined(dragonfly_HOST_OS)
 #include <sys/tls.h>
 #endif
@@ -130,7 +134,7 @@ extern void iconv();
    - Indexing (e.g. ocVerifyImage and ocGetNames)
    - Initialization (e.g. ocResolve)
    - RunInit (e.g. ocRunInit)
-   - Lookup (e.g. lookupSymbol)
+   - Lookup (e.g. lookupSymbol/lookupSymbolInDLL)
 
    This is to enable lazy loading of symbols. Eager loading is problematic
    as it means that all symbols must be available, even those which we will
@@ -419,9 +423,6 @@ static int linker_init_done = 0 ;
 static void *dl_prog_handle;
 static regex_t re_invalid;
 static regex_t re_realso;
-#if defined(THREADED_RTS)
-Mutex dl_mutex; // mutex to protect dlopen/dlerror critical section
-#endif
 #endif
 
 void initLinker (void)
@@ -455,9 +456,6 @@ initLinker_ (int retain_cafs)
 
 #if defined(THREADED_RTS)
     initMutex(&linker_mutex);
-#if defined(OBJFORMAT_ELF) || defined(OBJFORMAT_MACHO)
-    initMutex(&dl_mutex);
-#endif
 #endif
 
     symhash = allocStrHashTable();
@@ -520,9 +518,6 @@ exitLinker( void ) {
    if (linker_init_done == 1) {
       regfree(&re_invalid);
       regfree(&re_realso);
-#if defined(THREADED_RTS)
-      closeMutex(&dl_mutex);
-#endif
    }
 #endif
    if (linker_init_done == 1) {
@@ -578,8 +573,8 @@ static void *
 internal_dlsym(const char *symbol) {
     void *v;
 
-    // We acquire dl_mutex as concurrent dl* calls may alter dlerror
-    ACQUIRE_LOCK(&dl_mutex);
+    // concurrent dl* calls may alter dlerror
+    ASSERT_LOCK_HELD(&linker_mutex);
 
     // clears dlerror
     dlerror();
@@ -587,7 +582,6 @@ internal_dlsym(const char *symbol) {
     // look in program first
     v = dlsym(dl_prog_handle, symbol);
     if (dlerror() == NULL) {
-        RELEASE_LOCK(&dl_mutex);
         IF_DEBUG(linker, debugBelch("internal_dlsym: found symbol '%s' in program\n", symbol));
         return v;
     }
@@ -597,12 +591,10 @@ internal_dlsym(const char *symbol) {
           v = dlsym(nc->dlopen_handle, symbol);
           if (dlerror() == NULL) {
             IF_DEBUG(linker, debugBelch("internal_dlsym: found symbol '%s' in shared object\n", symbol));
-            RELEASE_LOCK(&dl_mutex);
             return v;
           }
         }
     }
-    RELEASE_LOCK(&dl_mutex);
 
     IF_DEBUG(linker, debugBelch("internal_dlsym: looking for symbol '%s' in special cases\n", symbol));
 #   define SPECIAL_SYMBOL(sym) \
@@ -645,14 +637,13 @@ internal_dlsym(const char *symbol) {
 
 void *lookupSymbolInNativeObj(void *handle, const char *symbol_name)
 {
+    ASSERT_LOCK_HELD(&linker_mutex);
+
 #if defined(OBJFORMAT_MACHO)
     CHECK(symbol_name[0] == '_');
     symbol_name = symbol_name+1;
 #endif
-
-    ACQUIRE_LOCK(&dl_mutex); // dlsym alters dlerror
     void *result = dlsym(handle, symbol_name);
-    RELEASE_LOCK(&dl_mutex);
     return result;
 }
 #  endif
@@ -1101,9 +1092,10 @@ void freeObjectCode (ObjectCode *oc)
 
     if (oc->type == DYNAMIC_OBJECT) {
 #if defined(OBJFORMAT_ELF)
-        ACQUIRE_LOCK(&dl_mutex);
-        freeNativeCode_ELF(oc);
-        RELEASE_LOCK(&dl_mutex);
+        // ROMES:TODO: This previously acquired dl_mutex. Check that acquiring linker_mutex here is fine.
+        ACQUIRE_LOCK(&linker_mutex);
+        freeNativeCode_POSIX(oc);
+        RELEASE_LOCK(&linker_mutex);
 #else
         barf("freeObjectCode: This shouldn't happen");
 #endif
@@ -1868,73 +1860,20 @@ addSection (Section *s, SectionKind kind, SectionAlloc alloc,
 
 #define UNUSED(x) (void)(x)
 
-#if defined(OBJFORMAT_ELF)
+#if defined(OBJFORMAT_ELF) || defined(OBJFORMAT_MACHO)
 void * loadNativeObj (pathchar *path, char **errmsg)
 {
    IF_DEBUG(linker, debugBelch("loadNativeObj: path = '%s'\n", path));
    ACQUIRE_LOCK(&linker_mutex);
-   void *r = loadNativeObj_ELF(path, errmsg);
+   void *r = loadNativeObj_POSIX(path, errmsg);
 
-   if (r) {
-     RELEASE_LOCK(&linker_mutex);
-     return r;
-   }
-
-   // GHC #2615
-   // On some systems (e.g., Gentoo Linux) dynamic files (e.g. libc.so)
-   // contain linker scripts rather than ELF-format object code. This
-   // code handles the situation by recognizing the real object code
-   // file name given in the linker script.
-   //
-   // If an "invalid ELF header" error occurs, it is assumed that the
-   // .so file contains a linker script instead of ELF object code.
-   // In this case, the code looks for the GROUP ( ... ) linker
-   // directive. If one is found, the first file name inside the
-   // parentheses is treated as the name of a dynamic library and the
-   // code attempts to dlopen that file. If this is also unsuccessful,
-   // an error message is returned.
-
-#define NMATCH 5
-   regmatch_t match[NMATCH];
-   FILE* fp;
-   size_t match_length;
-#define MAXLINE 1000
-   char line[MAXLINE];
-   int result;
-
-   // see if the error message is due to an invalid ELF header
-   IF_DEBUG(linker, debugBelch("errmsg = '%s'\n", *errmsg));
-   result = regexec(&re_invalid, *errmsg, (size_t) NMATCH, match, 0);
-   IF_DEBUG(linker, debugBelch("result = %i\n", result));
-   if (result == 0) {
-      // success -- try to read the named file as a linker script
-      match_length = (size_t) stg_min((match[1].rm_eo - match[1].rm_so),
-                                 MAXLINE-1);
-      strncpy(line, (*errmsg+(match[1].rm_so)),match_length);
-      line[match_length] = '\0'; // make sure string is null-terminated
-      IF_DEBUG(linker, debugBelch("file name = '%s'\n", line));
-      if ((fp = __rts_fopen(line, "r")) == NULL) {
-         RELEASE_LOCK(&linker_mutex);
-         // return original error if open fails
-         return NULL;
-      }
-      // try to find a GROUP or INPUT ( ... ) command
-      while (fgets(line, MAXLINE, fp) != NULL) {
-         IF_DEBUG(linker, debugBelch("input line = %s", line));
-         if (regexec(&re_realso, line, (size_t) NMATCH, match, 0) == 0) {
-            // success -- try to dlopen the first named file
-            IF_DEBUG(linker, debugBelch("match%s\n",""));
-            line[match[2].rm_eo] = '\0';
-            stgFree((void*)*errmsg); // Free old message before creating new one
-            r = loadNativeObj_ELF(line+match[2].rm_so, errmsg);
-            break;
-         }
-         // if control reaches here, no GROUP or INPUT ( ... ) directive
-         // was found and the original error message is returned to the
-         // caller
-      }
-      fclose(fp);
+#if defined(OBJFORMAT_ELF)
+   if (!r) {
+       // Check if native object may be a linker script and try loading a native
+       // object from it
+       r = loadNativeObjFromLinkerScript_ELF(errmsg);
    }
+#endif
 
    RELEASE_LOCK(&linker_mutex);
    return r;
@@ -1982,7 +1921,7 @@ static HsInt unloadNativeObj_(void *handle)
     if (unloadedAnyObj) {
         return 1;
     } else {
-        errorBelch("unloadObjNativeObj_ELF: can't find `%p' to unload", handle);
+        errorBelch("unloadObjNativeObj_: can't find `%p' to unload", handle);
         return 0;
     }
 }


=====================================
rts/LinkerInternals.h
=====================================
@@ -412,10 +412,6 @@ extern Elf_Word shndx_table_uninit_label;
 
 #if defined(THREADED_RTS)
 extern Mutex linker_mutex;
-
-#if defined(OBJFORMAT_ELF) || defined(OBJFORMAT_MACHO)
-extern Mutex dl_mutex;
-#endif
 #endif /* THREADED_RTS */
 
 /* Type of an initializer */


=====================================
rts/linker/Elf.c
=====================================
@@ -28,10 +28,13 @@
 #include "linker/util.h"
 #include "linker/elf_util.h"
 
+#include <fs_rts.h>
 #include <link.h>
 #include <stdlib.h>
 #include <unistd.h>
 #include <string.h>
+#include <regex.h>    // regex is already used by dlopen() so this is OK
+                      // to use here without requiring an additional lib
 #if defined(HAVE_DLFCN_H)
 #include <dlfcn.h>
 #endif
@@ -2069,191 +2072,6 @@ int ocRunFini_ELF( ObjectCode *oc )
     return true;
 }
 
-/*
- * Shared object loading
- */
-
-#if defined(HAVE_DLINFO)
-struct piterate_cb_info {
-  ObjectCode *nc;
-  void *l_addr;   /* base virtual address of the loaded code */
-};
-
-static int loadNativeObjCb_(struct dl_phdr_info *info,
-    size_t _size STG_UNUSED, void *data) {
-  struct piterate_cb_info *s = (struct piterate_cb_info *) data;
-
-  // This logic mimicks _dl_addr_inside_object from glibc
-  // For reference:
-  // int
-  // internal_function
-  // _dl_addr_inside_object (struct link_map *l, const ElfW(Addr) addr)
-  // {
-  //   int n = l->l_phnum;
-  //   const ElfW(Addr) reladdr = addr - l->l_addr;
-  //
-  //   while (--n >= 0)
-  //     if (l->l_phdr[n].p_type == PT_LOAD
-  //         && reladdr - l->l_phdr[n].p_vaddr >= 0
-  //         && reladdr - l->l_phdr[n].p_vaddr < l->l_phdr[n].p_memsz)
-  //       return 1;
-  //   return 0;
-  // }
-
-  if ((void*) info->dlpi_addr == s->l_addr) {
-    int n = info->dlpi_phnum;
-    while (--n >= 0) {
-      if (info->dlpi_phdr[n].p_type == PT_LOAD) {
-        NativeCodeRange* ncr =
-          stgMallocBytes(sizeof(NativeCodeRange), "loadNativeObjCb_");
-        ncr->start = (void*) ((char*) s->l_addr + info->dlpi_phdr[n].p_vaddr);
-        ncr->end = (void*) ((char*) ncr->start + info->dlpi_phdr[n].p_memsz);
-
-        ncr->next = s->nc->nc_ranges;
-        s->nc->nc_ranges = ncr;
-      }
-    }
-  }
-  return 0;
-}
-#endif /* defined(HAVE_DLINFO) */
-
-static void copyErrmsg(char** errmsg_dest, char* errmsg) {
-  if (errmsg == NULL) errmsg = "loadNativeObj_ELF: unknown error";
-  *errmsg_dest = stgMallocBytes(strlen(errmsg)+1, "loadNativeObj_ELF");
-  strcpy(*errmsg_dest, errmsg);
-}
-
-// need dl_mutex
-void freeNativeCode_ELF (ObjectCode *nc) {
-  dlclose(nc->dlopen_handle);
-
-  NativeCodeRange *ncr = nc->nc_ranges;
-  while (ncr) {
-    NativeCodeRange* last_ncr = ncr;
-    ncr = ncr->next;
-    stgFree(last_ncr);
-  }
-}
-
-void * loadNativeObj_ELF (pathchar *path, char **errmsg)
-{
-   ObjectCode* nc;
-   void *hdl, *retval;
-
-   IF_DEBUG(linker, debugBelch("loadNativeObj_ELF %" PATH_FMT "\n", path));
-
-   retval = NULL;
-   ACQUIRE_LOCK(&dl_mutex);
-
-
-   /* If we load the same object multiple times, just return the
-    * already-loaded handle. Note that this is broken if unloadNativeObj
-    * is used, as we don’t do any reference counting; see #24345.
-    */
-   ObjectCode *existing_oc = lookupObjectByPath(path);
-   if (existing_oc && existing_oc->status != OBJECT_UNLOADED) {
-     if (existing_oc->type == DYNAMIC_OBJECT) {
-       retval = existing_oc->dlopen_handle;
-       goto success;
-     }
-     copyErrmsg(errmsg, "loadNativeObj_ELF: already loaded as non-dynamic object");
-     goto dlopen_fail;
-   }
-
-   nc = mkOc(DYNAMIC_OBJECT, path, NULL, 0, false, NULL, 0);
-
-   foreignExportsLoadingObject(nc);
-
-   // When dlopen() loads a profiled dynamic library, it calls the
-   // ctors which will call registerCcsList() to append the defined
-   // CostCentreStacks to CCS_LIST. This execution path starting from
-   // addDLL() was only protected by dl_mutex previously. However,
-   // another thread may be doing other things with the RTS linker
-   // that transitively calls refreshProfilingCCSs() which also
-   // accesses CCS_LIST, and those execution paths are protected by
-   // linker_mutex. So there's a risk of data race that may lead to
-   // segfaults (#24423), and we need to ensure the ctors are also
-   // protected by ccs_mutex.
-#if defined(PROFILING)
-   ACQUIRE_LOCK(&ccs_mutex);
-#endif
-
-   // We want to use RTLD_NOW rather than RTLD_LAZY because in the case that
-   // DLINFO is available we want to learn eagerly about all symbols, however,
-   // there is no ldinfo on macos in which case we prefer using RTLD_LAZY.
-   // ROMES:TODO: ^
-   hdl = dlopen(path, RTLD_NOW|RTLD_LOCAL); /* see Note [RTLD_LOCAL] */
-   nc->dlopen_handle = hdl;
-   nc->status = OBJECT_READY;
-
-#if defined(PROFILING)
-   RELEASE_LOCK(&ccs_mutex);
-#endif
-
-   foreignExportsFinishedLoadingObject();
-
-   if (hdl == NULL) {
-     /* dlopen failed; save the message in errmsg */
-     copyErrmsg(errmsg, dlerror());
-     goto dlopen_fail;
-   }
-
-#if defined(HAVE_DLINFO)
-   struct link_map *map;
-   if (dlinfo(hdl, RTLD_DI_LINKMAP, &map) == -1) {
-     /* dlinfo failed; save the message in errmsg */
-     copyErrmsg(errmsg, dlerror());
-     goto dlinfo_fail;
-   }
-
-   hdl = NULL; // pass handle ownership to nc
-
-   struct piterate_cb_info piterate_info = {
-     .nc = nc,
-     .l_addr = (void *) map->l_addr
-   };
-   dl_iterate_phdr(loadNativeObjCb_, &piterate_info);
-   if (!nc->nc_ranges) {
-     copyErrmsg(errmsg, "dl_iterate_phdr failed to find obj");
-     goto dl_iterate_phdr_fail;
-   }
-   nc->unloadable = true;
-#else
-   nc->nc_ranges = NULL;
-   nc->unloadable = false;
-#endif /* defined (HAVE_DLINFO) */
-
-   insertOCSectionIndices(nc);
-
-   nc->next_loaded_object = loaded_objects;
-   loaded_objects = nc;
-
-   retval = nc->dlopen_handle;
-
-#if defined(PROFILING)
-  // collect any new cost centres that were defined in the loaded object.
-  refreshProfilingCCSs();
-#endif
-
-   goto success;
-
-dl_iterate_phdr_fail:
-   // already have dl_mutex
-   freeNativeCode_ELF(nc);
-dlinfo_fail:
-   if (hdl) dlclose(hdl);
-dlopen_fail:
-success:
-
-   RELEASE_LOCK(&dl_mutex);
-
-   IF_DEBUG(linker, debugBelch("loadNativeObj_ELF result=%p\n", retval));
-
-   return retval;
-}
-
-
 /*
  * PowerPC & X86_64 ELF specifics
  */
@@ -2303,4 +2121,73 @@ int ocAllocateExtras_ELF( ObjectCode *oc )
 
 #endif /* NEED_SYMBOL_EXTRAS */
 
+extern regex_t re_invalid;
+extern regex_t re_realso;
+
+void * loadNativeObjFromLinkerScript_ELF(char **errmsg)
+{
+   // GHC #2615
+   // On some systems (e.g., Gentoo Linux) dynamic files (e.g. libc.so)
+   // contain linker scripts rather than ELF-format object code. This
+   // code handles the situation by recognizing the real object code
+   // file name given in the linker script.
+   //
+   // If an "invalid ELF header" error occurs, it is assumed that the
+   // .so file contains a linker script instead of ELF object code.
+   // In this case, the code looks for the GROUP ( ... ) linker
+   // directive. If one is found, the first file name inside the
+   // parentheses is treated as the name of a dynamic library and the
+   // code attempts to dlopen that file. If this is also unsuccessful,
+   // an error message is returned.
+
+#define NMATCH 5
+   regmatch_t match[NMATCH];
+   FILE* fp;
+   size_t match_length;
+#define MAXLINE 1000
+   char line[MAXLINE];
+   int result;
+   void* r = NULL;
+
+   ASSERT_LOCK_HELD(&linker_mutex)
+
+   // see if the error message is due to an invalid ELF header
+   IF_DEBUG(linker, debugBelch("errmsg = '%s'\n", *errmsg));
+   result = regexec(&re_invalid, *errmsg, (size_t) NMATCH, match, 0);
+   IF_DEBUG(linker, debugBelch("result = %i\n", result));
+   if (result == 0) {
+      // success -- try to read the named file as a linker script
+      match_length = (size_t) stg_min((match[1].rm_eo - match[1].rm_so),
+                                 MAXLINE-1);
+      strncpy(line, (*errmsg+(match[1].rm_so)),match_length);
+      line[match_length] = '\0'; // make sure string is null-terminated
+      IF_DEBUG(linker, debugBelch("file name = '%s'\n", line));
+      if ((fp = __rts_fopen(line, "r")) == NULL) {
+         // return original error if open fails
+         return NULL;
+      }
+      // try to find a GROUP or INPUT ( ... ) command
+      while (fgets(line, MAXLINE, fp) != NULL) {
+         IF_DEBUG(linker, debugBelch("input line = %s", line));
+         if (regexec(&re_realso, line, (size_t) NMATCH, match, 0) == 0) {
+            // success -- try to dlopen the first named file
+            IF_DEBUG(linker, debugBelch("match%s\n",""));
+            line[match[2].rm_eo] = '\0';
+            stgFree((void*)*errmsg); // Free old message before creating new one
+            // ROMES:TODO: NO GOOD, this is almost recursive? no, as long we
+            // move the loadNativeObj_ELF to a shared impl
+            r = loadNativeObj_POSIX(line+match[2].rm_so, errmsg);
+            break;
+         }
+         // if control reaches here, no GROUP or INPUT ( ... ) directive
+         // was found and the original error message is returned to the
+         // caller
+      }
+      fclose(fp);
+   }
+
+   return r;
+}
+
+
 #endif /* elf */


=====================================
rts/linker/Elf.h
=====================================
@@ -14,7 +14,6 @@ int ocResolve_ELF        ( ObjectCode* oc );
 int ocRunInit_ELF        ( ObjectCode* oc );
 int ocRunFini_ELF        ( ObjectCode* oc );
 int ocAllocateExtras_ELF ( ObjectCode *oc );
-void freeNativeCode_ELF  ( ObjectCode *nc );
-void *loadNativeObj_ELF  ( pathchar *path, char **errmsg );
+void *loadNativeObjFromLinkerScript_ELF( char **errmsg );
 
 #include "EndPrivate.h"


=====================================
rts/linker/LoadNativeObjPosix.c
=====================================
@@ -0,0 +1,211 @@
+#include "CheckUnload.h"
+#include "ForeignExports.h"
+#include "LinkerInternals.h"
+#include "Rts.h"
+#include "RtsUtils.h"
+#include "Profiling.h"
+
+#include "linker/LoadNativeObjPosix.h"
+
+#if defined(HAVE_DLFCN_H)
+#include <dlfcn.h>
+#endif
+
+#if defined(HAVE_DLINFO)
+#include <link.h>
+#endif
+
+#include <string.h>
+
+/*
+ * Shared object loading
+ */
+
+#if defined(HAVE_DLINFO)
+struct piterate_cb_info {
+  ObjectCode *nc;
+  void *l_addr;   /* base virtual address of the loaded code */
+};
+
+static int loadNativeObjCb_(struct dl_phdr_info *info,
+    size_t _size STG_UNUSED, void *data) {
+  struct piterate_cb_info *s = (struct piterate_cb_info *) data;
+
+  // This logic mimicks _dl_addr_inside_object from glibc
+  // For reference:
+  // int
+  // internal_function
+  // _dl_addr_inside_object (struct link_map *l, const ElfW(Addr) addr)
+  // {
+  //   int n = l->l_phnum;
+  //   const ElfW(Addr) reladdr = addr - l->l_addr;
+  //
+  //   while (--n >= 0)
+  //     if (l->l_phdr[n].p_type == PT_LOAD
+  //         && reladdr - l->l_phdr[n].p_vaddr >= 0
+  //         && reladdr - l->l_phdr[n].p_vaddr < l->l_phdr[n].p_memsz)
+  //       return 1;
+  //   return 0;
+  // }
+
+  if ((void*) info->dlpi_addr == s->l_addr) {
+    int n = info->dlpi_phnum;
+    while (--n >= 0) {
+      if (info->dlpi_phdr[n].p_type == PT_LOAD) {
+        NativeCodeRange* ncr =
+          stgMallocBytes(sizeof(NativeCodeRange), "loadNativeObjCb_");
+        ncr->start = (void*) ((char*) s->l_addr + info->dlpi_phdr[n].p_vaddr);
+        ncr->end = (void*) ((char*) ncr->start + info->dlpi_phdr[n].p_memsz);
+
+        ncr->next = s->nc->nc_ranges;
+        s->nc->nc_ranges = ncr;
+      }
+    }
+  }
+  return 0;
+}
+#endif /* defined(HAVE_DLINFO) */
+
+static void copyErrmsg(char** errmsg_dest, char* errmsg) {
+  if (errmsg == NULL) errmsg = "loadNativeObj_POSIX: unknown error";
+  *errmsg_dest = stgMallocBytes(strlen(errmsg)+1, "loadNativeObj_POSIX");
+  strcpy(*errmsg_dest, errmsg);
+}
+
+void freeNativeCode_POSIX (ObjectCode *nc) {
+  ASSERT_LOCK_HELD(&linker_mutex);
+
+  dlclose(nc->dlopen_handle);
+
+  NativeCodeRange *ncr = nc->nc_ranges;
+  while (ncr) {
+    NativeCodeRange* last_ncr = ncr;
+    ncr = ncr->next;
+    stgFree(last_ncr);
+  }
+}
+
+void * loadNativeObj_POSIX (pathchar *path, char **errmsg)
+{
+   ObjectCode* nc;
+   void *hdl, *retval;
+
+   ASSERT_LOCK_HELD(&linker_mutex);
+
+   IF_DEBUG(linker, debugBelch("loadNativeObj_POSIX %" PATH_FMT "\n", path));
+
+   retval = NULL;
+
+
+   /* If we load the same object multiple times, just return the
+    * already-loaded handle. Note that this is broken if unloadNativeObj
+    * is used, as we don’t do any reference counting; see #24345.
+    */
+   ObjectCode *existing_oc = lookupObjectByPath(path);
+   if (existing_oc && existing_oc->status != OBJECT_UNLOADED) {
+     if (existing_oc->type == DYNAMIC_OBJECT) {
+       retval = existing_oc->dlopen_handle;
+       goto success;
+     }
+     copyErrmsg(errmsg, "loadNativeObj_POSIX: already loaded as non-dynamic object");
+     goto dlopen_fail;
+   }
+
+   nc = mkOc(DYNAMIC_OBJECT, path, NULL, 0, false, NULL, 0);
+
+   foreignExportsLoadingObject(nc);
+
+   // When dlopen() loads a profiled dynamic library, it calls the ctors which
+   // will call registerCcsList() to append the defined CostCentreStacks to
+   // CCS_LIST. However, another thread may be doing other things with the RTS
+   // linker that transitively calls refreshProfilingCCSs() which also accesses
+   // CCS_LIST. So there's a risk of data race that may lead to segfaults
+   // (#24423), and we need to ensure the ctors are also protected by
+   // ccs_mutex.
+#if defined(PROFILING)
+   ACQUIRE_LOCK(&ccs_mutex);
+#endif
+
+   // If we HAVE_DLINFO, we use RTLD_NOW rather than RTLD_LAZY because we want
+   // to learn eagerly about all external functions. Otherwise, there is no
+   // additional advantage to being eager, so it is better to be lazy and only bind
+   // functions when needed for better performance.
+   int dlopen_mode;
+#if defined(HAVE_DLINFO)
+   dlopen_mode = RTLD_NOW;
+#else
+   dlopen_mode = RTLD_LAZY;
+#endif
+
+   hdl = dlopen(path, dlopen_mode|RTLD_LOCAL); /* see Note [RTLD_LOCAL] */
+   nc->dlopen_handle = hdl;
+   nc->status = OBJECT_READY;
+
+#if defined(PROFILING)
+   RELEASE_LOCK(&ccs_mutex);
+#endif
+
+   foreignExportsFinishedLoadingObject();
+
+   if (hdl == NULL) {
+     /* dlopen failed; save the message in errmsg */
+     copyErrmsg(errmsg, dlerror());
+     goto dlopen_fail;
+   }
+
+#if defined(HAVE_DLINFO)
+   struct link_map *map;
+   if (dlinfo(hdl, RTLD_DI_LINKMAP, &map) == -1) {
+     /* dlinfo failed; save the message in errmsg */
+     copyErrmsg(errmsg, dlerror());
+     goto dlinfo_fail;
+   }
+
+   hdl = NULL; // pass handle ownership to nc
+
+   struct piterate_cb_info piterate_info = {
+     .nc = nc,
+     .l_addr = (void *) map->l_addr
+   };
+   dl_iterate_phdr(loadNativeObjCb_, &piterate_info);
+   if (!nc->nc_ranges) {
+     copyErrmsg(errmsg, "dl_iterate_phdr failed to find obj");
+     goto dl_iterate_phdr_fail;
+   }
+   nc->unloadable = true;
+#else
+   nc->nc_ranges = NULL;
+   nc->unloadable = false;
+#endif /* defined (HAVE_DLINFO) */
+
+   insertOCSectionIndices(nc);
+
+   nc->next_loaded_object = loaded_objects;
+   loaded_objects = nc;
+
+   retval = nc->dlopen_handle;
+
+#if defined(PROFILING)
+  // collect any new cost centres that were defined in the loaded object.
+  refreshProfilingCCSs();
+#endif
+
+   goto success;
+
+#if defined(HAVE_DLINFO)
+dl_iterate_phdr_fail:
+#endif
+   freeNativeCode_POSIX(nc);
+#if defined(HAVE_DLINFO)
+dlinfo_fail:
+#endif
+   if (hdl) dlclose(hdl);
+dlopen_fail:
+success:
+
+   IF_DEBUG(linker, debugBelch("loadNativeObj_POSIX result=%p\n", retval));
+
+   return retval;
+}
+
+


=====================================
rts/linker/LoadNativeObjPosix.h
=====================================
@@ -0,0 +1,11 @@
+#pragma once
+
+#include "Rts.h"
+#include "LinkerInternals.h"
+
+#include "BeginPrivate.h"
+
+void freeNativeCode_POSIX  ( ObjectCode *nc );
+void *loadNativeObj_POSIX  ( pathchar *path, char **errmsg );
+
+#include "EndPrivate.h"


=====================================
rts/rts.cabal
=====================================
@@ -458,6 +458,7 @@ library
                  linker/Elf.c
                  linker/InitFini.c
                  linker/LoadArchive.c
+                 linker/LoadNativeObjPosix.c
                  linker/M32Alloc.c
                  linker/MMap.c
                  linker/MachO.c



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/eb6a101f293d4d2cfb976a1a37cb32b32722f26e...ab309fc8032173bcfa993778a0cc73e0a34e641b

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/eb6a101f293d4d2cfb976a1a37cb32b32722f26e...ab309fc8032173bcfa993778a0cc73e0a34e641b
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20240401/ab92a453/attachment-0001.html>


More information about the ghc-commits mailing list