[Git][ghc/ghc][master] Link bytecode from interface-stored core bindings in oneshot mode

Marge Bot (@marge-bot) gitlab at gitlab.haskell.org
Fri Sep 27 10:12:45 UTC 2024



Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC


Commits:
2bb1e8df by Cheng Shao at 2024-09-27T06:11:35-04:00
Link bytecode from interface-stored core bindings in oneshot mode

!13042

Part of #T25090

If the flag `-fprefer-byte-code` is given when compiling a module
containing TH, GHC will use Core bindings stored in interfaces to
compile and link bytecode for splices.

This was only implemented for `--make` mode initially, so this commit
adds the same mechanism to oneshot mode (`-c`).

When an interface is loaded into the EPS in `loadInterface` that has
dehydrated Core bindings, an entry is added to the new field
`eps_iface_bytecode`, containing an IO action that produces a bytecode
`Linkable`, lazily processing the `mi_extra_decls` by calling
`loadIfaceByteCode`.

When Template Haskell dependencies are resolved in `getLinkDeps`, this
action is looked up after loading a module's interface.
If it exists, the action is evaluated and the bytecode is added to the
set of `Linkable`s used for execution of the splice; otherwise it falls
back on the traditional object file.

Metric Decrease:
    MultiLayerModules
    T13701

- - - - -


21 changed files:

- compiler/GHC/CoreToIface.hs
- compiler/GHC/Driver/CodeOutput.hs
- compiler/GHC/Driver/Main.hs
- + compiler/GHC/Driver/Main.hs-boot
- compiler/GHC/Driver/Make.hs
- compiler/GHC/Iface/Load.hs
- compiler/GHC/IfaceToCore.hs
- compiler/GHC/Linker/Deps.hs
- compiler/GHC/Linker/Loader.hs
- compiler/GHC/Unit/External.hs
- compiler/GHC/Unit/Finder.hs
- compiler/GHC/Unit/Module/WholeCoreBindings.hs
- + testsuite/tests/bytecode/T25090/A.hs
- + testsuite/tests/bytecode/T25090/B.hs
- + testsuite/tests/bytecode/T25090/C.hs
- + testsuite/tests/bytecode/T25090/C.hs-boot
- + testsuite/tests/bytecode/T25090/D.hs
- + testsuite/tests/bytecode/T25090/Makefile
- + testsuite/tests/bytecode/T25090/T25090-debug.stderr
- + testsuite/tests/bytecode/T25090/T25090.stdout
- + testsuite/tests/bytecode/T25090/all.T


Changes:

=====================================
compiler/GHC/CoreToIface.hs
=====================================
@@ -599,7 +599,7 @@ toIfaceTopBind b =
           in (top_bndr, rhs')
 
         -- The sharing behaviour is currently disabled due to #22807, and relies on
-        -- finished #220056 to be re-enabled.
+        -- finished #20056 to be re-enabled.
         disabledDueTo22807 = True
 
         already_has_unfolding b = not disabledDueTo22807
@@ -771,8 +771,8 @@ outside of the hs-boot loop.
 Note [Interface File with Core: Sharing RHSs]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-IMPORTANT: This optimisation is currently disabled due to #22027, it can be
-           re-enabled once #220056 is implemented.
+IMPORTANT: This optimisation is currently disabled due to #22807, it can be
+           re-enabled once #22056 is implemented.
 
 In order to avoid duplicating definitions for bindings which already have unfoldings
 we do some minor headstands to avoid serialising the RHS of a definition if it has


=====================================
compiler/GHC/Driver/CodeOutput.hs
=====================================
@@ -275,7 +275,6 @@ outputForeignStubs
            Maybe FilePath) -- C file created
 outputForeignStubs logger tmpfs dflags unit_state mod location stubs
  = do
-   let stub_h = unsafeDecodeUtf $ mkStubPaths (initFinderOpts dflags) (moduleName mod) location
    stub_c <- newTempName logger tmpfs (tmpDir dflags) TFL_CurrentModule "c"
 
    case stubs of
@@ -291,8 +290,6 @@ outputForeignStubs logger tmpfs dflags unit_state mod location stubs
             stub_h_output_d = pprCode h_code
             stub_h_output_w = showSDoc dflags stub_h_output_d
 
-        createDirectoryIfMissing True (takeDirectory stub_h)
-
         putDumpFileMaybe logger Opt_D_dump_foreign
                       "Foreign export header file"
                       FormatC
@@ -314,9 +311,23 @@ outputForeignStubs logger tmpfs dflags unit_state mod location stubs
               | platformMisc_libFFI $ platformMisc dflags = "#include \"rts/ghc_ffi.h\"\n"
               | otherwise = ""
 
-        stub_h_file_exists
-           <- outputForeignStubs_help stub_h stub_h_output_w
-                ("#include <HsFFI.h>\n" ++ cplusplus_hdr) cplusplus_ftr
+        -- The header path is computed from the module source path, which
+        -- does not exist when loading interface core bindings for Template
+        -- Haskell for non-home modules (e.g. when compiling in separate
+        -- invocations of oneshot mode).
+        -- Stub headers are only generated for foreign exports.
+        -- Since those aren't supported for TH with bytecode at the moment,
+        -- it doesn't make much of a difference.
+        -- In any case, if a stub dir was specified explicitly by the user, it
+        -- would be used nonetheless.
+        stub_h_file_exists <-
+          case mkStubPaths (initFinderOpts dflags) (moduleName mod) location of
+            Nothing -> pure False
+            Just path -> do
+              let stub_h = unsafeDecodeUtf path
+              createDirectoryIfMissing True (takeDirectory stub_h)
+              outputForeignStubs_help stub_h stub_h_output_w
+                    ("#include <HsFFI.h>\n" ++ cplusplus_hdr) cplusplus_ftr
 
         putDumpFileMaybe logger Opt_D_dump_foreign
                       "Foreign export stubs" FormatC stub_c_output_d


=====================================
compiler/GHC/Driver/Main.hs
=====================================
@@ -49,6 +49,7 @@ module GHC.Driver.Main
     , HscBackendAction (..), HscRecompStatus (..)
     , initModDetails
     , initWholeCoreBindings
+    , loadIfaceByteCode
     , hscMaybeWriteIface
     , hscCompileCmmFile
 
@@ -105,6 +106,7 @@ module GHC.Driver.Main
     , showModuleIndex
     , hscAddSptEntries
     , writeInterfaceOnlyMode
+    , loadByteCode
     ) where
 
 import GHC.Prelude
@@ -275,7 +277,8 @@ import GHC.SysTools (initSysTools)
 import GHC.SysTools.BaseDir (findTopDir)
 
 import Data.Data hiding (Fixity, TyCon)
-import Data.List        ( nub, isPrefixOf, partition )
+import Data.Functor ((<&>))
+import Data.List ( nub, isPrefixOf, partition )
 import qualified Data.List.NonEmpty as NE
 import Control.Monad
 import Data.IORef
@@ -972,19 +975,23 @@ loadByteCode iface mod_sum = do
                    (mi_foreign iface)
           return (UpToDateItem (Linkable if_date this_mod (NE.singleton (CoreBindings fi))))
       _ -> return $ outOfDateItemBecause MissingBytecode Nothing
+
 --------------------------------------------------------------
 -- Compilers
 --------------------------------------------------------------
 
+add_iface_to_hpt :: ModIface -> ModDetails -> HscEnv -> HscEnv
+add_iface_to_hpt iface details =
+  hscUpdateHPT $ \ hpt ->
+    addToHpt hpt (moduleName (mi_module iface))
+    (HomeModInfo iface details emptyHomeModInfoLinkable)
 
 -- Knot tying!  See Note [Knot-tying typecheckIface]
 -- See Note [ModDetails and --make mode]
 initModDetails :: HscEnv -> ModIface -> IO ModDetails
 initModDetails hsc_env iface =
   fixIO $ \details' -> do
-    let act hpt  = addToHpt hpt (moduleName $ mi_module iface)
-                                (HomeModInfo iface details' emptyHomeModInfoLinkable)
-    let !hsc_env' = hscUpdateHPT act hsc_env
+    let !hsc_env' = add_iface_to_hpt iface details' hsc_env
     -- NB: This result is actually not that useful
     -- in one-shot mode, since we're not going to do
     -- any further typechecking.  It's much more useful
@@ -1012,8 +1019,52 @@ compile_for_interpreter hsc_env use =
 
     adapt_way want = if want (hscInterp hsc_env) then addWay else removeWay
 
+-- | Assemble 'WholeCoreBindings' if the interface contains Core bindings.
+iface_core_bindings :: ModIface -> ModLocation -> Maybe WholeCoreBindings
+iface_core_bindings iface wcb_mod_location =
+  mi_extra_decls <&> \ wcb_bindings ->
+    WholeCoreBindings {
+      wcb_bindings,
+      wcb_module = mi_module,
+      wcb_mod_location,
+      wcb_foreign = mi_foreign
+    }
+  where
+    ModIface {mi_module, mi_extra_decls, mi_foreign} = iface
+
+-- | Return an 'IO' that hydrates Core bindings and compiles them to bytecode if
+-- the interface contains any, using the supplied type env for typechecking.
+--
+-- Unlike 'initWholeCoreBindings', this does not use lazy IO.
+-- Instead, the 'IO' is only evaluated (in @get_link_deps@) when it is clear
+-- that it will be used immediately (because we're linking TH with
+-- @-fprefer-byte-code@ in oneshot mode), and the result is cached in
+-- 'LoaderState'.
+--
+-- 'initWholeCoreBindings' needs the laziness because it is used to populate
+-- 'HomeModInfo', which is done preemptively, in anticipation of downstream
+-- modules using the bytecode for TH in make mode, which might never happen.
+loadIfaceByteCode ::
+  HscEnv ->
+  ModIface ->
+  ModLocation ->
+  TypeEnv ->
+  Maybe (IO Linkable)
+loadIfaceByteCode hsc_env iface location type_env =
+  compile <$> iface_core_bindings iface location
+  where
+    compile decls = do
+      (bcos, fos) <- compileWholeCoreBindings hsc_env type_env decls
+      linkable $ BCOs bcos :| [DotO fo ForeignObject | fo <- fos]
+
+    linkable parts = do
+      if_time <- modificationTimeIfExists (ml_hi_file location)
+      time <- maybe getCurrentTime pure if_time
+      return $! Linkable time (mi_module iface) parts
+
 -- | If the 'Linkable' contains Core bindings loaded from an interface, replace
--- them with a lazy IO thunk that compiles them to bytecode and foreign objects.
+-- them with a lazy IO thunk that compiles them to bytecode and foreign objects,
+-- using the supplied environment for type checking.
 --
 -- The laziness is necessary because this value is stored purely in a
 -- 'HomeModLinkable' in the home package table, rather than some dedicated
@@ -1027,29 +1078,71 @@ compile_for_interpreter hsc_env use =
 --
 -- This is sound because generateByteCode just depends on things already loaded
 -- in the interface file.
-initWholeCoreBindings :: HscEnv -> ModIface -> ModDetails -> Linkable -> IO Linkable
-initWholeCoreBindings hsc_env mod_iface details (Linkable utc_time this_mod uls) =
+initWholeCoreBindings ::
+  HscEnv ->
+  ModIface ->
+  ModDetails ->
+  Linkable ->
+  IO Linkable
+initWholeCoreBindings hsc_env iface details (Linkable utc_time this_mod uls) =
   Linkable utc_time this_mod <$> mapM go uls
   where
-    go (CoreBindings wcb at WholeCoreBindings {wcb_foreign, wcb_mod_location}) = do
-        types_var <- newIORef (md_types details)
-        let act hpt = addToHpt hpt (moduleName $ mi_module mod_iface)
-                      (HomeModInfo mod_iface details emptyHomeModInfoLinkable)
-            kv = knotVarsFromModuleEnv (mkModuleEnv [(this_mod, types_var)])
-            hsc_env' = hscUpdateHPT act hsc_env { hsc_type_env_vars = kv }
-        ~(bcos, fos) <- unsafeInterleaveIO $ do
-          core_binds <- initIfaceCheck (text "l") hsc_env' $
-                        typecheckWholeCoreBindings types_var wcb
-          (stubs, foreign_files) <-
-            decodeIfaceForeign logger (hsc_tmpfs hsc_env)
-            (tmpDir (hsc_dflags hsc_env)) wcb_foreign
-          let cgi_guts = CgInteractiveGuts this_mod core_binds
-                         (typeEnvTyCons (md_types details)) stubs foreign_files
-                         Nothing []
-          trace_if logger (text "Generating ByteCode for" <+> ppr this_mod)
-          generateByteCode hsc_env cgi_guts wcb_mod_location
-        pure (LazyBCOs bcos fos)
-    go ul = return ul
+    go = \case
+      CoreBindings wcb -> do
+        ~(bco, fos) <- unsafeInterleaveIO $
+                       compileWholeCoreBindings hsc_env' type_env wcb
+        pure (LazyBCOs bco fos)
+      l -> pure l
+
+    hsc_env' = add_iface_to_hpt iface details hsc_env
+    type_env = md_types details
+
+-- | Hydrate interface Core bindings and compile them to bytecode.
+--
+-- This consists of:
+--
+-- 1. Running a typechecking step to insert the global names that were removed
+--    when the interface was written or were unavailable due to boot import
+--    cycles, converting the bindings to 'CoreBind'.
+--
+-- 2. Restoring the foreign build inputs from their serialized format, resulting
+--    in a set of foreign import stubs and source files added via
+--    'qAddForeignFilePath'.
+--
+-- 3. Generating bytecode and foreign objects from the results of the previous
+--    steps using the usual pipeline actions.
+compileWholeCoreBindings ::
+  HscEnv ->
+  TypeEnv ->
+  WholeCoreBindings ->
+  IO (CompiledByteCode, [FilePath])
+compileWholeCoreBindings hsc_env type_env wcb = do
+  core_binds <- typecheck
+  (stubs, foreign_files) <- decode_foreign
+  gen_bytecode core_binds stubs foreign_files
+  where
+    typecheck = do
+      types_var <- newIORef type_env
+      let
+        tc_env = hsc_env {
+          hsc_type_env_vars =
+            knotVarsFromModuleEnv (mkModuleEnv [(wcb_module, types_var)])
+        }
+      initIfaceCheck (text "l") tc_env $
+        typecheckWholeCoreBindings types_var wcb
+
+    decode_foreign =
+      decodeIfaceForeign logger (hsc_tmpfs hsc_env)
+      (tmpDir (hsc_dflags hsc_env)) wcb_foreign
+
+    gen_bytecode core_binds stubs foreign_files = do
+      let cgi_guts = CgInteractiveGuts wcb_module core_binds
+                      (typeEnvTyCons type_env) stubs foreign_files
+                      Nothing []
+      trace_if logger (text "Generating ByteCode for" <+> ppr wcb_module)
+      generateByteCode hsc_env cgi_guts wcb_mod_location
+
+    WholeCoreBindings {wcb_module, wcb_mod_location, wcb_foreign} = wcb
 
     logger = hsc_logger hsc_env
 


=====================================
compiler/GHC/Driver/Main.hs-boot
=====================================
@@ -0,0 +1,15 @@
+module GHC.Driver.Main where
+
+import GHC.Driver.Env.Types (HscEnv)
+import GHC.Linker.Types (Linkable)
+import GHC.Prelude.Basic
+import GHC.Types.TypeEnv (TypeEnv)
+import GHC.Unit.Module.Location (ModLocation)
+import GHC.Unit.Module.ModIface (ModIface)
+
+loadIfaceByteCode ::
+  HscEnv ->
+  ModIface ->
+  ModLocation ->
+  TypeEnv ->
+  Maybe (IO Linkable)


=====================================
compiler/GHC/Driver/Make.hs
=====================================
@@ -1310,8 +1310,10 @@ upsweep_mod hsc_env mHscMessage old_hmi summary mod_index nmods =  do
   -- MP: This is a bit janky, because before you add the entries you have to extend the HPT with the module
   -- you just compiled. Another option, would be delay adding anything until after upsweep has finished, but I
   -- am unsure if this is sound (wrt running TH splices for example).
-  -- This function only does anything if the linkable produced is a BCO, which only happens with the
-  -- bytecode backend, no need to guard against the backend type additionally.
+  -- This function only does anything if the linkable produced is a BCO, which
+  -- used to only happen with the bytecode backend, but with
+  -- @-fprefer-byte-code@, @HomeModInfo@ has bytecode even when generating
+  -- object code, see #25230.
   addSptEntries (hscUpdateHPT (\hpt -> addToHpt hpt (ms_mod_name summary) hmi) hsc_env)
                 (homeModInfoByteCode hmi)
 
@@ -3007,7 +3009,7 @@ which can be checked easily using ghc-debug.
         a reference to the entire HscEnv, if we are not careful the HscEnv will
         contain the HomePackageTable at the time the interface was loaded and
         it will never be released.
-   Where? dontLeakTheHPT in GHC.Iface.Load
+   Where? dontLeakTheHUG in GHC.Iface.Load
 
 2. No KnotVars are live at the end of upsweep (#20491)
    Why? KnotVars contains an old stale reference to the TypeEnv for modules


=====================================
compiler/GHC/Iface/Load.hs
=====================================
@@ -115,6 +115,7 @@ import Data.Map ( toList )
 import System.FilePath
 import System.Directory
 import GHC.Driver.Env.KnotVars
+import {-# source #-} GHC.Driver.Main (loadIfaceByteCode)
 import GHC.Iface.Errors.Types
 import Data.Function ((&))
 
@@ -474,7 +475,7 @@ loadInterface doc_str mod from
         -- Template Haskell original-name).
             Succeeded (iface, loc) ->
         let
-            loc_doc = text loc
+            loc_doc = text (ml_hi_file loc)
         in
         initIfaceLcl (mi_semantic_module iface) loc_doc (mi_boot iface) $
 
@@ -511,6 +512,7 @@ loadInterface doc_str mod from
         ; new_eps_rules     <- tcIfaceRules ignore_prags (mi_rules iface)
         ; new_eps_anns      <- tcIfaceAnnotations (mi_anns iface)
         ; new_eps_complete_matches <- tcIfaceCompleteMatches (mi_complete_matches iface)
+        ; purged_hsc_env <- getTopEnv
 
         ; let final_iface = iface
                                & set_mi_decls     (panic "No mi_decls in PIT")
@@ -518,13 +520,26 @@ loadInterface doc_str mod from
                                & set_mi_fam_insts (panic "No mi_fam_insts in PIT")
                                & set_mi_rules     (panic "No mi_rules in PIT")
                                & set_mi_anns      (panic "No mi_anns in PIT")
+                               & set_mi_extra_decls (panic "No mi_extra_decls in PIT")
 
-        ; let bad_boot = mi_boot iface == IsBoot
+              bad_boot = mi_boot iface == IsBoot
                           && isJust (lookupKnotVars (if_rec_types gbl_env) mod)
                             -- Warn against an EPS-updating import
                             -- of one's own boot file! (one-shot only)
                             -- See Note [Loading your own hi-boot file]
 
+              -- Create an IO action that loads and compiles bytecode from Core
+              -- bindings.
+              --
+              -- See Note [Interface Files with Core Definitions]
+              add_bytecode old
+                | Just action <- loadIfaceByteCode purged_hsc_env iface loc (mkNameEnv new_eps_decls)
+                = extendModuleEnv old mod action
+                -- Don't add an entry if the iface doesn't have 'extra_decls'
+                -- so 'get_link_deps' knows that it should load object code.
+                | otherwise
+                = old
+
         ; warnPprTrace bad_boot "loadInterface" (ppr mod) $
           updateEps_  $ \ eps ->
            if elemModuleEnv mod (eps_PIT eps) || is_external_sig mhome_unit iface
@@ -536,6 +551,7 @@ loadInterface doc_str mod from
                 eps {
                   eps_PIT          = extendModuleEnv (eps_PIT eps) mod final_iface,
                   eps_PTE          = addDeclsToPTE   (eps_PTE eps) new_eps_decls,
+                  eps_iface_bytecode = add_bytecode (eps_iface_bytecode eps),
                   eps_rule_base    = extendRuleBaseList (eps_rule_base eps)
                                                         new_eps_rules,
                   eps_complete_matches
@@ -569,7 +585,7 @@ loadInterface doc_str mod from
 {- Note [Loading your own hi-boot file]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Generally speaking, when compiling module M, we should not
-load M.hi boot into the EPS.  After all, we are very shortly
+load M.hi-boot into the EPS.  After all, we are very shortly
 going to have full information about M.  Moreover, see
 Note [Do not update EPS with your own hi-boot] in GHC.Iface.Recomp.
 
@@ -698,7 +714,7 @@ computeInterface
   -> SDoc
   -> IsBootInterface
   -> Module
-  -> IO (MaybeErr MissingInterfaceError (ModIface, FilePath))
+  -> IO (MaybeErr MissingInterfaceError (ModIface, ModLocation))
 computeInterface hsc_env doc_str hi_boot_file mod0 = do
   massert (not (isHoleModule mod0))
   let mhome_unit  = hsc_home_unit_maybe hsc_env
@@ -845,7 +861,7 @@ findAndReadIface
                      -- this to check the consistency of the requirements of the
                      -- module we read out.
   -> IsBootInterface -- ^ Looking for .hi-boot or .hi file
-  -> IO (MaybeErr MissingInterfaceError (ModIface, FilePath))
+  -> IO (MaybeErr MissingInterfaceError (ModIface, ModLocation))
 findAndReadIface hsc_env doc_str mod wanted_mod hi_boot_file = do
 
   let profile = targetProfile dflags
@@ -875,7 +891,7 @@ findAndReadIface hsc_env doc_str mod wanted_mod hi_boot_file = do
           let iface = case ghcPrimIfaceHook hooks of
                        Nothing -> ghcPrimIface
                        Just h  -> h
-          return (Succeeded (iface, "<built in interface for GHC.Prim>"))
+          return (Succeeded (iface, panic "GHC.Prim ModLocation (findAndReadIface)"))
       else do
           let fopts = initFinderOpts dflags
           -- Look for the file
@@ -900,7 +916,7 @@ findAndReadIface hsc_env doc_str mod wanted_mod hi_boot_file = do
                                                          iface loc
                                 case r2 of
                                   Failed sdoc -> return (Failed sdoc)
-                                  Succeeded {} -> return $ Succeeded (iface,_fp)
+                                  Succeeded {} -> return $ Succeeded (iface, loc)
               err -> do
                   trace_if logger (text "...not found")
                   return $ Failed $ cannotFindInterface


=====================================
compiler/GHC/IfaceToCore.hs
=====================================
@@ -904,11 +904,11 @@ tcTopIfaceBindings :: IORef TypeEnv -> [IfaceBindingX IfaceMaybeRhs IfaceTopBndr
           -> IfL [CoreBind]
 tcTopIfaceBindings ty_var ver_decls
    = do
-      int <- mapM tcTopBinders  ver_decls
+      int <- mapM tcTopBinders ver_decls
       let all_ids :: [Id] = concatMap toList int
       liftIO $ modifyIORef ty_var (flip extendTypeEnvList (map AnId all_ids))
 
-      extendIfaceIdEnv all_ids $ mapM (tc_iface_bindings) int
+      extendIfaceIdEnv all_ids $ mapM tc_iface_bindings int
 
 tcTopBinders :: IfaceBindingX a IfaceTopBndrInfo -> IfL (IfaceBindingX a Id)
 tcTopBinders = traverse mk_top_id


=====================================
compiler/GHC/Linker/Deps.hs
=====================================
@@ -60,16 +60,16 @@ import System.Directory
 data LinkDepsOpts = LinkDepsOpts
   { ldObjSuffix   :: !String                        -- ^ Suffix of .o files
   , ldOneShotMode :: !Bool                          -- ^ Is the driver in one-shot mode?
-  , ldModuleGraph :: !ModuleGraph                   -- ^ Module graph
-  , ldUnitEnv     :: !UnitEnv                       -- ^ Unit environment
+  , ldModuleGraph :: !ModuleGraph
+  , ldUnitEnv     :: !UnitEnv
   , ldPprOpts     :: !SDocContext                   -- ^ Rendering options for error messages
-  , ldFinderCache :: !FinderCache                   -- ^ Finder cache
-  , ldFinderOpts  :: !FinderOpts                    -- ^ Finder options
   , ldUseByteCode :: !Bool                          -- ^ Use bytecode rather than objects
   , ldMsgOpts     :: !(DiagnosticOpts IfaceMessage) -- ^ Options for diagnostics
   , ldWays        :: !Ways                          -- ^ Enabled ways
-  , ldLoadIface   :: SDoc -> Module -> IO (MaybeErr MissingInterfaceError ModIface)
-                                                    -- ^ Interface loader function
+  , ldFinderCache :: !FinderCache
+  , ldFinderOpts  :: !FinderOpts
+  , ldLoadIface   :: !(SDoc -> Module -> IO (MaybeErr MissingInterfaceError ModIface))
+  , ldLoadByteCode :: !(Module -> IO (Maybe Linkable))
   }
 
 data LinkDeps = LinkDeps
@@ -275,21 +275,21 @@ get_link_deps opts pls maybe_normal_osuf span mods = do
              case ue_homeUnit unit_env of
               Nothing -> no_obj mod
               Just home_unit -> do
-
-                let fc = ldFinderCache opts
-                let fopts = ldFinderOpts opts
-                mb_stuff <- findHomeModule fc fopts home_unit (moduleName mod)
-                case mb_stuff of
-                  Found loc mod -> found loc mod
-                  _ -> no_obj (moduleName mod)
+                from_bc <- ldLoadByteCode opts mod
+                maybe (fallback_no_bytecode home_unit mod) pure from_bc
         where
-            found loc mod = do {
-                -- ...and then find the linkable for it
-               mb_lnk <- findObjectLinkableMaybe mod loc ;
-               case mb_lnk of {
-                  Nothing  -> no_obj mod ;
-                  Just lnk -> adjust_linkable lnk
-              }}
+
+            fallback_no_bytecode home_unit mod = do
+              let fc = ldFinderCache opts
+              let fopts = ldFinderOpts opts
+              mb_stuff <- findHomeModule fc fopts home_unit (moduleName mod)
+              case mb_stuff of
+                Found loc _ -> do
+                  mb_lnk <- findObjectLinkableMaybe mod loc
+                  case mb_lnk of
+                    Nothing  -> no_obj mod
+                    Just lnk -> adjust_linkable lnk
+                _ -> no_obj (moduleName mod)
 
             adjust_linkable lnk
                 | Just new_osuf <- maybe_normal_osuf = do


=====================================
compiler/GHC/Linker/Loader.hs
=====================================
@@ -76,6 +76,7 @@ import GHC.Utils.Logger
 import GHC.Utils.TmpFs
 
 import GHC.Unit.Env
+import GHC.Unit.External (ExternalPackageState (EPS, eps_iface_bytecode))
 import GHC.Unit.Module
 import GHC.Unit.State as Packages
 
@@ -641,18 +642,23 @@ initLinkDepsOpts hsc_env = opts
             , ldOneShotMode = isOneShot (ghcMode dflags)
             , ldModuleGraph = hsc_mod_graph hsc_env
             , ldUnitEnv     = hsc_unit_env hsc_env
-            , ldLoadIface   = load_iface
             , ldPprOpts     = initSDocContext dflags defaultUserStyle
             , ldFinderCache = hsc_FC hsc_env
             , ldFinderOpts  = initFinderOpts dflags
             , ldUseByteCode = gopt Opt_UseBytecodeRatherThanObjects dflags
             , ldMsgOpts     = initIfaceMessageOpts dflags
             , ldWays        = ways dflags
+            , ldLoadIface
+            , ldLoadByteCode
             }
     dflags = hsc_dflags hsc_env
-    load_iface msg mod = initIfaceCheck (text "loader") hsc_env
+    ldLoadIface msg mod = initIfaceCheck (text "loader") hsc_env
                           $ loadInterface msg mod (ImportByUser NotBoot)
 
+    ldLoadByteCode mod = do
+      EPS {eps_iface_bytecode} <- hscEPS hsc_env
+      sequence (lookupModuleEnv eps_iface_bytecode mod)
+
 
 
 {- **********************************************************************


=====================================
compiler/GHC/Unit/External.hs
=====================================
@@ -31,6 +31,8 @@ import GHC.Types.CompleteMatch
 import GHC.Types.TypeEnv
 import GHC.Types.Unique.DSet
 
+import GHC.Linker.Types (Linkable)
+
 import Data.IORef
 
 
@@ -68,6 +70,7 @@ initExternalPackageState = EPS
   , eps_PIT              = emptyPackageIfaceTable
   , eps_free_holes       = emptyInstalledModuleEnv
   , eps_PTE              = emptyTypeEnv
+  , eps_iface_bytecode   = emptyModuleEnv
   , eps_inst_env         = emptyInstEnv
   , eps_fam_inst_env     = emptyFamInstEnv
   , eps_rule_base        = mkRuleBase builtinRules
@@ -139,6 +142,12 @@ data ExternalPackageState
                 -- interface files we have sucked in. The domain of
                 -- the mapping is external-package modules
 
+        -- | If an interface was written with @-fwrite-if-simplified-core@, this
+        -- will contain an IO action that compiles bytecode from core bindings.
+        --
+        -- See Note [Interface Files with Core Definitions]
+        eps_iface_bytecode :: !(ModuleEnv (IO Linkable)),
+
         eps_inst_env     :: !PackageInstEnv,   -- ^ The total 'InstEnv' accumulated
                                                -- from all the external-package modules
         eps_fam_inst_env :: !PackageFamInstEnv,-- ^ The total 'FamInstEnv' accumulated


=====================================
compiler/GHC/Unit/Finder.hs
=====================================
@@ -38,7 +38,6 @@ import GHC.Platform.Ways
 
 import GHC.Builtin.Names ( gHC_PRIM )
 
-import GHC.Data.Maybe ( expectJust )
 import GHC.Data.OsPath
 
 import GHC.Unit.Env
@@ -60,6 +59,7 @@ import GHC.Types.PkgQual
 import GHC.Fingerprint
 import Data.IORef
 import System.Directory.OsPath
+import Control.Applicative ((<|>))
 import Control.Monad
 import Data.Time
 import qualified Data.Map as M
@@ -711,27 +711,27 @@ mkHiePath fopts basename mod_basename = hie_basename <.> hiesuf
 -- We don't have to store these in ModLocations, because they can be derived
 -- from other available information, and they're only rarely needed.
 
+-- | Compute the file name of a header file for foreign stubs, using either the
+-- directory explicitly specified in the command line option @-stubdir@, or the
+-- directory of the module's source file.
+--
+-- When compiling bytecode from interface Core bindings, @ModLocation@ does not
+-- contain a source file path, so the header isn't written.
+-- This doesn't have an impact, since we cannot support headers importing
+-- Haskell symbols defined in bytecode for TH whatsoever at the moment.
 mkStubPaths
   :: FinderOpts
   -> ModuleName
   -> ModLocation
-  -> OsPath
-
-mkStubPaths fopts mod location
-  = let
-        stubdir = finder_stubDir fopts
-
-        mod_basename = unsafeEncodeUtf $ moduleNameSlashes mod
-        src_basename = OsPath.dropExtension $ expectJust "mkStubPaths"
-                                                  (ml_hs_file_ospath location)
-
-        stub_basename0
-            | Just dir <- stubdir = dir </> mod_basename
-            | otherwise           = src_basename
+  -> Maybe OsPath
+mkStubPaths fopts mod location = do
+  stub_basename <- in_stub_dir <|> src_basename
+  pure (stub_basename `mappend` os "_stub" <.> os "h")
+  where
+    in_stub_dir = (</> mod_basename) <$> (finder_stubDir fopts)
 
-        stub_basename = stub_basename0 `mappend` os "_stub"
-     in
-        stub_basename <.> os "h"
+    mod_basename = unsafeEncodeUtf $ moduleNameSlashes mod
+    src_basename = OsPath.dropExtension <$> ml_hs_file_ospath location
 
 -- -----------------------------------------------------------------------------
 -- findLinkable isn't related to the other stuff in here,


=====================================
compiler/GHC/Unit/Module/WholeCoreBindings.hs
=====================================
@@ -28,12 +28,12 @@ import System.FilePath (takeExtension)
 
 {-
 Note [Interface Files with Core Definitions]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 A interface file can optionally contain the definitions of all core bindings, this
 is enabled by the flag `-fwrite-if-simplified-core`.
 This provides everything needed in addition to the normal ModIface and ModDetails
-to restart compilation after typechecking to generate bytecode. The `fi_bindings` field
+to restart compilation after typechecking to generate bytecode. The `wcb_bindings` field
 is stored in the normal interface file and the other fields populated whilst loading
 the interface file.
 
@@ -62,8 +62,55 @@ after whatever simplification the user requested has been performed. So the simp
 of the interface file agree with the optimisation level as reported by the interface
 file.
 
+The lifecycle differs beyond laziness depending on the provenance of a module.
+In all cases, the main consumer for interface bytecode is 'get_link_deps', which
+traverses a splice's or GHCi expression's dependencies and collects the needed
+build artifacts, which can be objects or bytecode, depending on the build
+settings.
+
+1. In make mode, all eligible modules are part of the dependency graph.
+   Their interfaces are loaded unconditionally and in dependency order by the
+   compilation manager, and each module's bytecode is prepared before its
+   dependents are compiled, in one of two ways:
+
+   - If the interface file for a module is missing or out of sync with its
+     source, it is recompiled and bytecode is generated directly and
+     immediately, not involving 'WholeCoreBindings' (in 'runHscBackendPhase').
+
+   - If the interface file is up to date, no compilation is performed, and a
+     lazy thunk generating bytecode from interface Core bindings is created in
+     'compileOne'', which will only be compiled if a downstream module contains
+     a splice that depends on it, as described above.
+
+   In both cases, the bytecode 'Linkable' is stored in a 'HomeModLinkable' in
+   the Home Unit Graph, lazy or not.
+
+2. In oneshot mode, which compiles individual modules without a shared home unit
+   graph, a previously compiled module is not reprocessed as described for make
+   mode above.
+   When 'get_link_deps' encounters a dependency on a local module, it requests
+   its bytecode from the External Package State, who loads the interface
+   on-demand.
+
+   Since the EPS stores interfaces for all package dependencies in addition to
+   local modules in oneshot mode, it has a substantial memory footprint.
+   We try to curtail that by extracting important data into specialized fields
+   in the EPS, and retaining only a few fields of 'ModIface' by overwriting the
+   others with bottom values.
+
+   In order to avoid keeping around all of the interface's components needed for
+   compiling bytecode, we instead store an IO action in 'eps_iface_bytecode'.
+   When 'get_link_deps' evaluates this action, the result is not retained in the
+   EPS, but stored in 'LoaderState', where it may eventually get evicted to free
+   up the memory.
+   This IO action retains the dehydrated Core bindings from the interface in its
+   closure.
+   Like the bytecode 'Linkable' stored in 'LoaderState', this is preferable to
+   storing the intermediate representation as rehydrated Core bindings, since
+   the latter have a significantly greater memory footprint.
+
 Note [Size of Interface Files with Core Definitions]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 How much overhead does `-fwrite-if-simplified-core` add to a typical interface file?
 As an experiment I compiled the `Cabal` library and `ghc` library (Aug 22) with


=====================================
testsuite/tests/bytecode/T25090/A.hs
=====================================
@@ -0,0 +1,7 @@
+{-# language TemplateHaskell #-}
+module Main where
+
+import D
+
+main :: IO ()
+main = putStrLn (show ($splc :: Int))


=====================================
testsuite/tests/bytecode/T25090/B.hs
=====================================
@@ -0,0 +1,5 @@
+module B where
+
+import {-# source #-} C (C)
+
+data B = B C


=====================================
testsuite/tests/bytecode/T25090/C.hs
=====================================
@@ -0,0 +1,8 @@
+module C where
+
+import B
+
+data C = C Int
+
+b :: B
+b = B (C 2024)


=====================================
testsuite/tests/bytecode/T25090/C.hs-boot
=====================================
@@ -0,0 +1,3 @@
+module C where
+
+data C


=====================================
testsuite/tests/bytecode/T25090/D.hs
=====================================
@@ -0,0 +1,12 @@
+module D where
+
+import Language.Haskell.TH (ExpQ)
+import Language.Haskell.TH.Syntax (lift)
+import B
+import C
+
+splc :: ExpQ
+splc =
+  lift @_ @Int num
+  where
+    B (C num) = b


=====================================
testsuite/tests/bytecode/T25090/Makefile
=====================================
@@ -0,0 +1,21 @@
+TOP=../../..
+include $(TOP)/mk/boilerplate.mk
+include $(TOP)/mk/test.mk
+
+# Verify that the object files aren't linked by clobbering them.
+T25090a:
+	$(TEST_HC) -c -fbyte-code-and-object-code C.hs-boot
+	$(TEST_HC) -c -fbyte-code-and-object-code B.hs
+	$(TEST_HC) -c -fbyte-code-and-object-code C.hs
+	echo 'corrupt' > B.o
+	echo 'corrupt' > C.o
+	echo 'corrupt' > C.o-boot
+	$(TEST_HC) -c -fbyte-code-and-object-code D.hs
+	echo 'corrupt' > D.o
+	$(TEST_HC) -c -fbyte-code-and-object-code -fprefer-byte-code A.hs
+	$(TEST_HC) -fbyte-code-and-object-code -fprefer-byte-code A.o -o exe
+	./exe
+
+T25090b:
+	$(TEST_HC) -fbyte-code-and-object-code -fprefer-byte-code A -o exe -v0
+	./exe


=====================================
testsuite/tests/bytecode/T25090/T25090-debug.stderr
=====================================
@@ -0,0 +1,6 @@
+WARNING:
+  loadInterface
+  C
+  Call stack:
+      CallStack (from HasCallStack):
+        warnPprTrace, called at compiler/GHC/Iface/Load.hs:<line>:<column> in <package-id>:GHC.Iface.Load


=====================================
testsuite/tests/bytecode/T25090/T25090.stdout
=====================================
@@ -0,0 +1 @@
+2024


=====================================
testsuite/tests/bytecode/T25090/all.T
=====================================
@@ -0,0 +1,18 @@
+# This test compiles the boot file separately from its source file, which causes
+# a debug assertion warning.
+# Since this appears to be intentional according to the Note [Loading your own hi-boot file],
+# the warning is added to the expected stderr for debugged builds.
+def test_T25090(name):
+    assert_warn_spec = {'stderr': 'T25090-debug.stderr'}
+    extra_specs = assert_warn_spec if name == 'T25090a' and compiler_debugged() else {}
+    return test(name,
+     [extra_files(['A.hs', 'B.hs', 'C.hs-boot', 'C.hs', 'D.hs']),
+      req_th,
+      js_skip,
+      use_specs(dict(stdout = 'T25090.stdout', **extra_specs)),
+      ],
+     makefile_test,
+     [])
+
+test_T25090('T25090a')
+test_T25090('T25090b')



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/2bb1e8df8be1d31094b3160114a38a3e8d5ec963

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/2bb1e8df8be1d31094b3160114a38a3e8d5ec963
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20240927/e7672250/attachment-0001.html>


More information about the ghc-commits mailing list