[Git][ghc/ghc][wip/T23536-teo] 2 commits: Make template-haskell a stage1 package

Teo Camarasu (@teo) gitlab at gitlab.haskell.org
Fri Apr 5 09:48:28 UTC 2024



Teo Camarasu pushed to branch wip/T23536-teo at Glasgow Haskell Compiler / GHC


Commits:
98729f73 by Teo Camarasu at 2024-04-05T10:48:17+01:00
Make template-haskell a stage1 package

Promoting template-haskell from a stage0 to a stage1 package means that
we can much more easily refactor template-haskell.

In order to accomplish this we now compile stage0 packages using
the boot compiler's version of template-haskell.

This means that there are now two versions of template-haskell in play:
the boot compiler's version, and the in-tree version.
When compiling the stage1 compiler, we have to pick a version of
template-haskell to use.

During bootstrapping we want to use the same version as the final
compiler. This forces the in-tree version. We are only able to use the
internal interpreter with stage2 onwards. Yet, we could still use the
external interpreter.

The external interpreter runs splices in another process. Queries and
results are serialised. This reduces our compatibility requirements from
ABI compatibility with the internal interpreter to mere serialisation
compatibility. We may compile GHC against a different library from the
one the external interpreter is compiled against, so long as it has
exactly the same serialisation of template-haskell types.
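
As a rough illustration of why serialisation compatibility is a weaker requirement than ABI compatibility, the sketch below uses two unrelated Haskell types that agree only on a wire format. All names here (CompilerLit, InterpLit, encodeCompiler, decodeInterp) and the string encoding are hypothetical and are not part of GHC or of the real GHCi message protocol; they only model the idea that each end of the protocol may have its own copy of a type.

```haskell
-- Hypothetical stand-in for a Template Haskell type as seen by the compiler.
data CompilerLit = CIntLit Integer | CStrLit String
  deriving (Eq, Show)

-- Hypothetical stand-in for the "same" type as seen by the external
-- interpreter. It is a distinct type, as if it came from another build
-- of the library.
data InterpLit = IIntLit Integer | IStrLit String
  deriving (Eq, Show)

-- Compiler side: one tag character followed by the payload.
encodeCompiler :: CompilerLit -> String
encodeCompiler (CIntLit n) = 'I' : show n
encodeCompiler (CStrLit s) = 'S' : s

-- Interpreter side: decode the same wire format into its own type.
-- Returns Nothing when the tag is unknown or the payload is malformed.
decodeInterp :: String -> Maybe InterpLit
decodeInterp ('I' : rest) =
  case reads rest of
    [(n, "")] -> Just (IIntLit n)
    _         -> Nothing
decodeInterp ('S' : rest) = Just (IStrLit rest)
decodeInterp _            = Nothing
```

Loading this in GHCi and evaluating `decodeInterp (encodeCompiler (CIntLit 42))` gives `Just (IIntLit 42)`: the two sides never share a type definition, only the encoding.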

This opens up the strategy pursued in this patch.

When compiling the stage1 compiler we vendor the template-haskell and
ghc-boot-th libraries through ghc-boot and use these to define the Template
Haskell interface for the external interpreter. Note that at this point
we also have the template-haskell and ghc-boot-th packages in our
transitive dependency closure from the boot compiler, and some packages
like containers have dependencies on these to define Lift instances.

Then the external interpreter should be compiled against the regular
template-haskell library from the source tree. As this will have the
same serialised interface as what we vendor in ghc-boot, we can then
run splices.

GHC stage2 is compiled as normal as well against the template-haskell
library from the source tree.
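
The arrangement described above can be summarised as a small model. This is only a sketch of the commit message, not Hadrian code: the types Component, ThSource and WireFormat and the functions below are hypothetical, and the model assumes (as the Note does) that the boot compiler's template-haskell differs from the in-tree one.

```haskell
-- Hypothetical model of which template-haskell each component is built against.
data Component
  = Stage0Dependency     -- e.g. containers, built by the boot compiler
  | Stage1Compiler       -- lib:ghc and ghci, built by the boot compiler
  | ExternalInterpreter  -- built against the regular in-tree library
  | Stage2Compiler
  deriving (Eq, Show)

data ThSource = BootCompilerTh | VendoredInGhcBoot | InTreeTh
  deriving (Eq, Show)

-- Assumption: the boot and in-tree libraries serialise differently.
data WireFormat = BootFormat | InTreeFormat
  deriving (Eq, Show)

thSourceFor :: Component -> ThSource
thSourceFor Stage0Dependency    = BootCompilerTh
thSourceFor Stage1Compiler      = VendoredInGhcBoot
thSourceFor ExternalInterpreter = InTreeTh
thSourceFor Stage2Compiler      = InTreeTh

-- The vendored copy is built from the in-tree sources, so it shares
-- the in-tree serialisation.
wireFormat :: ThSource -> WireFormat
wireFormat BootCompilerTh    = BootFormat
wireFormat VendoredInGhcBoot = InTreeFormat
wireFormat InTreeTh          = InTreeFormat

-- Two components can talk over the interpreter protocol only if
-- their serialisations agree.
canExchangeSplices :: Component -> Component -> Bool
canExchangeSplices a b =
  wireFormat (thSourceFor a) == wireFormat (thSourceFor b)
```

In this model `canExchangeSplices Stage1Compiler ExternalInterpreter` is `True`, which is the property the vendoring is meant to preserve.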

See Note [Bootstrapping Template Haskell]

Resolves #23536

- - - - -
36b83a79 by Teo Camarasu at 2024-04-05T10:48:17+01:00
Remove th_hack

This is no longer necessary now that template-haskell is no longer a stage0 package

- - - - -


11 changed files:

- compiler/GHC/Tc/Gen/Splice.hs
- compiler/ghc.cabal.in
- hadrian/src/Rules/Dependencies.hs
- hadrian/src/Rules/ToolArgs.hs
- hadrian/src/Settings/Default.hs
- hadrian/src/Settings/Packages.hs
- libraries/ghc-boot/ghc-boot.cabal.in
- libraries/ghci/ghci.cabal.in
- libraries/template-haskell/Language/Haskell/TH/Syntax.hs
- libraries/template-haskell/vendored-filepath/System/FilePath/Posix.hs
- libraries/template-haskell/vendored-filepath/System/FilePath/Windows.hs


Changes:

=====================================
compiler/GHC/Tc/Gen/Splice.hs
=====================================
@@ -2916,3 +2916,108 @@ tcGetInterp = do
    case hsc_interp hsc_env of
       Nothing -> liftIO $ throwIO (InstallationError "Template haskell requires a target code interpreter")
       Just i  -> pure i
+
+-- Note [Bootstrapping Template Haskell]
+-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+--
+-- Template Haskell requires special attention when compiling GHC.
+-- The implementation of the Template Haskell set of features requires tight
+-- coupling between the compiler and the `template-haskell` library.
+-- This complicates the bootstrapping story as compatibility constraints are
+-- placed on the version of `template-haskell` used to compile GHC during a
+-- particular stage and the version bundled with it.
+--
+-- These constraints can be divided by the features they are used to implement,
+-- namely running splices either directly or via the external interpreter, and
+-- desugaring bracket syntax.
+--
+-- (C1) Executing splices within the compiler: In order to execute a splice
+-- within the compiler, we must be able to compile and load code built against
+-- the same version of the `template-haskell` library as the compiler. This
+-- is an ABI compatibility constraint between the `template-haskell` version of
+-- the compiler and the splice.
+-- (C2) Executing splices through the external interpreter: In order to execute
+-- a splice via the external interpreter, we serialise bytecode, run it with the
+-- external interpreter, and communicate back the result through a binary
+-- serialised interface. This is a binary serialisation compatibility constraint
+-- between the `template-haskell` version of the compiler and the splice.
+-- (C3) Desugaring bracket syntax: Bracket syntax is desugared by referring to a
+-- special wired-in package whose package id is `template-haskell`. So for
+-- instance an expression `'Just` gets desugared to something of type
+-- `template-haskell:Language.Haskell.TH.Syntax.Name`. Importantly, while this
+-- identifier is wired-in, the identity of the `template-haskell` package is
+-- not. So for instance we can successfully use an expression like
+-- `'Just :: Name` while compiling the `template-haskell` package as long as its
+-- package id is set to `template-haskell` as `Name` will resolve to the local
+-- identifier in the package (and the LHS and RHS will align). On the other
+-- hand, if we don't set the special package id, the type of the expression will
+-- be `template-haskell:...Name` while the `Name` on the RHS will resolve to the
+-- local identifier and we will get a type error. So, bracket syntax assumes the
+-- presence of a particular API in the `template-haskell` package, but it allows
+--
+-- These constraints are ranked from strongest to weakest. They only apply if we
+-- want to support the particular feature associated with them.
+--
+-- The tricky case is what to do when building the bootstrapping (stage1) GHC.
+-- The stage2 GHC is simpler as it can use the in-tree `template-haskell`
+-- package built by the stage1 GHC.
+--
+-- We should note that we cannot feasibly use the internal interpreter with a
+-- stage1 GHC. This is because the stage1 GHC was compiled with the stage0 GHC,
+-- which we assume is a different version. In order to run a splice that too
+-- would need to be compiled with the stage0 GHC, and so would all its
+-- dependencies.
+-- This allows us to disregard (C1) for the stage1 case.
+--
+-- In the past, we used to build the stage1 GHC and all its dependencies against
+-- the in-tree `template-haskell` library. This meant that we sacrificed (C2)
+-- because they are likely not serialisation compatible. We could not sacrifice
+-- (C3) because dependencies of GHC (such as `containers` and
+-- `template-haskell`) used bracket syntax to define `Lift` instances. This
+-- meant that the interface assumed by the boot compiler to implement bracket
+-- desugaring could not be modified (not even through CPP as (C1) would
+-- constrain us in future stages where we do support the internal interpreter).
+-- Yet, bracket syntax did work and gave us splices that desugared to code that
+-- referenced the in-tree version of `template-haskell` not the one the boot
+-- compiler required. So they could never be run.
+--
+-- Our current strategy is to not build `template-haskell` as a stage0 package.
+-- All of GHC's dependencies depend on the boot compiler's version, and produce
+-- runnable splices. How do we deal with the stage1 compiler's dependency on
+-- `template-haskell`? There are two options. (D1) depend on the boot
+-- compiler's version for stage1 and then depend on the in-tree one in stage2.
+-- This violates (C1) and (C2), so we wouldn't be able to run splices at all
+-- with the stage1 compiler. Additionally this would introduce quite a bit of
+-- CPP into the compiler and mean we would have to stub out much of the
+-- template-haskell machinery or have an unrunnable compatibility shim. Or (D2)
+-- depend on the in-tree version.
+--
+-- (D2) is what we implement, but it is complicated by the fact that it means we
+-- practically have two versions of `template-haskell` in the dependency graph
+-- of the stage1 compiler. To avoid this, we recall that we only need
+-- serialisation compatibility (as per (C2)), so we can avoid a direct dependency
+-- on the in-tree version by vendoring it. We choose to vendor it into the
+-- `ghc-boot` package as both `lib:ghc` and `ghci` require a dependency on the
+-- `template-haskell` interface as they define the two ends of the protocol.
+-- This allows us to still run splices through the external interpreter.
+--
+-- We should note a further edge-case with this approach. When compiling our
+-- vendored `template-haskell` library, we run afoul of (C3). The library
+-- defines several `Name`s using bracket syntax. As this package doesn't claim
+-- to be the wired-in package but it defines its own `Name` type, we get a type
+-- discrepancy with the `Name` type from the boot compiler's `template-haskell`
+-- library. Most of these are only used to define `Lift` instances, so in the
+-- vendored case we simply hide them behind CPP. Yet, there is one distinct use
+-- of a `Name`. We have a `Name` for the constructors of the `Multiplicity`
+-- type, which are also used in the pretty-printing module. We construct these
+-- manually instead. This allows us to completely avoid using bracket syntax
+-- for compiling the vendored `template-haskell` package.
+--
+-- To summarise, our current approach allows us to use the external interpreter
+-- to run splices and allows bracket syntax to be desugared correctly. In order
+-- to implement this we vendor the `template-haskell` library into `ghc-boot`
+-- and take special care to not use bracket syntax in those modules as that
+-- would incorrectly produce code that uses identifiers from the boot compiler's
+-- `template-haskell` library.
+--
+-- See #23536.


=====================================
compiler/ghc.cabal.in
=====================================
@@ -115,7 +115,6 @@ Library
                    containers >= 0.6.2.1 && < 0.8,
                    array      >= 0.1 && < 0.6,
                    filepath   >= 1   && < 1.6,
-                   template-haskell == 2.22.*,
                    hpc        >= 0.6 && < 0.8,
                    transformers >= 0.5 && < 0.7,
                    exceptions == 0.10.*,


=====================================
hadrian/src/Rules/Dependencies.hs
=====================================
@@ -35,7 +35,10 @@ extra_dependencies =
 
   where
     th_internal = (templateHaskell, "Language.Haskell.TH.Lib.Internal")
-    dep (p1, m1) (p2, m2) s = do
+    dep (p1, m1) (p2, m2) s =
+      -- We use the boot compiler's `template-haskell` library when building stage0,
+      -- so we don't need to register dependencies.
+      if isStage0 s then pure [] else do
         let context = Context s p1 (error "extra_dependencies: way not set") (error "extra_dependencies: iplace not set")
         ways <- interpretInContext context getLibraryWays
         mapM (\way -> (,) <$> path s way p1 m1 <*> path s way p2 m2) (S.toList ways)


=====================================
hadrian/src/Rules/ToolArgs.hs
=====================================
@@ -85,25 +85,13 @@ multiSetup pkg_s = do
       need (srcs ++ gens)
       let rexp m = ["-reexported-module", m]
       let hidir = root </> "interfaces" </> pkgPath p
-      writeFile' (resp_file root p) (intercalate "\n" (th_hack arg_list
+      writeFile' (resp_file root p) (intercalate "\n" (arg_list
                                                       ++  modules cd
                                                       ++ concatMap rexp (reexportModules cd)
                                                       ++ ["-outputdir", hidir]))
       return (resp_file root p)
 
 
-    -- The template-haskell package is compiled with -this-unit-id=template-haskell but
-    -- everything which depends on it depends on `-package-id-template-haskell-2.17.0.0`
-    -- and so the logic for detetecting which home-units depend on what is defeated.
-    -- The workaround here is just to rewrite all the `-package-id` arguments to
-    -- point to `template-haskell` instead which works for the multi-repl case.
-    -- See #20887
-    th_hack :: [String] -> [String]
-    th_hack ((isPrefixOf "-package-id template-haskell" -> True) : xs) = "-package-id" : "template-haskell" : xs
-    th_hack (x:xs) = x : th_hack xs
-    th_hack [] = []
-
-
 toolRuleBody :: FilePath -> Action ()
 toolRuleBody fp = do
   mm <- dirMap
@@ -158,7 +146,6 @@ toolTargets = [ binary
               -- , ghc     -- # depends on ghc library
               -- , runGhc  -- # depends on ghc library
               , ghcBoot
-              , ghcBootTh
               , ghcPlatform
               , ghcToolchain
               , ghcToolchainBin
@@ -172,7 +159,6 @@ toolTargets = [ binary
               , mtl
               , parsec
               , time
-              , templateHaskell
               , text
               , transformers
               , semaphoreCompat


=====================================
hadrian/src/Settings/Default.hs
=====================================
@@ -93,7 +93,6 @@ stage0Packages = do
              , ghc
              , runGhc
              , ghcBoot
-             , ghcBootTh
              , ghcPlatform
              , ghcHeap
              , ghcToolchain
@@ -108,7 +107,6 @@ stage0Packages = do
              , parsec
              , semaphoreCompat
              , time
-             , templateHaskell
              , text
              , transformers
              , unlit
@@ -143,6 +141,7 @@ stage1Packages = do
         , deepseq
         , exceptions
         , ghc
+        , ghcBootTh
         , ghcBignum
         , ghcCompact
         , ghcExperimental
@@ -156,6 +155,7 @@ stage1Packages = do
         , pretty
         , rts
         , semaphoreCompat
+        , templateHaskell
         , stm
         , unlit
         , xhtml


=====================================
hadrian/src/Settings/Packages.hs
=====================================
@@ -121,6 +121,10 @@ packageArgs = do
           , builder (Cc CompileC) ? (not <$> flag CcLlvmBackend) ?
             input "**/cbits/atomic.c"  ? arg "-Wno-sync-nand" ]
 
+        -------------------------------- ghcBoot ------------------------------
+        , package ghcBoot ?
+            builder (Cabal Flags) ? (stage0 `cabalFlag` "bootstrap-th")
+
         --------------------------------- ghci ---------------------------------
         , package ghci ? mconcat
           [


=====================================
libraries/ghc-boot/ghc-boot.cabal.in
=====================================
@@ -35,6 +35,15 @@ source-repository head
     location: https://gitlab.haskell.org/ghc/ghc.git
     subdir:   libraries/ghc-boot
 
+Flag bootstrap-th
+        Description:
+          Enabled when building the stage1 compiler in order to vendor the in-tree
+          `template-haskell` library, while allowing dependencies to depend on the
+          boot `template-haskell` library.
+          See Note [Bootstrapping Template Haskell]
+        Default: False
+        Manual: True
+
 Library
     default-language: Haskell2010
     other-extensions: DeriveGeneric, RankNTypes, ScopedTypeVariables
@@ -56,13 +65,6 @@ Library
             GHC.UniqueSubdir
             GHC.Version
 
-    -- reexport modules from ghc-boot-th so that packages don't have to import
-    -- both ghc-boot and ghc-boot-th. It makes the dependency graph easier to
-    -- understand and to refactor.
-    reexported-modules:
-              GHC.LanguageExtensions.Type
-            , GHC.ForeignSrcLang.Type
-            , GHC.Lexeme
 
     -- reexport platform modules from ghc-platform
     reexported-modules:
@@ -81,7 +83,49 @@ Library
                    filepath   >= 1.3 && < 1.6,
                    deepseq    >= 1.4 && < 1.6,
                    ghc-platform >= 0.1,
-                   ghc-boot-th == @ProjectVersionMunged@
+    if flag(bootstrap-th)
+      cpp-options: -DBOOTSTRAP_TH
+      build-depends:
+              ghc-prim
+              , pretty
+      -- we vendor ghc-boot-th and template-haskell while bootstrapping TH.
+      -- This is to avoid having two copies of ghc-boot-th and template-haskell
+      -- in the build graph: one from the boot compiler and the in-tree one.
+      hs-source-dirs: . ../ghc-boot-th ../template-haskell ../template-haskell/vendored-filepath
+      exposed-modules:
+              GHC.LanguageExtensions.Type
+            , GHC.ForeignSrcLang.Type
+            , GHC.Lexeme
+            , Language.Haskell.TH
+            , Language.Haskell.TH.Lib
+            , Language.Haskell.TH.Ppr
+            , Language.Haskell.TH.PprLib
+            , Language.Haskell.TH.Quote
+            , Language.Haskell.TH.Syntax
+            , Language.Haskell.TH.LanguageExtensions
+            , Language.Haskell.TH.CodeDo
+            , Language.Haskell.TH.Lib.Internal
+
+      other-modules:
+              Language.Haskell.TH.Lib.Map
+            , System.FilePath
+            , System.FilePath.Posix
+            , System.FilePath.Windows
+    else
+      hs-source-dirs: .
+      build-depends:
+              ghc-boot-th         == @ProjectVersionMunged@
+            , template-haskell  == 2.22.0.0
+      -- reexport modules from ghc-boot-th and template-haskell so that packages
+      -- don't have to import all of ghc-boot, ghc-boot-th and template-haskell.
+      -- It makes the dependency graph easier to understand and to refactor
+      -- and reduces the amount of cabal flags we need to use for bootstrapping TH.
+      reexported-modules:
+              GHC.LanguageExtensions.Type
+            , GHC.ForeignSrcLang.Type
+            , GHC.Lexeme
+            , Language.Haskell.TH
+            , Language.Haskell.TH.Syntax
     if !os(windows)
         build-depends:
                    unix       >= 2.7 && < 2.9


=====================================
libraries/ghci/ghci.cabal.in
=====================================
@@ -84,7 +84,6 @@ library
         filepath         >= 1.4 && < 1.6,
         ghc-boot         == @ProjectVersionMunged@,
         ghc-heap         == @ProjectVersionMunged@,
-        template-haskell == 2.22.*,
         transformers     >= 0.5 && < 0.7
 
     if !os(windows)


=====================================
libraries/template-haskell/Language/Haskell/TH/Syntax.hs
=====================================
@@ -34,49 +34,52 @@ module Language.Haskell.TH.Syntax
     -- $infix
     ) where
 
-import qualified Data.Fixed as Fixed
+import Prelude
 import Data.Data hiding (Fixity(..))
 import Data.IORef
 import System.IO.Unsafe ( unsafePerformIO )
 import System.FilePath
 import GHC.IO.Unsafe    ( unsafeDupableInterleaveIO )
-import Control.Monad (liftM)
 import Control.Monad.IO.Class (MonadIO (..))
 import Control.Monad.Fix (MonadFix (..))
-import Control.Applicative (Applicative(..))
 import Control.Exception (BlockedIndefinitelyOnMVar (..), catch, throwIO)
 import Control.Exception.Base (FixIOException (..))
 import Control.Concurrent.MVar (newEmptyMVar, readMVar, putMVar)
 import System.IO        ( hPutStrLn, stderr )
-import Data.Char        ( isAlpha, isAlphaNum, isUpper, ord )
+import Data.Char        ( isAlpha, isAlphaNum, isUpper )
 import Data.Int
 import Data.List.NonEmpty ( NonEmpty(..) )
-import Data.Void        ( Void, absurd )
 import Data.Word
 import Data.Ratio
-import GHC.CString      ( unpackCString# )
 import GHC.Generics     ( Generic )
-import GHC.Types        ( Int(..), Word(..), Char(..), Double(..), Float(..),
-                          TYPE, RuntimeRep(..), Levity(..), Multiplicity (..) )
 import qualified Data.Kind as Kind (Type)
-import GHC.Prim         ( Int#, Word#, Char#, Double#, Float#, Addr# )
 import GHC.Ptr          ( Ptr, plusPtr )
 import GHC.Lexeme       ( startsVarSym, startsVarId )
 import GHC.ForeignSrcLang.Type
 import Language.Haskell.TH.LanguageExtensions
-import Numeric.Natural
 import Prelude hiding (Applicative(..))
 import Foreign.ForeignPtr
 import Foreign.C.String
 import Foreign.C.Types
+import GHC.Types        (TYPE, RuntimeRep(..), Levity(..))
 
+#ifndef BOOTSTRAP_TH
+import Control.Monad (liftM)
+import Data.Char (ord)
+import qualified Data.Fixed as Fixed
+import GHC.Prim         ( Int#, Word#, Char#, Double#, Float#, Addr# )
+import GHC.Types        ( Int(..), Word(..), Char(..), Double(..), Float(..))
+import GHC.CString      ( unpackCString# )
+import GHC.ForeignPtr (ForeignPtr(..), ForeignPtrContents(..))
+import Data.Void        ( Void, absurd )
+import Numeric.Natural
 import Data.Array.Byte (ByteArray(..))
 import GHC.Exts
   ( ByteArray#, unsafeFreezeByteArray#, copyAddrToByteArray#, newByteArray#
   , isByteArrayPinned#, isTrue#, sizeofByteArray#, unsafeCoerce#, byteArrayContents#
   , copyByteArray#, newPinnedByteArray#)
-import GHC.ForeignPtr (ForeignPtr(..), ForeignPtrContents(..))
 import GHC.ST (ST(..), runST)
+#endif
 
 -----------------------------------------------------
 --
@@ -1018,6 +1021,8 @@ class Lift (t :: TYPE r) where
   liftTyped :: Quote m => t -> Code m t
 
 
+-- See Note [Bootstrapping Template Haskell]
+#ifndef BOOTSTRAP_TH
 -- If you add any instances here, consider updating test th/TH_Lift
 instance Lift Integer where
   liftTyped x = unsafeCodeCoerce (lift x)
@@ -1384,10 +1389,11 @@ rightName = 'Right
 
 nonemptyName :: Name
 nonemptyName = '(:|)
+#endif
 
 oneName, manyName :: Name
-oneName  = 'One
-manyName = 'Many
+oneName  = mkNameG DataName "ghc-prim" "GHC.Types" "One"
+manyName = mkNameG DataName "ghc-prim" "GHC.Types" "Many"
 
 -----------------------------------------------------
 --


=====================================
libraries/template-haskell/vendored-filepath/System/FilePath/Posix.hs
=====================================
@@ -102,6 +102,7 @@ module System.FilePath.Posix
     )
     where
 
+import Prelude
 import Data.Char(toLower, toUpper, isAsciiLower, isAsciiUpper)
 import Data.Maybe(isJust)
 import Data.List(stripPrefix, isSuffixOf)


=====================================
libraries/template-haskell/vendored-filepath/System/FilePath/Windows.hs
=====================================
@@ -102,6 +102,7 @@ module System.FilePath.Windows
     )
     where
 
+import Prelude
 import Data.Char(toLower, toUpper, isAsciiLower, isAsciiUpper)
 import Data.Maybe(isJust)
 import Data.List(stripPrefix, isSuffixOf)



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/d835af29aaf96bbb0682718b9bc24e04bab14081...36b83a79a4ee7654a2a5cdb03d26d13f5d219328
