[Git][ghc/ghc][wip/T23536-teo] Make template-haskell a stage1 package
Teo Camarasu (@teo)
gitlab at gitlab.haskell.org
Thu Apr 4 16:29:19 UTC 2024
Teo Camarasu pushed to branch wip/T23536-teo at Glasgow Haskell Compiler / GHC
Commits:
7c85b3d8 by Teo Camarasu at 2024-04-04T17:29:07+01:00
Make template-haskell a stage1 package
Promoting template-haskell from a stage0 to a stage1 package means that
we can much more easily refactor template-haskell.
In order to accomplish this we now compile stage0 packages using
the boot compiler's version of template-haskell.
This means that there are now two versions of template-haskell in play:
the boot compiler's version, and the in-tree version.
When compiling the stage1 compiler, we have to pick a version of
template-haskell to use.
During bootstrapping we want to use the same version as the final
compiler. This forces the in-tree version. We are only able to use the
internal interpreter with stage2 onwards. Yet, we could still use the
external interpreter.
The external interpreter runs splices in another process. Queries and
results are serialised. This reduces our compatibility requirements from
ABI compatibility with the internal interpreter to mere serialisation
compatibility. GHC may be compiled against a different library from the one
the external interpreter is compiled against, so long as both have exactly
the same serialisation of template-haskell types.
This opens up the strategy pursued in this patch.
When compiling the stage1 compiler we vendor the template-haskell and
ghc-boot-th libraries through ghc-boot and use these to define the Template
Haskell interface for the external interpreter. Note that at this point
we also have the template-haskell and ghc-boot-th packages in our
transitive dependency closure from the boot compiler, and some packages
like containers have dependencies on these to define Lift instances.
Then the external interpreter should be compiled against the regular
template-haskell library from the source tree. As this will have the
same serialised interface as what we vendor in ghc-boot, we can then
run splices.
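As a rough illustration of why serialisation compatibility is enough, the sketch below uses two separately defined but structurally identical types. `VendoredName`, `InTreeName`, and the `show`/`reads` wire format are hypothetical stand-ins chosen for this example; they are not the real template-haskell types, and this is not the binary protocol GHC and the external interpreter actually use.

```haskell
module Main (main) where

-- Hypothetical stand-in for a type compiled into the vendored copy of the
-- library (the compiler's side of the protocol).
data VendoredName = VendoredName String Int
  deriving (Eq, Show)

-- Hypothetical stand-in for the "same" type from the in-tree library
-- (the external interpreter's side). It is a distinct Haskell type.
data InTreeName = InTreeName String Int
  deriving (Eq, Show)

-- Assumed wire format for this sketch: the fields shown as a tuple.
encodeVendored :: VendoredName -> String
encodeVendored (VendoredName s n) = show (s, n)

-- The other side decodes the same wire format into its own type.
decodeInTree :: String -> Maybe InTreeName
decodeInTree str = case reads str of
  [((s, n), "")] -> Just (InTreeName s n)
  _              -> Nothing

main :: IO ()
main = print (decodeInTree (encodeVendored (VendoredName "foo" 42)))
```

The two types never need to unify in the type checker; they only need to agree on the bytes that cross the process boundary, which is the weaker requirement this patch relies on.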
GHC stage2 is also compiled as normal, against the template-haskell
library from the source tree.
See Note [Bootstrapping Template Haskell]
Resolves #23536
- - - - -
11 changed files:
- compiler/GHC/Tc/Gen/Splice.hs
- compiler/ghc.cabal.in
- hadrian/src/Rules/Dependencies.hs
- hadrian/src/Rules/ToolArgs.hs
- hadrian/src/Settings/Default.hs
- hadrian/src/Settings/Packages.hs
- libraries/ghc-boot/ghc-boot.cabal.in
- libraries/ghci/ghci.cabal.in
- libraries/template-haskell/Language/Haskell/TH/Syntax.hs
- libraries/template-haskell/vendored-filepath/System/FilePath/Posix.hs
- libraries/template-haskell/vendored-filepath/System/FilePath/Windows.hs
Changes:
=====================================
compiler/GHC/Tc/Gen/Splice.hs
=====================================
@@ -2916,3 +2916,108 @@ tcGetInterp = do
case hsc_interp hsc_env of
Nothing -> liftIO $ throwIO (InstallationError "Template haskell requires a target code interpreter")
Just i -> pure i
+
+-- Note [Bootstrapping Template Haskell]
+-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+--
+-- Template Haskell requires special attention when compiling GHC.
+-- The implementation of the Template Haskell set of features requires tight
+-- coupling between the compiler and the `template-haskell` library.
+-- This complicates the bootstrapping story as compatibility constraints are
+-- placed on the version of `template-haskell` used to compile GHC during a
+-- particular stage and the version bundled with it.
+--
+-- These constraints can be divided by the features they are used to implement,
+-- namely running splices either directly or via the external interpreter, and
+-- desugaring bracket syntax.
+--
+-- (C1) Executing splices within the compiler: In order to execute a splice
+-- within the compiler, we must be able to compile and load code built against
+-- the same version of the `template-haskell` library as the compiler. This
+-- is an ABI compatibility constraint between the `template-haskell` version of
+-- the compiler and the splice.
+-- (C2) Executing splices through the external interpreter: In order to execute
+-- a splice via the external interpreter, we serialise bytecode, run it with the
+-- external interpreter, and communicate back the result through a binary
+-- serialised interface. This is a binary serialisation compatibility constraint
+-- between the `template-haskell` version of the compiler and the splice.
+-- (C3) Desugaring bracket syntax: Bracket syntax is desugared by referring to a
+-- special wired-in package whose package id is `template-haskell`. So for
+-- instance an expression `'Just` gets desugared to something of type
+-- `template-haskell:Language.Haskell.TH.Syntax.Name`. Importantly, while this
+-- identifier is wired-in, the identity of the `template-haskell` package is
+-- not. So for instance we can successfully use an expression like
+-- `'Just :: Name` while compiling the `template-haskell` package as long as its
+-- package id is set to `template-haskell` as `Name` will resolve to the local
+-- identifier in the package (and the LHS and RHS will align). On the other
+-- hand, if we don't set the special package id, the type of the expression will
+-- be `template-haskell:...Name` while the `Name` on the RHS will resolve to the
+-- local identifier and we will get a type error. So, bracket syntax assumes the
+-- presence of a particular API in the `template-haskell` package, but it allows
+--
+-- These constraints are ranked from strongest to weakest. They only apply if we
+-- want to support the particular feature associated with them.
+--
+-- The tricky case is what to do when building the bootstrapping (stage1) GHC.
+-- The stage2 GHC is simpler as it can use the in-tree `template-haskell`
+-- package built by the stage1 GHC.
+--
+-- We should note that we cannot feasibly use the internal interpreter with a
+-- stage1 GHC. This is because the stage1 GHC was compiled with the stage0 GHC,
+-- which we assume is a different version. In order to run a splice that too
+-- would need to be compiled with the stage0 GHC, and so would all its
+-- dependencies.
+-- This allows us to disregard (C1) for the stage1 case.
+--
+-- In the past, we used to build the stage1 GHC and all its dependencies against
+-- the in-tree `template-haskell` library. This meant that we sacrificed (C2)
+-- because they are likely not serialisation compatible. We could not sacrifice
+-- (C3) because dependencies of GHC (such as `containers` and
+-- `template-haskell`) used bracket syntax to define `Lift` instances. This
+-- meant that the interface assumed by the boot compiler to implement bracket
+-- desugaring could not be modified (not even through CPP as (C1) would
+-- constrain us in future stages where we do support the internal interpreter).
+-- Yet, bracket syntax did work and gave us splices that desugared to code that
+-- referenced the in-tree version of `template-haskell` not the one the boot
+-- compiler required. So they could never be run.
+--
+-- Our current strategy is to not build `template-haskell` as a stage0 package.
+-- All of GHC's dependencies depend on the boot compiler's version, and produce
+-- runnable splices. How do we deal with the stage1 compiler's dependency on
+-- `template-haskell`? There are two options. (D1) depend on the boot
+-- compiler's version for stage1 and then depend on the in-tree one in stage2.
+-- This violates (C1) and (C2), so we wouldn't be able to run splices at all
+-- with the stage1 compiler. Additionally this would introduce quite a bit of
+-- CPP into the compiler and mean we would have to stub out much of the
+-- template-haskell machinery or have an unrunnable compatibility shim. Or (D2)
+-- depend on the in-tree version.
+--
+-- (D2) is what we implement, but it is complicated by the fact that it means we
+-- practically have two versions of `template-haskell` in the dependency graph
+-- of the stage1 compiler. To avoid this, we recall that we only need
+-- serialisation compatibility (as per (C2)), so we can avoid a direct dependency
+-- on the in-tree version by vendoring it. We choose to vendor it into the
+-- `ghc-boot` package as both `lib:ghc` and `ghci` require a dependency on the
+-- `template-haskell` interface as they define the two ends of the protocol.
+-- This allows us to still run splices through the external interpreter.
+--
+-- We should note a further edge-case with this approach. When compiling our
+-- vendored `template-haskell` library, we run afoul of (C3). The library
+-- defines several `Name`s using bracket syntax. As this package doesn't claim
+-- to be the wired-in package but it defines its own `Name` type, we get a type
+-- discrepancy with the `Name` type from the boot compiler's `template-haskell`
+-- library. Most of these are only used to define `Lift` instances, so in the
+-- vendored case we simply hide them behind CPP. Yet, there is one distinct use
+-- of a `Name`. We have a `Name` for the constructors of the `Multiplicity`
+-- type, which are also used in the pretty-printing module. We construct these
+-- manually instead. This allows us to completely avoid using bracket syntax
+-- for compiling the vendored `template-haskell` package.
+--
+-- To summarise, our current approach allows us to use the external interpreter
+-- to run splices and allows bracket syntax to be desugared correctly. In order
+-- to implement this we vendor the `template-haskell` library into `ghc-boot`
+-- and take special care to not use bracket syntax in those modules as that
+-- would incorrectly produce code that uses identifiers from the boot compiler's
+-- `template-haskell` library.
+--
+-- See #23536.
=====================================
compiler/ghc.cabal.in
=====================================
@@ -115,7 +115,6 @@ Library
containers >= 0.6.2.1 && < 0.8,
array >= 0.1 && < 0.6,
filepath >= 1 && < 1.6,
- template-haskell == 2.22.*,
hpc >= 0.6 && < 0.8,
transformers >= 0.5 && < 0.7,
exceptions == 0.10.*,
=====================================
hadrian/src/Rules/Dependencies.hs
=====================================
@@ -35,7 +35,10 @@ extra_dependencies =
where
th_internal = (templateHaskell, "Language.Haskell.TH.Lib.Internal")
- dep (p1, m1) (p2, m2) s = do
+ dep (p1, m1) (p2, m2) s =
+ -- We use the boot compiler's `template-haskell` library when building stage0,
+ -- so we don't need to register dependencies.
+ if isStage0 s then pure [] else do
let context = Context s p1 (error "extra_dependencies: way not set") (error "extra_dependencies: iplace not set")
ways <- interpretInContext context getLibraryWays
mapM (\way -> (,) <$> path s way p1 m1 <*> path s way p2 m2) (S.toList ways)
=====================================
hadrian/src/Rules/ToolArgs.hs
=====================================
@@ -158,7 +158,6 @@ toolTargets = [ binary
-- , ghc -- # depends on ghc library
-- , runGhc -- # depends on ghc library
, ghcBoot
- , ghcBootTh
, ghcPlatform
, ghcToolchain
, ghcToolchainBin
@@ -172,7 +171,6 @@ toolTargets = [ binary
, mtl
, parsec
, time
- , templateHaskell
, text
, transformers
, semaphoreCompat
=====================================
hadrian/src/Settings/Default.hs
=====================================
@@ -93,7 +93,6 @@ stage0Packages = do
, ghc
, runGhc
, ghcBoot
- , ghcBootTh
, ghcPlatform
, ghcHeap
, ghcToolchain
@@ -108,7 +107,6 @@ stage0Packages = do
, parsec
, semaphoreCompat
, time
- , templateHaskell
, text
, transformers
, unlit
@@ -143,6 +141,7 @@ stage1Packages = do
, deepseq
, exceptions
, ghc
+ , ghcBootTh
, ghcBignum
, ghcCompact
, ghcExperimental
@@ -156,6 +155,7 @@ stage1Packages = do
, pretty
, rts
, semaphoreCompat
+ , templateHaskell
, stm
, unlit
, xhtml
=====================================
hadrian/src/Settings/Packages.hs
=====================================
@@ -121,6 +121,10 @@ packageArgs = do
, builder (Cc CompileC) ? (not <$> flag CcLlvmBackend) ?
input "**/cbits/atomic.c" ? arg "-Wno-sync-nand" ]
+ -------------------------------- ghcBoot ------------------------------
+ , package ghcBoot ?
+ builder (Cabal Flags) ? (stage0 `cabalFlag` "bootstrap-th")
+
--------------------------------- ghci ---------------------------------
, package ghci ? mconcat
[
=====================================
libraries/ghc-boot/ghc-boot.cabal.in
=====================================
@@ -35,6 +35,15 @@ source-repository head
location: https://gitlab.haskell.org/ghc/ghc.git
subdir: libraries/ghc-boot
+Flag bootstrap-th
+ Description:
+ Enabled when building the stage1 compiler in order to vendor the in-tree
+ `template-haskell` library, while allowing dependencies to depend on the
+ boot `template-haskell` library.
+ See Note [Bootstrapping Template Haskell]
+ Default: False
+ Manual: True
+
Library
default-language: Haskell2010
other-extensions: DeriveGeneric, RankNTypes, ScopedTypeVariables
@@ -56,13 +65,6 @@ Library
GHC.UniqueSubdir
GHC.Version
- -- reexport modules from ghc-boot-th so that packages don't have to import
- -- both ghc-boot and ghc-boot-th. It makes the dependency graph easier to
- -- understand and to refactor.
- reexported-modules:
- GHC.LanguageExtensions.Type
- , GHC.ForeignSrcLang.Type
- , GHC.Lexeme
-- reexport platform modules from ghc-platform
reexported-modules:
@@ -81,7 +83,49 @@ Library
filepath >= 1.3 && < 1.6,
deepseq >= 1.4 && < 1.6,
ghc-platform >= 0.1,
- ghc-boot-th == @ProjectVersionMunged@
+ if flag(bootstrap-th)
+ cpp-options: -DBOOTSTRAP_TH
+ build-depends:
+ ghc-prim
+ , pretty
+ -- we vendor ghc-boot-th and template-haskell while bootstrapping TH.
+ -- This is to avoid having two copies of ghc-boot-th and template-haskell
+ -- in the build graph: one from the boot compiler and the in-tree one.
+ hs-source-dirs: . ../ghc-boot-th ../template-haskell ../template-haskell/vendored-filepath
+ exposed-modules:
+ GHC.LanguageExtensions.Type
+ , GHC.ForeignSrcLang.Type
+ , GHC.Lexeme
+ , Language.Haskell.TH
+ , Language.Haskell.TH.Lib
+ , Language.Haskell.TH.Ppr
+ , Language.Haskell.TH.PprLib
+ , Language.Haskell.TH.Quote
+ , Language.Haskell.TH.Syntax
+ , Language.Haskell.TH.LanguageExtensions
+ , Language.Haskell.TH.CodeDo
+ , Language.Haskell.TH.Lib.Internal
+
+ other-modules:
+ Language.Haskell.TH.Lib.Map
+ , System.FilePath
+ , System.FilePath.Posix
+ , System.FilePath.Windows
+ else
+ hs-source-dirs: .
+ build-depends:
+ ghc-boot-th == @ProjectVersionMunged@
+ , template-haskell == 2.22.0.0
+ -- reexport modules from ghc-boot-th and template-haskell so that packages
+ -- don't have to import all of ghc-boot, ghc-boot-th and template-haskell.
+ -- It makes the dependency graph easier to understand and to refactor
+ -- and reduces the amount of cabal flags we need to use for bootstrapping TH.
+ reexported-modules:
+ GHC.LanguageExtensions.Type
+ , GHC.ForeignSrcLang.Type
+ , GHC.Lexeme
+ , Language.Haskell.TH
+ , Language.Haskell.TH.Syntax
if !os(windows)
build-depends:
unix >= 2.7 && < 2.9
=====================================
libraries/ghci/ghci.cabal.in
=====================================
@@ -84,7 +84,6 @@ library
filepath >= 1.4 && < 1.6,
ghc-boot == @ProjectVersionMunged@,
ghc-heap == @ProjectVersionMunged@,
- template-haskell == 2.22.*,
transformers >= 0.5 && < 0.7
if !os(windows)
=====================================
libraries/template-haskell/Language/Haskell/TH/Syntax.hs
=====================================
@@ -34,49 +34,54 @@ module Language.Haskell.TH.Syntax
-- $infix
) where
-import qualified Data.Fixed as Fixed
+import Prelude
import Data.Data hiding (Fixity(..))
import Data.IORef
import System.IO.Unsafe ( unsafePerformIO )
import System.FilePath
import GHC.IO.Unsafe ( unsafeDupableInterleaveIO )
-import Control.Monad (liftM)
import Control.Monad.IO.Class (MonadIO (..))
import Control.Monad.Fix (MonadFix (..))
-import Control.Applicative (Applicative(..))
import Control.Exception (BlockedIndefinitelyOnMVar (..), catch, throwIO)
import Control.Exception.Base (FixIOException (..))
import Control.Concurrent.MVar (newEmptyMVar, readMVar, putMVar)
import System.IO ( hPutStrLn, stderr )
-import Data.Char ( isAlpha, isAlphaNum, isUpper, ord )
+import Data.Char ( isAlpha, isAlphaNum, isUpper )
import Data.Int
import Data.List.NonEmpty ( NonEmpty(..) )
-import Data.Void ( Void, absurd )
import Data.Word
import Data.Ratio
-import GHC.CString ( unpackCString# )
import GHC.Generics ( Generic )
-import GHC.Types ( Int(..), Word(..), Char(..), Double(..), Float(..),
- TYPE, RuntimeRep(..), Levity(..), Multiplicity (..) )
import qualified Data.Kind as Kind (Type)
-import GHC.Prim ( Int#, Word#, Char#, Double#, Float#, Addr# )
import GHC.Ptr ( Ptr, plusPtr )
import GHC.Lexeme ( startsVarSym, startsVarId )
import GHC.ForeignSrcLang.Type
import Language.Haskell.TH.LanguageExtensions
-import Numeric.Natural
import Prelude hiding (Applicative(..))
import Foreign.ForeignPtr
import Foreign.C.String
import Foreign.C.Types
+#ifdef BOOTSTRAP_TH
+import GHC.Types (TYPE, RuntimeRep(..), Levity(..))
+#else
+import Control.Monad (liftM)
+import Data.Char (ord)
+import qualified Data.Fixed as Fixed
+import GHC.Prim ( Int#, Word#, Char#, Double#, Float#, Addr# )
+import GHC.Types ( Int(..), Word(..), Char(..), Double(..), Float(..),
+ TYPE, RuntimeRep(..), Levity(..) )
+import GHC.CString ( unpackCString# )
+import GHC.ForeignPtr (ForeignPtr(..), ForeignPtrContents(..))
+import Data.Void ( Void, absurd )
+import Numeric.Natural
import Data.Array.Byte (ByteArray(..))
import GHC.Exts
( ByteArray#, unsafeFreezeByteArray#, copyAddrToByteArray#, newByteArray#
, isByteArrayPinned#, isTrue#, sizeofByteArray#, unsafeCoerce#, byteArrayContents#
, copyByteArray#, newPinnedByteArray#)
-import GHC.ForeignPtr (ForeignPtr(..), ForeignPtrContents(..))
import GHC.ST (ST(..), runST)
+#endif
-----------------------------------------------------
--
@@ -1018,6 +1023,8 @@ class Lift (t :: TYPE r) where
liftTyped :: Quote m => t -> Code m t
+-- See Note [Bootstrapping Template Haskell]
+#ifndef BOOTSTRAP_TH
-- If you add any instances here, consider updating test th/TH_Lift
instance Lift Integer where
liftTyped x = unsafeCodeCoerce (lift x)
@@ -1384,10 +1391,11 @@ rightName = 'Right
nonemptyName :: Name
nonemptyName = '(:|)
+#endif
oneName, manyName :: Name
-oneName = 'One
-manyName = 'Many
+oneName = mkNameG DataName "ghc-prim" "GHC.Types" "One"
+manyName = mkNameG DataName "ghc-prim" "GHC.Types" "Many"
-----------------------------------------------------
--
=====================================
libraries/template-haskell/vendored-filepath/System/FilePath/Posix.hs
=====================================
@@ -102,6 +102,7 @@ module System.FilePath.Posix
)
where
+import Prelude
import Data.Char(toLower, toUpper, isAsciiLower, isAsciiUpper)
import Data.Maybe(isJust)
import Data.List(stripPrefix, isSuffixOf)
=====================================
libraries/template-haskell/vendored-filepath/System/FilePath/Windows.hs
=====================================
@@ -102,6 +102,7 @@ module System.FilePath.Windows
)
where
+import Prelude
import Data.Char(toLower, toUpper, isAsciiLower, isAsciiUpper)
import Data.Maybe(isJust)
import Data.List(stripPrefix, isSuffixOf)
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/7c85b3d8be68c000c93698954a7930096f0a499d
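The Note in the patch describes constructing the `Multiplicity` constructor names by hand rather than with quote syntax. The sketch below contrasts the two ways of obtaining a `Name`, assuming a GHC whose `template-haskell` exports `mkNameG_d` from `Language.Haskell.TH.Syntax` (the patch itself uses the more general `mkNameG DataName`). The package and module strings passed to `mkNameG_d` are illustrative assumptions and may not match where `Just` actually lives in a given GHC release.

```haskell
{-# LANGUAGE TemplateHaskellQuotes #-}
module Main (main) where

import Language.Haskell.TH.Syntax (Name, mkNameG_d, nameBase, nameModule)

-- Quote syntax: the compiler desugars this against the wired-in
-- `template-haskell` package, which is what constraint (C3) is about.
quoted :: Name
quoted = 'Just

-- Manual construction: an ordinary function call, with no bracket
-- desugaring involved. The package and module strings here are
-- assumptions made for illustration only.
manual :: Name
manual = mkNameG_d "base" "GHC.Maybe" "Just"

main :: IO ()
main = do
  print (nameBase quoted, nameModule quoted)
  print (nameBase manual, nameModule manual)
```

Because the manual form is a plain function application, it type checks against whichever `Name` type is in scope, which is why it can be used inside the vendored copy of the library.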