[Git][ghc/ghc][wip/fendor/ifacetype-deduplication] 4 commits: Add deduplication table for `IfaceType`
Hannes Siebenhandl (@fendor)
gitlab at gitlab.haskell.org
Tue Apr 16 12:17:24 UTC 2024
Hannes Siebenhandl pushed to branch wip/fendor/ifacetype-deduplication at Glasgow Haskell Compiler / GHC
Commits:
fc204644 by Fendor at 2024-04-16T14:16:54+02:00
Add deduplication table for `IfaceType`
The type `IfaceType` is a highly redundant, tree-like data structure.
While benchmarking, we realised that the high redundancy of `IfaceType`
causes high memory consumption in GHCi sessions.
We fix this by adding a deduplication table to the serialisation of
`ModIface`, similar to how we deduplicate `Name`s and `FastString`s.
When reading the interface file back, the table allows us to automatically
share identical values of `IfaceType`.
This deduplication has the beneficial side effect of additionally reducing
the size of the on-disk interface files tremendously.
We also add an `IfaceType` deduplication table to .hie serialisation and
refactor .hie file serialisation to use the same infrastructure as
`putWithTables`.
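
The idea behind the deduplication table described above can be sketched in a
few lines. This is only an illustration, not the code in this patch: `Ty`,
`Table`, `intern`, `encode` and `decode` are hypothetical names, `Ty` is a toy
stand-in for `IfaceType`, and only top-level values are deduplicated here. The
real implementation works on a binary handle and a generic symbol table.

```haskell
import Data.List (mapAccumL, sortOn)
import qualified Data.Map.Strict as Map

-- Toy stand-in for 'IfaceType'; NOT the real GHC type.
data Ty = TyCon String [Ty]
  deriving (Eq, Ord, Show)

-- Hypothetical table: each distinct value maps to the index it was first given.
type Table = Map.Map Ty Int

-- Look a value up, allocating the next free index on a miss.
intern :: Table -> Ty -> (Table, Int)
intern tbl ty = case Map.lookup ty tbl of
  Just ix -> (tbl, ix)
  Nothing -> let ix = Map.size tbl in (Map.insert ty ix tbl, ix)

-- "Serialise": the payload becomes a list of indices, and the table is
-- emitted separately with its entries in index order.
encode :: [Ty] -> ([Int], [Ty])
encode tys = (ixs, entries)
  where
    (tbl, ixs) = mapAccumL intern Map.empty tys
    entries    = map fst (sortOn snd (Map.toList tbl))

-- "Deserialise": every occurrence of an index resolves to the same table
-- entry, so equal values end up shared rather than duplicated.
decode :: ([Int], [Ty]) -> [Ty]
decode (ixs, entries) = map (entries !!) ixs

main :: IO ()
main = do
  let int      = TyCon "Int" []
      maybeInt = TyCon "Maybe" [int]
      payload  = [maybeInt, int, maybeInt, maybeInt]
      (ixs, entries) = encode payload
  print ixs                                 -- [0,1,0,0]
  print (length entries)                    -- 2
  print (decode (ixs, entries) == payload)  -- True
```

Four payload entries are stored as two table entries plus four small indices,
which is where both the on-disk and the in-memory savings would come from in
this simplified model.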
- - - - -
3116d68b by Matthew Pickering at 2024-04-16T14:16:54+02:00
Add run-time configurability of .hi file compression
Introduce the flag `-fwrite-if-compression=<n>`, which configures the
compression level used when writing .hi files.
Reading .hi files doesn't need to know the initial compression level,
and can always deserialise a `ModIface`.
This allows users to experiment with different compression levels for
packages, without recompilation of dependencies.
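
A rough sketch of why the reader does not need to know the compression level:
in the patch, a shared `IfaceType` is written with the tag byte
`ifaceTypeSharedByte` (99) followed by a table reference, and any other leading
byte is decoded as a literal value. The model below is heavily simplified and
its names (`sharedByte`, `Decoded`, `decodeOne`, `decodeAll`) are hypothetical.
It treats the table reference and each literal value as a single byte, which
the real format does not.

```haskell
import Data.List (unfoldr)
import Data.Word (Word8)

-- Same value as 'ifaceTypeSharedByte' in the patch.
sharedByte :: Word8
sharedByte = 99

-- Toy result: either a reference into the table or an inline value.
data Decoded = FromTable Int | Inline Word8
  deriving (Eq, Show)

-- Dispatch on the leading tag only; nothing here depends on which
-- compression level the writer used.
decodeOne :: [Word8] -> Maybe (Decoded, [Word8])
decodeOne [] = Nothing
decodeOne (tag : rest)
  | tag == sharedByte = case rest of
      ix : rest' -> Just (FromTable (fromIntegral ix), rest')
      []         -> Nothing -- truncated input: tag without a reference
  | otherwise = Just (Inline tag, rest)

decodeAll :: [Word8] -> [Decoded]
decodeAll = unfoldr decodeOne

main :: IO ()
main = print (decodeAll [99, 0, 5, 99, 1])
-- [FromTable 0,Inline 5,FromTable 1]
```

Because the choice between the two encodings is made per value, a single file
may mix shared and literal entries, and files written at different levels can
be read by the same code path.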
We introduce three compression levels:
* `1`: `Normal` mode. This is the least amount of compression.
It deduplicates only `Name` and `FastString`s, and is naturally the
fastest compression mode.
* `2`: `Safe` mode. It has a noticeable impact on .hi file size and is
marginally slower than `Normal` mode. In general, it should be safe to
always use `Safe` mode.
* `3`: `Full` deduplication mode. Deduplicate as much as we can,
resulting in minimal .hi files, but at the cost of additional
compilation time.
Note that the deduplication also has the additional side effect of reduced
memory consumption due to implicit sharing of deduplicated elements.
See https://gitlab.haskell.org/ghc/ghc/-/issues/24540 for an example where
that matters.
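
The mapping from the numeric flag value to a compression mode can be
illustrated as follows. This sketch mirrors the case analysis of
`flagsToIfCompression` from this patch, but takes the `Int` directly rather
than a `DynFlags` value; `levelToCompression` is a hypothetical name used only
for this example.

```haskell
-- Mirrors the constructors of 'CompressionIFace' in the patch.
data CompressionIFace
  = NormalCompression
  | SafeExtraCompression
  | MaximalCompression
  deriving (Eq, Show)

-- Same case analysis as 'flagsToIfCompression', without the 'DynFlags' wrapper.
levelToCompression :: Int -> CompressionIFace
levelToCompression n = case n of
  0 -> NormalCompression
  1 -> NormalCompression
  2 -> SafeExtraCompression
  _ -> MaximalCompression -- any other value, including values above 3

main :: IO ()
main = mapM_ (print . levelToCompression) [1, 2, 3]
```

Note that, as written in the patch, `0` behaves like `1`, any value outside
`0`..`2` selects the maximal mode, and the default in `defaultDynFlags` is `2`.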
-------------------------
Metric Decrease:
T21839c
T24471
-------------------------
- - - - -
ef207f0e by Matthew Pickering at 2024-04-16T14:16:54+02:00
Add some tests to check for size of interface files when serialising
various types
- - - - -
0823ceae by Fendor at 2024-04-16T14:16:54+02:00
Implement TrieMap for IfaceType
- - - - -
24 changed files:
- compiler/GHC/Core/Map/Expr.hs
- compiler/GHC/Data/TrieMap.hs
- compiler/GHC/Driver/DynFlags.hs
- compiler/GHC/Driver/Main.hs
- compiler/GHC/Driver/Session.hs
- compiler/GHC/Iface/Binary.hs
- compiler/GHC/Iface/Ext/Binary.hs
- compiler/GHC/Iface/Load.hs
- compiler/GHC/Iface/Recomp/Binary.hs
- compiler/GHC/Iface/Type.hs
- + compiler/GHC/Iface/Type/Map.hs
- compiler/GHC/Stg/CSE.hs
- compiler/GHC/StgToJS/Object.hs
- compiler/GHC/Utils/Binary.hs
- compiler/ghc.cabal.in
- docs/users_guide/using-optimisation.rst
- + testsuite/tests/iface/IfaceSharingIfaceType.hs
- + testsuite/tests/iface/IfaceSharingName.hs
- + testsuite/tests/iface/Lib.hs
- + testsuite/tests/iface/Makefile
- + testsuite/tests/iface/all.T
- + testsuite/tests/iface/if_faststring.hs
- + testsuite/tests/iface/if_ifacetype.hs
- + testsuite/tests/iface/if_name.hs
Changes:
=====================================
compiler/GHC/Core/Map/Expr.hs
=====================================
@@ -40,6 +40,7 @@ import GHC.Utils.Outputable
import qualified Data.Map as Map
import GHC.Types.Name.Env
import Control.Monad( (>=>) )
+import GHC.Types.Literal (Literal)
{-
This module implements TrieMaps over Core related data structures
@@ -128,6 +129,8 @@ instance TrieMap CoreMap where
-- inside another 'TrieMap', this is the type you want.
type CoreMapG = GenMap CoreMapX
+type LiteralMap = Map.Map Literal
+
-- | @CoreMapX a@ is the base map from @DeBruijn CoreExpr@ to @a@, but without
-- the 'GenMap' optimization.
data CoreMapX a
=====================================
compiler/GHC/Data/TrieMap.hs
=====================================
@@ -13,8 +13,6 @@ module GHC.Data.TrieMap(
MaybeMap,
-- * Maps over 'List' values
ListMap,
- -- * Maps over 'Literal's
- LiteralMap,
-- * 'TrieMap' class
TrieMap(..), insertTM, deleteTM, foldMapTM, isEmptyTM,
@@ -30,7 +28,6 @@ module GHC.Data.TrieMap(
import GHC.Prelude
-import GHC.Types.Literal
import GHC.Types.Unique.DFM
import GHC.Types.Unique( Uniquable )
@@ -39,6 +36,8 @@ import qualified Data.IntMap as IntMap
import GHC.Utils.Outputable
import Control.Monad( (>=>) )
import Data.Kind( Type )
+import Data.Functor.Compose
+import Data.Functor.Product
import qualified Data.Semigroup as S
@@ -343,15 +342,87 @@ ftList :: TrieMap m => (a -> Bool) -> ListMap m a -> ListMap m a
ftList f (LM { lm_nil = mnil, lm_cons = mcons })
= LM { lm_nil = filterMaybe f mnil, lm_cons = fmap (filterTM f) mcons }
-{-
-************************************************************************
-* *
- Basic maps
-* *
-************************************************************************
--}
+{- Composition -}
+
+instance (TrieMap m, TrieMap n) => TrieMap (Compose m n) where
+ type Key (Compose m n) = (Key m, Key n)
+ emptyTM = Compose emptyTM
+ lookupTM = lkCompose lookupTM lookupTM
+ {-# INLINE lookupTM #-}
+ alterTM = xtCompose alterTM alterTM
+ {-# INLINE alterTM #-}
+ foldTM = fdCompose
+ {-# INLINE foldTM #-}
+ filterTM = ftCompose
+ {-# INLINE filterTM #-}
+
+lkCompose :: Monad m => (t1 -> f (g a1) -> m a2) -> (t2 -> a2 -> m b) -> (t1, t2) -> Compose f g a1 -> m b
+lkCompose f g (a, b) (Compose m) = f a m >>= g b
+{-# INLINE lkCompose #-}
+
+xtCompose ::
+ (TrieMap m, TrieMap n)
+ => (forall a . Key m -> XT a -> m a -> m a)
+ -> (forall a . Key n -> XT a -> n a -> n a)
+ -> Key (Compose m n)
+ -> XT a
+ -> Compose m n a
+ -> Compose m n a
+
+xtCompose f g (a, b) xt (Compose m) = Compose ((f a |>> g b xt) m)
+
+{-# INLINE xtCompose #-}
+
+fdCompose :: (TrieMap m1, TrieMap m2) => (a -> b -> b) -> Compose m1 m2 a -> b -> b
+fdCompose f (Compose m) = foldTM (foldTM f) m
+
+{-# INLINE fdCompose #-}
+
+
+ftCompose :: (TrieMap n, Functor m) => (a -> Bool) -> Compose m n a -> Compose m n a
+ftCompose f (Compose m) = Compose (fmap (filterTM f) m)
+
+{-# INLINE ftCompose #-}
+
+{- Product -}
+instance (TrieMap m, TrieMap n) => TrieMap (Product m n) where
+ type Key (Product m n) = Either (Key m) (Key n)
+ emptyTM = Pair emptyTM emptyTM
+ lookupTM = lkProduct
+ {-# INLINE lookupTM #-}
+ alterTM = xtProduct
+ {-# INLINE alterTM #-}
+ foldTM = fdProduct
+ {-# INLINE foldTM #-}
+ filterTM = ftProduct
+ {-# INLINE filterTM #-}
+
+lkProduct :: (TrieMap m1, TrieMap m2) => Either (Key m1) (Key m2) -> Product m1 m2 b -> Maybe b
+lkProduct k (Pair am bm) =
+ case k of
+ Left a -> lookupTM a am
+ Right b -> lookupTM b bm
+
+{-# INLINE lkProduct #-}
+
+xtProduct :: (TrieMap f, TrieMap g) => Either (Key f) (Key g) -> XT a -> Product f g a -> Product f g a
+xtProduct k xt (Pair am bm) =
+ case k of
+ Left a -> Pair (alterTM a xt am) bm
+ Right b -> Pair am (alterTM b xt bm)
+
+{-# INLINE xtProduct #-}
+
+fdProduct :: (TrieMap f, TrieMap g) => (a -> c -> c) -> Product f g a -> c -> c
+fdProduct f (Pair am bm) = foldTM f am . foldTM f bm
+
+{-# INLINE fdProduct #-}
+
+ftProduct :: (TrieMap f, TrieMap g) => (a -> Bool) -> Product f g a -> Product f g a
+ftProduct f (Pair am bm) = Pair (filterTM f am) (filterTM f bm)
+
+{-# INLINE ftProduct #-}
-type LiteralMap a = Map.Map Literal a
{-
************************************************************************
=====================================
compiler/GHC/Driver/DynFlags.hs
=====================================
@@ -207,6 +207,7 @@ data DynFlags = DynFlags {
dmdUnboxWidth :: !Int, -- ^ Whether DmdAnal should optimistically put an
-- Unboxed demand on returned products with at most
-- this number of fields
+ ifCompression :: Int,
specConstrThreshold :: Maybe Int, -- ^ Threshold for SpecConstr
specConstrCount :: Maybe Int, -- ^ Max number of specialisations for any one function
specConstrRecursive :: Int, -- ^ Max number of specialisations for recursive types
@@ -546,6 +547,7 @@ defaultDynFlags mySettings =
maxPmCheckModels = 30,
simplTickFactor = 100,
dmdUnboxWidth = 3, -- Default: Assume an unboxed demand on function bodies returning a triple
+ ifCompression = 2, -- Default: Apply safe compressions
specConstrThreshold = Just 2000,
specConstrCount = Just 3,
specConstrRecursive = 3,
=====================================
compiler/GHC/Driver/Main.hs
=====================================
@@ -163,7 +163,7 @@ import GHC.JS.Syntax
import GHC.IfaceToCore ( typecheckIface, typecheckWholeCoreBindings )
-import GHC.Iface.Load ( ifaceStats, writeIface )
+import GHC.Iface.Load ( ifaceStats, writeIface, flagsToIfCompression )
import GHC.Iface.Make
import GHC.Iface.Recomp
import GHC.Iface.Tidy
@@ -612,7 +612,7 @@ extract_renamed_stuff mod_summary tc_result = do
-- enables the option which keeps the renamed source.
hieFile <- mkHieFile mod_summary tc_result (fromJust rn_info)
let out_file = ml_hie_file $ ms_location mod_summary
- liftIO $ writeHieFile out_file hieFile
+ liftIO $ writeHieFile (flagsToIfCompression dflags) out_file hieFile
liftIO $ putDumpFileMaybe logger Opt_D_dump_hie "HIE AST" FormatHaskell (ppr $ hie_asts hieFile)
-- Validate HIE files
@@ -1207,7 +1207,7 @@ hscMaybeWriteIface logger dflags is_simple iface old_iface mod_location = do
withTiming logger
(text "WriteIface"<+>brackets (text iface_name))
(const ())
- (writeIface logger profile iface_name iface)
+ (writeIface logger profile (flagsToIfCompression dflags) iface_name iface)
if (write_interface || force_write_interface) then do
=====================================
compiler/GHC/Driver/Session.hs
=====================================
@@ -1664,6 +1664,9 @@ dynamic_flags_deps = [
, make_ord_flag defFlag "fno-refinement-level-hole-fits"
(noArg (\d -> d { refLevelHoleFits = Nothing }))
+ , make_ord_flag defFlag "fwrite-if-compression"
+ (intSuffix (\n d -> d { ifCompression = n }))
+
, make_dep_flag defGhcFlag "fllvm-pass-vectors-in-regs"
(noArg id)
"vectors registers are now passed in registers by default."
=====================================
compiler/GHC/Iface/Binary.hs
=====================================
@@ -14,6 +14,7 @@ module GHC.Iface.Binary (
writeBinIface,
readBinIface,
readBinIfaceHeader,
+ CompressionIFace(..),
getSymtabName,
CheckHiWay(..),
TraceBinIFace(..),
@@ -25,6 +26,8 @@ module GHC.Iface.Binary (
putName,
putSymbolTable,
BinSymbolTable(..),
+ initWriteIfaceType, initReadIfaceTypeTable,
+ putAllTables,
) where
import GHC.Prelude
@@ -46,14 +49,19 @@ import GHC.Types.SrcLoc
import GHC.Platform
import GHC.Settings.Constants
import GHC.Utils.Fingerprint
+import GHC.Iface.Type (IfaceType(..), getIfaceType, putIfaceType, ifaceTypeSharedByte)
+import Control.Monad
import Data.Array
import Data.Array.IO
import Data.Array.Unsafe
import Data.Char
-import Data.Word
import Data.IORef
-import Control.Monad
+import Data.Map.Strict (Map)
+import Data.Word
+import System.IO.Unsafe
+import Data.Typeable (Typeable)
+
-- ---------------------------------------------------------------------------
-- Reading and writing binary interface files
@@ -66,6 +74,21 @@ data TraceBinIFace
= TraceBinIFace (SDoc -> IO ())
| QuietBinIFace
+data CompressionIFace
+ = NormalCompression
+ -- ^ Perform the normal compression operations,
+ -- such as deduplicating 'Name's and 'FastString's
+ | SafeExtraCompression
+ -- ^ Perform some extra compression steps that have minimal impact
+ -- on the run-time of 'ghc'.
+ --
+ -- This reduces the size of '.hi' files significantly in some cases
+ -- and reduces overall memory usage in certain scenarios.
+ | MaximalCompression
+ -- ^ Try to compress as much as possible.
+ --
+ -- Yields the smallest '.hi' files but at the cost of additional run-time.
+
-- | Read an interface file header, checking the magic number, version, and
-- way. Returns the hash of the source file and a BinHandle which points at the
-- start of the rest of the interface file data.
@@ -158,30 +181,42 @@ getWithUserData name_cache bh = do
-- Reading names has the side effect of adding them into the given NameCache.
getTables :: NameCache -> ReadBinHandle -> IO ReadBinHandle
getTables name_cache bh = do
+ bhRef <- newIORef (error "used too soon")
+ -- It is important this is passed to 'getTable'
+ ud <- unsafeInterleaveIO (readIORef bhRef)
+
fsReaderTable <- initFastStringReaderTable
nameReaderTable <- initNameReaderTable name_cache
-
-
- -- The order of these deserialisation matters!
- --
- -- See Note [Order of deduplication tables during iface binary serialisation] for details.
- fsTable <- Binary.forwardGet bh (getTable fsReaderTable bh)
- let
- fsReader = mkReaderFromTable fsReaderTable fsTable
- bhFs = addReaderToUserData (mkSomeBinaryReader fsReader) bh
-
- nameTable <- Binary.forwardGet bh (getTable nameReaderTable bhFs)
- let
- nameReader = mkReaderFromTable nameReaderTable nameTable
- bhName = addReaderToUserData (mkSomeBinaryReader nameReader) bhFs
-
- pure bhName
+ ifaceTypeReaderTable <- initReadIfaceTypeTable ud
+
+ let -- For any 'ReaderTable', we decode the table that is found at the location
+ -- the forward reference points to.
+ -- After decoding the table, we create a 'BinaryReader' and immediately
+ -- add it to the 'ReaderUserData' of 'ReadBinHandle'.
+ decodeReaderTable :: Typeable a => ReaderTable a -> ReadBinHandle -> IO ReadBinHandle
+ decodeReaderTable tbl bh0 = do
+ table <- Binary.forwardGet bh (getTable tbl bh0)
+ let binaryReader = mkReaderFromTable tbl table
+ pure $ addReaderToUserData binaryReader bh0
+
+ -- Decode all the tables and populate the 'ReaderUserData'.
+ bhFinal <- foldM (\bh0 act -> act bh0) bh
+ -- The order of these deserialisation matters!
+ --
+ -- See Note [Order of deduplication tables during iface binary serialisation] for details.
+ [ decodeReaderTable fsReaderTable
+ , decodeReaderTable nameReaderTable
+ , decodeReaderTable ifaceTypeReaderTable
+ ]
+
+ writeIORef bhRef (getReaderUserData bhFinal)
+ pure bhFinal
-- | Write an interface file.
--
-- See Note [Deduplication during iface binary serialisation] for details.
-writeBinIface :: Profile -> TraceBinIFace -> FilePath -> ModIface -> IO ()
-writeBinIface profile traceBinIface hi_path mod_iface = do
+writeBinIface :: Profile -> TraceBinIFace -> CompressionIFace -> FilePath -> ModIface -> IO ()
+writeBinIface profile traceBinIface compressionLevel hi_path mod_iface = do
bh <- openBinMem initBinMemSize
let platform = profilePlatform profile
put_ bh (binaryInterfaceMagic platform)
@@ -195,7 +230,7 @@ writeBinIface profile traceBinIface hi_path mod_iface = do
extFields_p_p <- tellBinWriter bh
put_ bh extFields_p_p
- putWithUserData traceBinIface bh mod_iface
+ putWithUserData traceBinIface compressionLevel bh mod_iface
extFields_p <- tellBinWriter bh
putAt bh extFields_p_p extFields_p
@@ -209,9 +244,9 @@ writeBinIface profile traceBinIface hi_path mod_iface = do
-- is necessary if you want to serialise Names or FastStrings.
-- It also writes a symbol table and the dictionary.
-- This segment should be read using `getWithUserData`.
-putWithUserData :: Binary a => TraceBinIFace -> WriteBinHandle -> a -> IO ()
-putWithUserData traceBinIface bh payload = do
- (name_count, fs_count, _b) <- putWithTables bh (\bh' -> put bh' payload)
+putWithUserData :: Binary a => TraceBinIFace -> CompressionIFace -> WriteBinHandle -> a -> IO ()
+putWithUserData traceBinIface compressionLevel bh payload = do
+ (name_count, fs_count, _b) <- putWithTables compressionLevel bh (\bh' -> put bh' payload)
case traceBinIface of
QuietBinIFace -> return ()
@@ -234,17 +269,19 @@ putWithUserData traceBinIface bh payload = do
-- It returns (number of names, number of FastStrings, payload write result)
--
-- See Note [Order of deduplication tables during iface binary serialisation]
-putWithTables :: WriteBinHandle -> (WriteBinHandle -> IO b) -> IO (Int, Int, b)
-putWithTables bh' put_payload = do
+putWithTables :: CompressionIFace -> WriteBinHandle -> (WriteBinHandle -> IO b) -> IO (Int, Int, b)
+putWithTables compressionLevel bh' put_payload = do
-- Initialise deduplicating tables.
(fast_wt, fsWriter) <- initFastStringWriterTable
(name_wt, nameWriter) <- initNameWriterTable
+ (ifaceType_wt, ifaceTypeWriter) <- initWriteIfaceType compressionLevel
-- Initialise the 'WriterUserData'.
let writerUserData = mkWriterUserData
[ mkSomeBinaryWriter @FastString fsWriter
, mkSomeBinaryWriter @Name nameWriter
, mkSomeBinaryWriter @BindingName $ mkWriter (\bh name -> putEntry nameWriter bh (getBindingName name))
+ , mkSomeBinaryWriter @IfaceType ifaceTypeWriter
]
let bh = setWriterUserData bh' writerUserData
@@ -252,18 +289,24 @@ putWithTables bh' put_payload = do
-- The order of these entries matters!
--
-- See Note [Order of deduplication tables during iface binary serialisation] for details.
- putAllTables bh [fast_wt, name_wt] $ do
+ putAllTables bh [fast_wt, name_wt, ifaceType_wt] $ do
put_payload bh
return (name_count, fs_count, r)
- where
- putAllTables _ [] act = do
- a <- act
- pure ([], a)
- putAllTables bh (x : xs) act = do
- (r, (res, a)) <- forwardPut bh (const $ putTable x bh) $ do
- putAllTables bh xs act
- pure (r : res, a)
+
+-- | Write all deduplication tables to disk after serialising the
+-- main payload.
+--
+-- Writes forward pointers to the deduplication tables before writing the payload
+-- to allow deserialisation *before* the payload is read again.
+putAllTables :: WriteBinHandle -> [WriterTable] -> IO b -> IO ([Int], b)
+putAllTables _ [] act = do
+ a <- act
+ pure ([], a)
+putAllTables bh (x : xs) act = do
+ (r, (res, a)) <- forwardPut bh (const $ putTable x bh) $ do
+ putAllTables bh xs act
+ pure (r : res, a)
-- | Initial ram buffer to allocate for writing interface files
initBinMemSize :: Int
@@ -429,6 +472,42 @@ Here, a visualisation of the table structure we currently have (ignoring 'Extens
-- The symbol table
--
+initReadIfaceTypeTable :: ReaderUserData -> IO (ReaderTable IfaceType)
+initReadIfaceTypeTable ud = do
+ pure $
+ ReaderTable
+ { getTable = getGenericSymbolTable (\bh -> getIfaceType (setReaderUserData bh ud))
+ , mkReaderFromTable = \tbl -> mkReader (getGenericSymtab tbl)
+ }
+
+initWriteIfaceType :: CompressionIFace -> IO (WriterTable, BinaryWriter IfaceType)
+initWriteIfaceType compressionLevel = do
+ sym_tab <- initGenericSymbolTable @(Map IfaceType)
+ pure
+ ( WriterTable
+ { putTable = putGenericSymbolTable sym_tab (lazyPut' putIfaceType)
+ }
+ , mkWriter $ ifaceWriter sym_tab
+ )
+ where
+ ifaceWriter sym_tab = case compressionLevel of
+ NormalCompression -> literalIfaceTypeSerialiser
+ SafeExtraCompression -> ifaceTyConAppSerialiser sym_tab
+ MaximalCompression -> fullIfaceTypeSerialiser sym_tab
+
+ ifaceTyConAppSerialiser sym_tab bh ty = case ty of
+ IfaceTyConApp {} -> do
+ put_ bh ifaceTypeSharedByte
+ putGenericSymTab sym_tab bh ty
+ _ -> putIfaceType bh ty
+
+
+ fullIfaceTypeSerialiser sym_tab bh ty = do
+ put_ bh ifaceTypeSharedByte
+ putGenericSymTab sym_tab bh ty
+
+ literalIfaceTypeSerialiser = putIfaceType
+
initNameReaderTable :: NameCache -> IO (ReaderTable Name)
initNameReaderTable cache = do
=====================================
compiler/GHC/Iface/Ext/Binary.hs
=====================================
@@ -38,22 +38,21 @@ import Data.ByteString ( ByteString )
import qualified Data.ByteString as BS
import qualified Data.ByteString.Char8 as BSC
import Data.Word ( Word8, Word32 )
-import Control.Monad ( replicateM, when, forM_ )
+import Control.Monad ( replicateM, when, forM_, foldM )
import System.Directory ( createDirectoryIfMissing )
import System.FilePath ( takeDirectory )
import GHC.Iface.Ext.Types
+import GHC.Iface.Binary (initWriteIfaceType, putAllTables, initReadIfaceTypeTable, CompressionIFace)
+import GHC.Iface.Type (IfaceType)
+import System.IO.Unsafe (unsafeInterleaveIO)
+import qualified GHC.Utils.Binary as Binary
data HieSymbolTable = HieSymbolTable
{ hie_symtab_next :: !FastMutInt
, hie_symtab_map :: !(IORef (UniqFM Name (Int, HieName)))
}
-data HieDictionary = HieDictionary
- { hie_dict_next :: !FastMutInt -- The next index to use
- , hie_dict_map :: !(IORef (UniqFM FastString (Int,FastString))) -- indexed by FastString
- }
-
initBinMemSize :: Int
initBinMemSize = 1024*1024
@@ -74,8 +73,8 @@ putBinLine bh xs = do
-- | Write a `HieFile` to the given `FilePath`, with a proper header and
-- symbol tables for `Name`s and `FastString`s
-writeHieFile :: FilePath -> HieFile -> IO ()
-writeHieFile hie_file_path hiefile = do
+writeHieFile :: CompressionIFace -> FilePath -> HieFile -> IO ()
+writeHieFile compression hie_file_path hiefile = do
bh0 <- openBinMem initBinMemSize
-- Write the header: hieHeader followed by the
@@ -84,58 +83,58 @@ writeHieFile hie_file_path hiefile = do
putBinLine bh0 $ BSC.pack $ show hieVersion
putBinLine bh0 $ ghcVersion
- -- remember where the dictionary pointer will go
- dict_p_p <- tellBinWriter bh0
- put_ bh0 dict_p_p
+ (fs_tbl, fs_w) <- initFastStringWriterTable
+ (name_tbl, name_w) <- initWriteNameTable
+ (iface_tbl, iface_w) <- initWriteIfaceType compression
- -- remember where the symbol table pointer will go
- symtab_p_p <- tellBinWriter bh0
- put_ bh0 symtab_p_p
+ let bh = setWriterUserData bh0 $ mkWriterUserData
+ [ mkSomeBinaryWriter @IfaceType iface_w
+ , mkSomeBinaryWriter @Name name_w
+ , mkSomeBinaryWriter @BindingName (simpleBindingNameWriter name_w)
+ , mkSomeBinaryWriter @FastString fs_w
+ ]
- -- Make some initial state
- symtab_next <- newFastMutInt 0
- symtab_map <- newIORef emptyUFM :: IO (IORef (UniqFM Name (Int, HieName)))
- let hie_symtab = HieSymbolTable {
- hie_symtab_next = symtab_next,
- hie_symtab_map = symtab_map }
- dict_next_ref <- newFastMutInt 0
- dict_map_ref <- newIORef emptyUFM
- let hie_dict = HieDictionary {
- hie_dict_next = dict_next_ref,
- hie_dict_map = dict_map_ref }
-
- -- put the main thing
- let bh = setWriterUserData bh0
- $ newWriteState (putName hie_symtab)
- (putName hie_symtab)
- (putFastString hie_dict)
- put_ bh hiefile
-
- -- write the symtab pointer at the front of the file
- symtab_p <- tellBinWriter bh
- putAt bh symtab_p_p symtab_p
- seekBinWriter bh symtab_p
-
- -- write the symbol table itself
- symtab_next' <- readFastMutInt symtab_next
- symtab_map' <- readIORef symtab_map
- putSymbolTable bh symtab_next' symtab_map'
-
- -- write the dictionary pointer at the front of the file
- dict_p <- tellBinWriter bh
- putAt bh dict_p_p dict_p
- seekBinWriter bh dict_p
-
- -- write the dictionary itself
- dict_next <- readFastMutInt dict_next_ref
- dict_map <- readIORef dict_map_ref
- putDictionary bh dict_next dict_map
+ -- Discard number of written elements
+ -- Order matters! See Note [Order of deduplication tables during iface binary serialisation]
+ _ <- putAllTables bh [fs_tbl, name_tbl, iface_tbl] $ do
+ put_ bh hiefile
-- and send the result to the file
createDirectoryIfMissing True (takeDirectory hie_file_path)
writeBinMem bh hie_file_path
return ()
+initWriteNameTable :: IO (WriterTable, BinaryWriter Name)
+initWriteNameTable = do
+ symtab_next <- newFastMutInt 0
+ symtab_map <- newIORef emptyUFM
+ let bin_symtab =
+ HieSymbolTable
+ { hie_symtab_next = symtab_next
+ , hie_symtab_map = symtab_map
+ }
+
+ let put_symtab bh = do
+ name_count <- readFastMutInt symtab_next
+ symtab_map <- readIORef symtab_map
+ putSymbolTable bh name_count symtab_map
+ pure name_count
+
+ return
+ ( WriterTable
+ { putTable = put_symtab
+ }
+ , mkWriter $ putName bin_symtab
+ )
+
+initReadNameTable :: NameCache -> IO (ReaderTable Name)
+initReadNameTable cache = do
+ return $
+ ReaderTable
+ { getTable = \bh -> getSymbolTable bh cache
+ , mkReaderFromTable = \tbl -> mkReader (getSymTabName tbl)
+ }
+
data HieFileResult
= HieFileResult
{ hie_file_result_version :: Integer
@@ -216,50 +215,32 @@ readHieFileHeader file bh0 = do
readHieFileContents :: ReadBinHandle -> NameCache -> IO HieFile
readHieFileContents bh0 name_cache = do
- dict <- get_dictionary bh0
+ bhRef <- newIORef (error "used too soon")
+ -- It is important this is passed to 'getTable'
+ ud <- unsafeInterleaveIO (readIORef bhRef)
+
+ fsReaderTable <- initFastStringReaderTable
+ nameReaderTable <- initReadNameTable name_cache
+ ifaceTypeReaderTable <- initReadIfaceTypeTable ud
+
-- read the symbol table so we are capable of reading the actual data
- bh1 <- do
- let bh1 = setReaderUserData bh0
- $ newReadState (error "getSymtabName")
- (getDictFastString dict)
- symtab <- get_symbol_table bh1
- let bh1' = setReaderUserData bh1
- $ newReadState (getSymTabName symtab)
- (getDictFastString dict)
- return bh1'
+ bh1 <-
+ foldM (\bh tblReader -> tblReader bh) bh0
+ [ get_dictionary fsReaderTable
+ , get_dictionary nameReaderTable
+ , get_dictionary ifaceTypeReaderTable
+ ]
-- load the actual data
get bh1
where
- get_dictionary bin_handle = do
- dict_p <- get bin_handle
- data_p <- tellBinReader bin_handle
- seekBinReader bin_handle dict_p
- dict <- getDictionary bin_handle
- seekBinReader bin_handle data_p
- return dict
-
- get_symbol_table bh1 = do
- symtab_p <- get bh1
- data_p' <- tellBinReader bh1
- seekBinReader bh1 symtab_p
- symtab <- getSymbolTable bh1 name_cache
- seekBinReader bh1 data_p'
- return symtab
-
-putFastString :: HieDictionary -> WriteBinHandle -> FastString -> IO ()
-putFastString HieDictionary { hie_dict_next = j_r,
- hie_dict_map = out_r} bh f
- = do
- out <- readIORef out_r
- let !unique = getUnique f
- case lookupUFM_Directly out unique of
- Just (j, _) -> put_ bh (fromIntegral j :: Word32)
- Nothing -> do
- j <- readFastMutInt j_r
- put_ bh (fromIntegral j :: Word32)
- writeFastMutInt j_r (j + 1)
- writeIORef out_r $! addToUFM_Directly out unique (j, f)
+ get_dictionary tbl bin_handle = do
+ fsTable <- Binary.forwardGet bin_handle (getTable tbl bin_handle)
+ let
+ fsReader = mkReaderFromTable tbl fsTable
+ bhFs = addReaderToUserData fsReader bin_handle
+ pure bhFs
+
putSymbolTable :: WriteBinHandle -> Int -> UniqFM Name (Int,HieName) -> IO ()
putSymbolTable bh next_off symtab = do
=====================================
compiler/GHC/Iface/Load.hs
=====================================
@@ -26,6 +26,7 @@ module GHC.Iface.Load (
loadInterface,
loadSysInterface, loadUserInterface, loadPluginInterface,
findAndReadIface, readIface, writeIface,
+ flagsToIfCompression,
moduleFreeHolesPrecise,
needWiredInHomeIface, loadWiredInHomeIface,
@@ -965,11 +966,18 @@ read_file logger name_cache unit_state dflags wanted_mod file_path = do
-- | Write interface file
-writeIface :: Logger -> Profile -> FilePath -> ModIface -> IO ()
-writeIface logger profile hi_file_path new_iface
+writeIface :: Logger -> Profile -> CompressionIFace -> FilePath -> ModIface -> IO ()
+writeIface logger profile compression_level hi_file_path new_iface
= do createDirectoryIfMissing True (takeDirectory hi_file_path)
let printer = TraceBinIFace (debugTraceMsg logger 3)
- writeBinIface profile printer hi_file_path new_iface
+ writeBinIface profile printer compression_level hi_file_path new_iface
+
+flagsToIfCompression :: DynFlags -> CompressionIFace
+flagsToIfCompression dflags = case ifCompression dflags of
+ 0 -> NormalCompression
+ 1 -> NormalCompression
+ 2 -> SafeExtraCompression
+ _ -> MaximalCompression
-- | @readIface@ tries just the one file.
--
=====================================
compiler/GHC/Iface/Recomp/Binary.hs
=====================================
@@ -14,6 +14,7 @@ import GHC.Utils.Fingerprint
import GHC.Utils.Binary
import GHC.Types.Name
import GHC.Utils.Panic.Plain
+import GHC.Iface.Type (putIfaceType)
fingerprintBinMem :: WriteBinHandle -> IO Fingerprint
fingerprintBinMem bh = withBinBuffer bh f
@@ -34,8 +35,12 @@ computeFingerprint put_nonbinding_name a = do
put_ bh a
fingerprintBinMem bh
where
- set_user_data bh =
- setWriterUserData bh $ newWriteState put_nonbinding_name putNameLiterally putFS
+ set_user_data bh = setWriterUserData bh $ mkWriterUserData
+ [ mkSomeBinaryWriter $ mkWriter putIfaceType
+ , mkSomeBinaryWriter $ mkWriter put_nonbinding_name
+ , mkSomeBinaryWriter $ simpleBindingNameWriter $ mkWriter putNameLiterally
+ , mkSomeBinaryWriter $ mkWriter putFS
+ ]
-- | Used when we want to fingerprint a structure without depending on the
-- fingerprints of external Names that it refers to.
=====================================
compiler/GHC/Iface/Type.hs
=====================================
@@ -33,6 +33,8 @@ module GHC.Iface.Type (
ifForAllBndrVar, ifForAllBndrName, ifaceBndrName,
ifTyConBinderVar, ifTyConBinderName,
+ -- Binary utilities
+ putIfaceType, getIfaceType, ifaceTypeSharedByte,
-- Equality testing
isIfaceLiftedTypeKind,
@@ -90,11 +92,13 @@ import GHC.Utils.Misc
import GHC.Utils.Panic
import {-# SOURCE #-} GHC.Tc.Utils.TcType ( isMetaTyVar, isTyConableTyVar )
+import Data.Maybe (isJust)
+import Data.Proxy
+import qualified Data.Semigroup as Semi
+import Data.Word (Word8)
+import Control.Arrow (first)
import Control.DeepSeq
import Control.Monad ((<$!>))
-import Control.Arrow (first)
-import qualified Data.Semigroup as Semi
-import Data.Maybe( isJust )
{-
************************************************************************
@@ -109,6 +113,10 @@ newtype IfLclName = IfLclName
{ getIfLclName :: LexicalFastString
} deriving (Eq, Ord, Show)
+instance Uniquable IfLclName where
+ getUnique = getUnique . ifLclNameFS
+
+
ifLclNameFS :: IfLclName -> FastString
ifLclNameFS = getLexicalFastString . getIfLclName
@@ -2191,38 +2199,70 @@ ppr_parend_preds :: [IfacePredType] -> SDoc
ppr_parend_preds preds = parens (fsep (punctuate comma (map ppr preds)))
instance Binary IfaceType where
- put_ _ (IfaceFreeTyVar tv)
- = pprPanic "Can't serialise IfaceFreeTyVar" (ppr tv)
+ put_ bh ty =
+ case findUserDataWriter Proxy bh of
+ tbl -> putEntry tbl bh ty
- put_ bh (IfaceForAllTy aa ab) = do
- putByte bh 0
- put_ bh aa
- put_ bh ab
- put_ bh (IfaceTyVar ad) = do
- putByte bh 1
- put_ bh ad
- put_ bh (IfaceAppTy ae af) = do
- putByte bh 2
- put_ bh ae
- put_ bh af
- put_ bh (IfaceFunTy af aw ag ah) = do
- putByte bh 3
- put_ bh af
- put_ bh aw
- put_ bh ag
- put_ bh ah
- put_ bh (IfaceTyConApp tc tys)
- = do { putByte bh 5; put_ bh tc; put_ bh tys }
- put_ bh (IfaceCastTy a b)
- = do { putByte bh 6; put_ bh a; put_ bh b }
- put_ bh (IfaceCoercionTy a)
- = do { putByte bh 7; put_ bh a }
- put_ bh (IfaceTupleTy s i tys)
- = do { putByte bh 8; put_ bh s; put_ bh i; put_ bh tys }
- put_ bh (IfaceLitTy n)
- = do { putByte bh 9; put_ bh n }
+ get bh = getIfaceTypeShared bh
- get bh = do
+-- | This is the byte tag we expect to read when the next
+-- value is not an 'IfaceType' value, but an offset into a
+-- lookup value.
+--
+-- Must not overlap with any byte tag in 'getIfaceType'.
+ifaceTypeSharedByte :: Word8
+ifaceTypeSharedByte = 99
+
+-- | Like 'getIfaceType' but checks for a specific byte tag
+-- that indicates that we won't be able to read a 'IfaceType' value
+-- but rather an offset into a lookup table. Consequentially,
+-- we look up the value for the 'IfaceType' in the look up table.
+--
+-- See Note [Deduplication during iface binary serialisation]
+-- for details.
+getIfaceTypeShared :: ReadBinHandle -> IO IfaceType
+getIfaceTypeShared bh = do
+ start <- tellBinReader bh
+ tag <- getByte bh
+ if ifaceTypeSharedByte == tag
+ then case findUserDataReader Proxy bh of
+ tbl -> getEntry tbl bh
+ else seekBinReader bh start >> getIfaceType bh
+
+putIfaceType :: WriteBinHandle -> IfaceType -> IO ()
+putIfaceType _ (IfaceFreeTyVar tv)
+ = pprPanic "Can't serialise IfaceFreeTyVar" (ppr tv)
+
+putIfaceType bh (IfaceForAllTy aa ab) = do
+ putByte bh 0
+ put_ bh aa
+ put_ bh ab
+putIfaceType bh (IfaceTyVar ad) = do
+ putByte bh 1
+ put_ bh ad
+putIfaceType bh (IfaceAppTy ae af) = do
+ putByte bh 2
+ put_ bh ae
+ put_ bh af
+putIfaceType bh (IfaceFunTy af aw ag ah) = do
+ putByte bh 3
+ put_ bh af
+ put_ bh aw
+ put_ bh ag
+ put_ bh ah
+putIfaceType bh (IfaceTyConApp tc tys)
+ = do { putByte bh 5; put_ bh tc; put_ bh tys }
+putIfaceType bh (IfaceCastTy a b)
+ = do { putByte bh 6; put_ bh a; put_ bh b }
+putIfaceType bh (IfaceCoercionTy a)
+ = do { putByte bh 7; put_ bh a }
+putIfaceType bh (IfaceTupleTy s i tys)
+ = do { putByte bh 8; put_ bh s; put_ bh i; put_ bh tys }
+putIfaceType bh (IfaceLitTy n)
+ = do { putByte bh 9; put_ bh n }
+
+getIfaceType :: HasCallStack => ReadBinHandle -> IO IfaceType
+getIfaceType bh = do
h <- getByte bh
case h of
0 -> do aa <- get bh
=====================================
compiler/GHC/Iface/Type/Map.hs
=====================================
@@ -0,0 +1,180 @@
+{-# LANGUAGE TypeFamilies #-}
+module GHC.Iface.Type.Map where
+
+import GHC.Prelude
+import GHC.Data.TrieMap
+import GHC.Iface.Type
+import qualified Data.Map as Map
+import Data.Functor.Compose
+import GHC.Types.Basic
+import Control.Monad ((>=>))
+import GHC.Types.Unique.DFM
+import Data.Functor.Product
+import GHC.Types.Var (VarBndr(..))
+
+
+newtype IfaceTypeMap a = IfaceTypeMap (IfaceTypeMapG a)
+
+instance Functor IfaceTypeMap where
+ fmap f (IfaceTypeMap m) = IfaceTypeMap (fmap f m)
+
+instance TrieMap IfaceTypeMap where
+ type Key IfaceTypeMap = IfaceType
+
+ emptyTM = IfaceTypeMap emptyTM
+
+ lookupTM k (IfaceTypeMap m) = lookupTM k m
+
+ alterTM k f (IfaceTypeMap m) = IfaceTypeMap (alterTM k f m)
+
+ filterTM f (IfaceTypeMap m) = IfaceTypeMap (filterTM f m)
+
+ foldTM f (IfaceTypeMap m) = foldTM f m
+
+type IfaceTypeMapG = GenMap IfaceTypeMapX
+
+data IfaceTypeMapX a
+ = IFM { ifm_lit :: IfaceLiteralMap a
+ , ifm_var :: UniqDFM IfLclName a
+ , ifm_app :: IfaceTypeMapG (IfaceAppArgsMap a)
+ , ifm_fun_ty :: FunTyFlagMap (IfaceTypeMapG (IfaceTypeMapG (IfaceTypeMapG a)))
+ , ifm_ty_con_app :: IfaceTyConMap (IfaceAppArgsMap a)
+ , ifm_forall_ty :: IfaceForAllBndrMap (IfaceTypeMapG a)
+ , ifm_cast_ty :: IfaceTypeMapG (IfaceCoercionMap a)
+ , ifm_coercion_ty :: IfaceCoercionMap a
+ , ifm_tuple_ty :: TupleSortMap (PromotionFlagMap (IfaceAppArgsMap a)) }
+
+type IfaceLiteralMap = Map.Map IfaceTyLit
+type FunTyFlagMap = Map.Map FunTyFlag
+type IfaceTyConMap = Map.Map IfaceTyCon
+type ForAllTyFlagMap = Map.Map ForAllTyFlag
+type IfaceCoercionMap = Map.Map IfaceCoercion
+type TupleSortMap = Map.Map TupleSort
+type PromotionFlagMap = Map.Map PromotionFlag
+type IfaceForAllBndrMap = Compose IfaceBndrMap ForAllTyFlagMap
+
+type IfaceIdBndrMap = Compose IfaceTypeMapG (Compose (UniqDFM IfLclName) IfaceTypeMapG)
+type IfaceTvBndrMap = Compose (UniqDFM IfLclName) IfaceTypeMapG
+
+type IfaceBndrMap = Product IfaceIdBndrMap IfaceTvBndrMap
+
+
+
+
+type IfaceAppArgsMap a = ListMap (Compose IfaceTypeMapG ForAllTyFlagMap) a
+
+emptyE :: IfaceTypeMapX a
+emptyE = IFM { ifm_lit = emptyTM
+ , ifm_var = emptyTM
+ , ifm_app = emptyTM
+ , ifm_fun_ty = emptyTM
+ , ifm_ty_con_app = emptyTM
+ , ifm_forall_ty = emptyTM
+ , ifm_cast_ty = emptyTM
+ , ifm_coercion_ty = emptyTM
+ , ifm_tuple_ty = emptyTM }
+
+instance Functor IfaceTypeMapX where
+ fmap f IFM { ifm_lit = ilit
+ , ifm_var = ivar
+ , ifm_app = iapp
+ , ifm_fun_ty = ift
+ , ifm_ty_con_app = itc
+ , ifm_forall_ty = ifal
+ , ifm_cast_ty = icast
+ , ifm_coercion_ty = ico
+ , ifm_tuple_ty = itup }
+
+ = IFM { ifm_lit = fmap f ilit
+ , ifm_var = fmap f ivar
+ , ifm_app = fmap (fmap f) iapp
+ , ifm_fun_ty = fmap (fmap (fmap (fmap f))) ift
+ , ifm_ty_con_app = fmap (fmap f) itc
+ , ifm_forall_ty = fmap (fmap f) ifal
+ , ifm_cast_ty = fmap (fmap f) icast
+ , ifm_coercion_ty = fmap f ico
+ , ifm_tuple_ty = fmap (fmap (fmap f)) itup }
+
+instance TrieMap IfaceTypeMapX where
+ type Key IfaceTypeMapX = IfaceType
+
+ emptyTM = emptyE
+ lookupTM = lkE
+ alterTM = xtE
+ foldTM = fdE
+ filterTM = ftE
+ {-# INLINE lookupTM #-}
+ {-# INLINE alterTM #-}
+
+{-# INLINE ftE #-}
+ftE :: (a -> Bool) -> IfaceTypeMapX a -> IfaceTypeMapX a
+ftE f IFM { ifm_lit = ilit
+ , ifm_var = ivar
+ , ifm_app = iapp
+ , ifm_fun_ty = ift
+ , ifm_ty_con_app = itc
+ , ifm_forall_ty = ifal
+ , ifm_cast_ty = icast
+ , ifm_coercion_ty = ico
+ , ifm_tuple_ty = itup }
+
+ = IFM { ifm_lit = filterTM f ilit
+ , ifm_var = filterTM f ivar
+ , ifm_app = fmap (filterTM f) iapp
+ , ifm_fun_ty = fmap (fmap (fmap (filterTM f))) ift
+ , ifm_ty_con_app = fmap (filterTM f) itc
+ , ifm_forall_ty = fmap (filterTM f) ifal
+ , ifm_cast_ty = fmap (filterTM f) icast
+ , ifm_coercion_ty = filterTM f ico
+ , ifm_tuple_ty = fmap (fmap (filterTM f)) itup }
+
+{-# INLINE fdE #-}
+fdE :: (a -> b -> b) -> IfaceTypeMapX a -> b -> b
+fdE f IFM { ifm_lit = ilit
+ , ifm_var = ivar
+ , ifm_app = iapp
+ , ifm_fun_ty = ift
+ , ifm_ty_con_app = itc
+ , ifm_forall_ty = ifal
+ , ifm_cast_ty = icast
+ , ifm_coercion_ty = ico
+ , ifm_tuple_ty = itup }
+ = foldTM f ilit . foldTM f ivar . foldTM (foldTM f) iapp
+ . foldTM (foldTM (foldTM (foldTM f))) ift
+ . foldTM (foldTM f) itc
+ . foldTM (foldTM f) ifal
+ . foldTM (foldTM f) icast
+ . foldTM f ico
+ . foldTM (foldTM (foldTM f)) itup
+
+bndrToKey :: IfaceBndr -> Either (IfaceType, (IfLclName, IfaceType)) IfaceTvBndr
+bndrToKey (IfaceIdBndr (a,b,c)) = Left (a, (b,c))
+bndrToKey (IfaceTvBndr k) = Right k
+
+{-# INLINE lkE #-}
+lkE :: IfaceType -> IfaceTypeMapX a -> Maybe a
+lkE it ifm = go it ifm
+ where
+ go (IfaceFreeTyVar {}) = error "ftv"
+ go (IfaceTyVar var) = ifm_var >.> lookupTM var
+ go (IfaceLitTy l) = ifm_lit >.> lookupTM l
+ go (IfaceAppTy ift args) = ifm_app >.> lkG ift >=> lookupTM (appArgsIfaceTypesForAllTyFlags args)
+ go (IfaceFunTy ft t1 t2 t3) = ifm_fun_ty >.> lookupTM ft >=> lkG t1 >=> lkG t2 >=> lkG t3
+ go (IfaceForAllTy (Bndr a b) t) = ifm_forall_ty >.> lookupTM (bndrToKey a,b) >=> lkG t
+ go (IfaceTyConApp tc args) = ifm_ty_con_app >.> lookupTM tc >=> lookupTM (appArgsIfaceTypesForAllTyFlags args)
+ go (IfaceCastTy ty co) = ifm_cast_ty >.> lkG ty >=> lookupTM co
+ go (IfaceCoercionTy co) = ifm_coercion_ty >.> lookupTM co
+ go (IfaceTupleTy sort prom args) = ifm_tuple_ty >.> lookupTM sort >=> lookupTM prom >=> lookupTM (appArgsIfaceTypesForAllTyFlags args)
+
+{-# INLINE xtE #-}
+xtE :: IfaceType -> XT a -> IfaceTypeMapX a -> IfaceTypeMapX a
+xtE (IfaceFreeTyVar {}) _ _ = error "ftv"
+xtE (IfaceTyVar var) f m = m { ifm_var = ifm_var m |> alterTM var f }
+xtE (IfaceLitTy l) f m = m { ifm_lit = ifm_lit m |> alterTM l f }
+xtE (IfaceAppTy ift args) f m = m { ifm_app = ifm_app m |> xtG ift |>> alterTM (appArgsIfaceTypesForAllTyFlags args) f }
+xtE (IfaceFunTy ft t1 t2 t3) f m = m { ifm_fun_ty = ifm_fun_ty m |> alterTM ft |>> xtG t1 |>> xtG t2 |>> xtG t3 f }
+xtE (IfaceForAllTy (Bndr a b) t) f m = m { ifm_forall_ty = ifm_forall_ty m |> alterTM (bndrToKey a,b) |>> xtG t f }
+xtE (IfaceTyConApp tc args) f m = m { ifm_ty_con_app = ifm_ty_con_app m |> alterTM tc |>> alterTM (appArgsIfaceTypesForAllTyFlags args) f }
+xtE (IfaceCastTy ty co) f m = m { ifm_cast_ty = ifm_cast_ty m |> xtG ty |>> alterTM co f }
+xtE (IfaceCoercionTy co) f m = m { ifm_coercion_ty = ifm_coercion_ty m |> alterTM co f }
+xtE (IfaceTupleTy sort prom args) f m = m { ifm_tuple_ty = ifm_tuple_ty m |> alterTM sort |>> alterTM prom |>> alterTM (appArgsIfaceTypesForAllTyFlags args) f }
=====================================
compiler/GHC/Stg/CSE.hs
=====================================
@@ -109,6 +109,8 @@ import GHC.Core.Map.Expr
import GHC.Data.TrieMap
import GHC.Types.Name.Env
import Control.Monad( (>=>) )
+import qualified Data.Map as Map
+import GHC.Types.Literal ( Literal )
--------------
-- The Trie --
@@ -122,6 +124,8 @@ data StgArgMap a = SAM
, sam_lit :: LiteralMap a
}
+type LiteralMap = Map.Map Literal
+
-- TODO(22292): derive
instance Functor StgArgMap where
fmap f SAM { sam_var = varm, sam_lit = litm } = SAM
=====================================
compiler/GHC/StgToJS/Object.hs
=====================================
@@ -314,7 +314,7 @@ putObject bh mod_name deps os = do
put_ bh (moduleNameString mod_name)
(fs_tbl, fs_writer) <- initFastStringWriterTable
- let bh_fs = addWriterToUserData (mkSomeBinaryWriter fs_writer) bh
+ let bh_fs = addWriterToUserData fs_writer bh
forwardPut_ bh (const (putTable fs_tbl bh_fs)) $ do
put_ bh_fs deps
=====================================
compiler/GHC/Utils/Binary.hs
=====================================
@@ -65,6 +65,8 @@ module GHC.Utils.Binary
-- * Lazy Binary I/O
lazyGet,
lazyPut,
+ lazyGet',
+ lazyPut',
lazyGetMaybe,
lazyPutMaybe,
@@ -87,10 +89,17 @@ module GHC.Utils.Binary
initFastStringReaderTable, initFastStringWriterTable,
putDictionary, getDictionary, putFS,
FSTable(..), getDictFastString, putDictFastString,
+ -- * Generic deduplication table
+ GenericSymbolTable(..),
+ initGenericSymbolTable,
+ getGenericSymtab, putGenericSymTab,
+ getGenericSymbolTable, putGenericSymbolTable,
-- * Newtype wrappers
BinSpan(..), BinSrcSpan(..), BinLocated(..),
-- * Newtypes for types that have canonically more than one valid encoding
BindingName(..),
+ simpleBindingNameWriter,
+ simpleBindingNameReader,
) where
import GHC.Prelude
@@ -103,11 +112,11 @@ import GHC.Utils.Panic.Plain
import GHC.Types.Unique.FM
import GHC.Data.FastMutInt
import GHC.Utils.Fingerprint
-import GHC.Utils.Misc (HasCallStack)
import GHC.Types.SrcLoc
import GHC.Types.Unique
import qualified GHC.Data.Strict as Strict
import GHC.Utils.Outputable( JoinPointHood(..) )
+import GHC.Utils.Misc ( HasCallStack )
import Control.DeepSeq
import Control.Monad ( when, (<$!>), unless, forM_, void )
@@ -141,6 +150,9 @@ import GHC.ForeignPtr ( unsafeWithForeignPtr )
import Unsafe.Coerce (unsafeCoerce)
import Data.Coerce
+import GHC.Data.TrieMap
+
+
type BinArray = ForeignPtr Word8
@@ -230,22 +242,26 @@ setReaderUserData bh us = bh { rbm_userData = us }
-- | Add 'SomeBinaryReader' as a known binary decoder.
-- If a 'BinaryReader' for the associated type already exists in 'ReaderUserData',
-- it is overwritten.
-addReaderToUserData :: SomeBinaryReader -> ReadBinHandle -> ReadBinHandle
-addReaderToUserData cache@(SomeBinaryReader typRep _) bh = bh
+addReaderToUserData :: Typeable a => BinaryReader a -> ReadBinHandle -> ReadBinHandle
+addReaderToUserData reader bh = bh
{ rbm_userData = (rbm_userData bh)
{ ud_reader_data = Map.insert typRep cache (ud_reader_data (rbm_userData bh))
}
}
+ where
+ cache@(SomeBinaryReader typRep _) = mkSomeBinaryReader reader
-- | Add 'SomeBinaryWriter' as a known binary encoder.
-- If a 'BinaryWriter' for the associated type already exists in 'WriterUserData',
-- it is overwritten.
-addWriterToUserData :: SomeBinaryWriter -> WriteBinHandle -> WriteBinHandle
-addWriterToUserData cache@(SomeBinaryWriter typRep _) bh = bh
+addWriterToUserData :: Typeable a => BinaryWriter a -> WriteBinHandle -> WriteBinHandle
+addWriterToUserData writer bh = bh
{ wbm_userData = (wbm_userData bh)
{ ud_writer_data = Map.insert typRep cache (ud_writer_data (wbm_userData bh))
}
}
+ where
+ cache@(SomeBinaryWriter typRep _) = mkSomeBinaryWriter writer
-- | Get access to the underlying buffer.
withBinBuffer :: WriteBinHandle -> (ByteString -> IO a) -> IO a
@@ -1102,24 +1118,35 @@ forwardGet bh get_A = do
-- Lazy reading/writing
lazyPut :: Binary a => WriteBinHandle -> a -> IO ()
-lazyPut bh a = do
+lazyPut = lazyPut' put_
+
+lazyGet :: Binary a => ReadBinHandle -> IO a
+lazyGet = lazyGet' Nothing (\_ -> get)
+
+lazyPut' :: HasCallStack => (WriteBinHandle -> a -> IO ()) -> WriteBinHandle -> a -> IO ()
+lazyPut' f bh a = do
-- output the obj with a ptr to skip over it:
pre_a <- tellBinWriter bh
put_ bh pre_a -- save a slot for the ptr
- put_ bh a -- dump the object
+ f bh a -- dump the object
q <- tellBinWriter bh -- q = ptr to after object
putAt bh pre_a q -- fill in slot before a with ptr to q
seekBinWriter bh q -- finally carry on writing at q
-lazyGet :: Binary a => ReadBinHandle -> IO a
-lazyGet bh = do
+lazyGet' :: HasCallStack => Maybe (IORef ReadBinHandle) -> (Bin () -> ReadBinHandle -> IO a) -> ReadBinHandle -> IO a
+lazyGet' mbh f bh = do
p <- get bh -- a BinPtr
p_a <- tellBinReader bh
+ -- Do this up front to avoid retaining a reference to the old BH inside the unsafeInterleaveIO.
+ let !get_inner_bh = maybe (pure bh) readIORef mbh
a <- unsafeInterleaveIO $ do
- -- NB: Use a fresh off_r variable in the child thread, for thread
+ -- NB: Use a fresh rbm_off_r variable in the child thread, for thread
-- safety.
+ inner_bh <- get_inner_bh
off_r <- newFastMutInt 0
- getAt bh { rbm_off_r = off_r } p_a
+ let bh' = inner_bh { rbm_off_r = off_r }
+ seekBinReader bh' p_a
+ f p bh'
seekBinReader bh p -- skip over the object for now
return a
@@ -1173,6 +1200,12 @@ lazyGetMaybe bh = do
newtype BindingName = BindingName { getBindingName :: Name }
deriving ( Eq )
+simpleBindingNameWriter :: BinaryWriter Name -> BinaryWriter BindingName
+simpleBindingNameWriter = coerce
+
+simpleBindingNameReader :: BinaryReader Name -> BinaryReader BindingName
+simpleBindingNameReader = coerce
+
-- | Existential for 'BinaryWriter' with a type witness.
data SomeBinaryWriter = forall a . SomeBinaryWriter TypeRep (BinaryWriter a)
@@ -1313,6 +1346,113 @@ data WriterTable = WriterTable
{ putTable :: WriteBinHandle -> IO Int
}
+-- ----------------------------------------------------------------------------
+-- Common data structures for constructing and maintaining lookup tables for
+-- binary serialisation and deserialisation.
+-- ----------------------------------------------------------------------------
+
+-- | The 'GenericSymbolTable' stores a mapping from already seen elements to an index.
+-- If an element wasn't seen before, it is added to the mapping together with a fresh
+-- index.
+--
+-- 'GenericSymbolTable' is a variant of a 'BinSymbolTable' that is polymorphic in the table implementation.
+-- As such it can be used with any container that implements the 'TrieMap' type class.
+--
+-- While 'GenericSymbolTable' is similar to the 'BinSymbolTable', it supports storing tree-like
+-- structures such as 'Type' and 'IfaceType' more efficiently.
+--
+data GenericSymbolTable m = GenericSymbolTable
+ { gen_symtab_next :: !FastMutInt
+ -- ^ The next index to use.
+ , gen_symtab_map :: !(IORef (m Int))
+ -- ^ Given a symbol, look it up and return its index.
+ , gen_symtab_to_write :: !(IORef [Key m])
+ -- ^ Reversed list of values to write into the buffer.
+ -- This is an optimisation, as it allows us to write out quickly all
+ -- newly discovered values that are discovered when serialising 'Key m'
+ -- to disk.
+ }
+
+-- | Initialise a 'GenericSymbolTable', initialising the index to '0'.
+initGenericSymbolTable :: TrieMap m => IO (GenericSymbolTable m)
+initGenericSymbolTable = do
+ symtab_next <- newFastMutInt 0
+ symtab_map <- newIORef emptyTM
+ symtab_todo <- newIORef []
+ pure $ GenericSymbolTable
+ { gen_symtab_next = symtab_next
+ , gen_symtab_map = symtab_map
+ , gen_symtab_to_write = symtab_todo
+ }
+
+-- | Serialise the 'GenericSymbolTable' to disk.
+--
+-- Since 'GenericSymbolTable' stores tree-like structures, such as 'IfaceType',
+-- serialising an element can add new elements to the mapping.
+-- Thus, 'putGenericSymbolTable' first serialises all values, and then checks whether any
+-- new elements have been discovered. If so, repeat the loop.
+putGenericSymbolTable :: forall m. (TrieMap m) => GenericSymbolTable m -> (WriteBinHandle -> Key m -> IO ()) -> WriteBinHandle -> IO Int
+{-# INLINE putGenericSymbolTable #-}
+putGenericSymbolTable gen_sym_tab serialiser bh = do
+ putGenericSymbolTable bh
+ where
+ symtab_next = gen_symtab_next gen_sym_tab
+ symtab_to_write = gen_symtab_to_write gen_sym_tab
+ putGenericSymbolTable :: WriteBinHandle -> IO Int
+ putGenericSymbolTable bh = do
+ let loop = do
+ vs <- atomicModifyIORef' symtab_to_write (\a -> ([], a))
+ case vs of
+ [] -> readFastMutInt symtab_next
+ todo -> do
+ mapM_ (\n -> serialiser bh n) (reverse todo)
+ loop
+ snd <$>
+ (forwardPut bh (const $ readFastMutInt symtab_next >>= put_ bh) $
+ loop)
+
+-- | Read the elements of a 'GenericSymbolTable' from disk into a 'SymbolTable'.
+getGenericSymbolTable :: forall a . (ReadBinHandle -> IO a) -> ReadBinHandle -> IO (SymbolTable a)
+getGenericSymbolTable deserialiser bh = do
+ sz <- forwardGet bh (get bh) :: IO Int
+ mut_arr <- newArray_ (0, sz-1) :: IO (IOArray Int a)
+ -- Using lazyPut/lazyGet is quite space inefficient as each usage will allocate a large closure
+ -- (6 arguments-ish).
+ forM_ [0..(sz-1)] $ \i -> do
+ f <- lazyGet' Nothing (\_ -> deserialiser) bh
+ -- f <- deserialiser bh
+ writeArray mut_arr i f
+ unsafeFreeze mut_arr
+
+-- | Write an element 'Key m' to the given 'WriteBinHandle'.
+--
+-- If the element was seen before, we simply write the index of that element to the
+-- 'WriteBinHandle'. If we haven't seen it before, we add the element to
+-- the 'GenericSymbolTable' under a fresh index, increment the next index,
+-- and write the fresh index to the 'WriteBinHandle'.
+putGenericSymTab :: (TrieMap m) => GenericSymbolTable m -> WriteBinHandle -> Key m -> IO ()
+{-# INLINE putGenericSymTab #-}
+putGenericSymTab GenericSymbolTable{
+ gen_symtab_map = symtab_map_ref,
+ gen_symtab_next = symtab_next,
+ gen_symtab_to_write = symtab_todo }
+ bh val = do
+ symtab_map <- readIORef symtab_map_ref
+ case lookupTM val symtab_map of
+ Just off -> put_ bh (fromIntegral off :: Word32)
+ Nothing -> do
+ off <- readFastMutInt symtab_next
+ writeFastMutInt symtab_next (off+1)
+ writeIORef symtab_map_ref
+ $! insertTM val off symtab_map
+ atomicModifyIORef symtab_todo (\todo -> (val : todo, ()))
+ put_ bh (fromIntegral off :: Word32)
+
+-- | Read a value from a 'SymbolTable'.
+getGenericSymtab :: Binary a => SymbolTable a -> ReadBinHandle -> IO a
+getGenericSymtab symtab bh = do
+ i :: Word32 <- get bh
+ return $! symtab ! fromIntegral i
+
---------------------------------------------------------
-- The Dictionary
---------------------------------------------------------
=====================================
compiler/ghc.cabal.in
=====================================
@@ -579,6 +579,7 @@ Library
GHC.Iface.Tidy.StaticPtrTable
GHC.IfaceToCore
GHC.Iface.Type
+ GHC.Iface.Type.Map
GHC.JS.Ident
GHC.JS.Make
GHC.JS.Optimizer
=====================================
docs/users_guide/using-optimisation.rst
=====================================
@@ -1777,3 +1777,12 @@ as such you shouldn't need to set any of them explicitly. A flag
This flag sets the size (in bytes) threshold above which the second approach
is used. You can disable the second approach entirely by setting the
threshold to 0.
+
+.. ghc-flag:: -fwrite-if-compression=⟨n⟩
+ :shortdesc: *default: 2.* Tweak the level of interface file compression.
+ :type: dynamic
+ :category: optimization
+
+ :default: 2
+
+ TODO
=====================================
testsuite/tests/iface/IfaceSharingIfaceType.hs
=====================================
@@ -0,0 +1,44 @@
+{-# LANGUAGE FlexibleContexts #-}
+{-# LANGUAGE OverloadedStrings #-}
+
+module IfaceSharingIfaceType (types) where
+
+import GHC.Data.FastString
+import GHC.Builtin.Uniques
+import GHC.Builtin.Names
+import GHC.Builtin.Types
+import GHC.Types.Name
+import GHC.Types.SrcLoc
+import GHC.Iface.Type
+import GHC.CoreToIface
+import GHC.Core.TyCo.Rep
+import GHC
+
+[f1,f2,f3,f4,f5] = map mkVarOcc ["a", "b","c","d","e"]
+
+[u1,u2,u3,u4,u5] = map mkPreludeMiscIdUnique [10000..10004]
+
+names = [ mkExternalName u1 pRELUDE f1 noSrcSpan
+ , mkExternalName u2 pRELUDE f2 noSrcSpan
+ , mkExternalName u3 pRELUDE f3 noSrcSpan
+ , mkExternalName u4 pRELUDE f4 noSrcSpan
+ , mkExternalName u5 pRELUDE f5 noSrcSpan ]
+
+-- Int
+intIfaceTy = toIfaceType intTy
+
+wordIfaceTy = toIfaceType wordTy
+
+listIntTy = toIfaceType (mkListTy intTy)
+
+funTy = (intTy `mkInvisFunTy` wordTy `mkInvisFunTy` mkListTy intTy)
+
+funIfaceTy = toIfaceType funTy
+
+reallyBigFunTy = toIfaceType (funTy `mkInvisFunTy` funTy `mkInvisFunTy` funTy `mkInvisFunTy` funTy)
+
+forallIfaceTy = toIfaceType (dataConType justDataCon)
+
+
+types = [intIfaceTy, wordIfaceTy, listIntTy, funIfaceTy, reallyBigFunTy, forallIfaceTy]
+
=====================================
testsuite/tests/iface/IfaceSharingName.hs
=====================================
@@ -0,0 +1,21 @@
+{-# LANGUAGE FlexibleContexts #-}
+{-# LANGUAGE OverloadedStrings #-}
+
+module IfaceSharingName where
+
+import Lib
+import GHC.Data.FastString
+import GHC.Builtin.Uniques
+import GHC.Builtin.Names
+import GHC.Types.Name
+import GHC.Types.SrcLoc
+
+[f1,f2,f3,f4,f5] = map mkVarOcc ["a", "b","c","d","e"]
+
+[u1,u2,u3,u4,u5] = map mkPreludeMiscIdUnique [10000..10004]
+
+names = [ mkExternalName u1 pRELUDE f1 noSrcSpan
+ , mkExternalName u2 pRELUDE f2 noSrcSpan
+ , mkExternalName u3 pRELUDE f3 noSrcSpan
+ , mkExternalName u4 pRELUDE f4 noSrcSpan
+ , mkExternalName u5 pRELUDE f5 noSrcSpan ]
=====================================
testsuite/tests/iface/Lib.hs
=====================================
@@ -0,0 +1,15 @@
+module Lib where
+
+import GHC.Utils.Binary
+import GHC.Iface.Binary
+import qualified Data.ByteString as B
+import System.Environment
+import Data.Maybe
+
+testSize :: Binary a => CompressionIFace -> a -> IO Int
+testSize compLvl payload = do
+ args <- getArgs
+ bh <- openBinMem 1024
+ putWithUserData QuietBinIFace compLvl bh payload
+ withBinBuffer bh (\b -> return (B.length b))
+
=====================================
testsuite/tests/iface/Makefile
=====================================
@@ -0,0 +1,4 @@
+TOP=../..
+include $(TOP)/mk/boilerplate.mk
+include $(TOP)/mk/test.mk
+
=====================================
testsuite/tests/iface/all.T
=====================================
@@ -0,0 +1,24 @@
+test( 'if_faststring'
+ , [ stat_from_file('normal', 5, 'NORMALSIZE')
+ , stat_from_file('medium', 5, 'MEDIUMSIZE')
+ , stat_from_file('full', 5, 'FULLSIZE')
+ , extra_files(["Lib.hs"])]
+ , compile_and_run
+ , ['-package ghc'])
+
+test( 'if_name'
+ , [ stat_from_file('normal', 5, 'NORMALSIZE')
+ , stat_from_file('medium', 5, 'MEDIUMSIZE')
+ , stat_from_file('full', 5, 'FULLSIZE')
+ , extra_files(["Lib.hs", "IfaceSharingName.hs"])]
+ , compile_and_run
+ , ['-package ghc'])
+
+test( 'if_ifacetype'
+ , [ stat_from_file('normal', 5, 'NORMALSIZE')
+ , stat_from_file('medium', 5, 'MEDIUMSIZE')
+ , stat_from_file('full', 5, 'FULLSIZE')
+ , extra_files(["Lib.hs", "IfaceSharingIfaceType.hs"])]
+ , compile_and_run
+ , ['-package ghc'])
+
=====================================
testsuite/tests/iface/if_faststring.hs
=====================================
@@ -0,0 +1,15 @@
+{-# LANGUAGE FlexibleContexts #-}
+{-# LANGUAGE OverloadedStrings #-}
+
+import Lib
+import GHC.Data.FastString
+import GHC.Iface.Binary
+
+main :: IO ()
+main = do
+ sz <- testSize MaximalCompression (concat (replicate 1000 ["abc", "cde", "efg" :: FastString]))
+ writeFile "FULLSIZE" (show sz)
+ sz <- testSize SafeExtraCompression (concat (replicate 1000 ["abc", "cde", "efg" :: FastString]))
+ writeFile "MEDIUMSIZE" (show sz)
+ sz <- testSize NormalCompression (concat (replicate 1000 ["abc", "cde", "efg" :: FastString]))
+ writeFile "NORMALSIZE" (show sz)
=====================================
testsuite/tests/iface/if_ifacetype.hs
=====================================
@@ -0,0 +1,13 @@
+import Lib
+import IfaceSharingIfaceType
+import GHC.Iface.Binary
+
+main :: IO ()
+main = do
+ sz <- testSize MaximalCompression (concat (replicate 500 types))
+ writeFile "FULLSIZE" (show sz)
+ sz <- testSize SafeExtraCompression (concat (replicate 500 types))
+ writeFile "MEDIUMSIZE" (show sz)
+ sz <- testSize NormalCompression (concat (replicate 500 types))
+ writeFile "NORMALSIZE" (show sz)
+
=====================================
testsuite/tests/iface/if_name.hs
=====================================
@@ -0,0 +1,12 @@
+import Lib
+import IfaceSharingName
+import GHC.Iface.Binary
+
+main :: IO ()
+main = do
+ sz <- testSize MaximalCompression (concat (replicate 1000 names))
+ writeFile "FULLSIZE" (show sz)
+ sz <- testSize SafeExtraCompression (concat (replicate 1000 names))
+ writeFile "MEDIUMSIZE" (show sz)
+ sz <- testSize NormalCompression (concat (replicate 1000 names))
+ writeFile "NORMALSIZE" (show sz)
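For readers skimming the diff: the indexing scheme behind `putGenericSymTab` can be sketched in isolation as below. This is a simplified illustration, not the code from the commit. It assumes `Data.Map` in place of the `TrieMap`, `IORef Int` in place of `FastMutInt`, and the names `SymTab`, `newSymTab` and `internKey` are hypothetical. It returns the index instead of writing it to a `WriteBinHandle`.

```haskell
import Data.IORef
import qualified Data.Map.Strict as Map

-- Hypothetical, simplified table: Data.Map stands in for the TrieMap.
data SymTab k = SymTab
  { stNext    :: IORef Int             -- next free index
  , stMap     :: IORef (Map.Map k Int) -- element -> index
  , stToWrite :: IORef [k]             -- reversed list of newly seen elements
  }

newSymTab :: IO (SymTab k)
newSymTab = SymTab <$> newIORef 0 <*> newIORef Map.empty <*> newIORef []

-- Return the index of an element, allocating a fresh index if it is unseen.
internKey :: Ord k => SymTab k -> k -> IO Int
internKey tab k = do
  m <- readIORef (stMap tab)
  case Map.lookup k m of
    Just i  -> pure i
    Nothing -> do
      i <- readIORef (stNext tab)
      writeIORef (stNext tab) (i + 1)
      writeIORef (stMap tab) $! Map.insert k i m
      modifyIORef' (stToWrite tab) (k :)
      pure i

main :: IO ()
main = do
  tab <- newSymTab
  ixs <- mapM (internKey tab) ["Int", "Word", "Int", "[Int]", "Word"]
  print ixs                                   -- [0,1,0,2,1]
  pending <- reverse <$> readIORef (stToWrite tab)
  print pending                               -- ["Int","Word","[Int]"]
```

Repeated elements map to the same index, and only the first occurrence is queued for writing, which is where the size reduction in the tests above would come from.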
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/57f77a08d347166fac16085e55bcf447ffd6ace0...0823ceaec1f590e127d84ee51a57f300f18d50dc