[Git][ghc/ghc][wip/marge_bot_batch_merge_job] 2 commits: Escape multiple arguments in the settings file
Marge Bot (@marge-bot)
gitlab at gitlab.haskell.org
Thu Mar 14 23:24:09 UTC 2024
Marge Bot pushed to branch wip/marge_bot_batch_merge_job at Glasgow Haskell Compiler / GHC
Commits:
5bea24cf by Fendor at 2024-03-14T19:24:01-04:00
Escape multiple arguments in the settings file
Uses responseFile syntax.
The issue arises when GHC is installed on windows into a location that
has a space, for example the user name is 'Fake User'.
The $topdir will also contain a space, consequentially.
When we resolve the top dir in the string `-I$topdir/mingw/include`,
then `words` will turn this single argument into `-I/C/Users/Fake` and
`User/.../mingw/include` which trips up our flags.
We avoid this by escaping the $topdir before replacing in GHC.
Add regression test case to count the number of options after variable
expansion took place. Additionally, check escaping works.
- - - - -
9848d580 by Fendor at 2024-03-14T19:24:03-04:00
Fix sharing of 'IfaceTyConInfo' during core to iface type translation
During heap analysis, we noticed that during generation of
'mi_extra_decls' we have lots of duplicates for the instances:
* `IfaceTyConInfo NotPromoted IfaceNormalTyCon`
* `IfaceTyConInfo IsPromoted IfaceNormalTyCon`
which should be shared instead of duplicated. This duplication increased
the number of live bytes by around 200MB while loading the agda codebase
into GHCi.
These instances are created during `CoreToIface` translation, in
particular `toIfaceTyCon`.
The generated core looks like:
toIfaceTyCon
= \ tc_sjJw ->
case $wtoIfaceTyCon tc_sjJw of
{ (# ww_sjJz, ww1_sjNL, ww2_sjNM #) ->
IfaceTyCon ww_sjJz (IfaceTyConInfo ww1_sjNL ww2_sjNM)
}
whichs removes causes the sharing to work propery.
Adding explicit sharing, with NOINLINE annotations, changes the core to:
toIfaceTyCon
= \ tc_sjJq ->
case $wtoIfaceTyCon tc_sjJq of { (# ww_sjNB, ww1_sjNC #) ->
IfaceTyCon ww_sjNB ww1_sjNC
}
which looks much more like sharing is happening.
We confirmed via ghc-debug that all duplications were eliminated and the
number of live bytes are noticeably reduced.
- - - - -
9 changed files:
- compiler/GHC/Iface/Type.hs
- compiler/GHC/Settings/IO.hs
- hadrian/src/Rules/Generate.hs
- + test.hs
- + testsuite/tests/ghc-api/settings-escape/T11938.hs
- + testsuite/tests/ghc-api/settings-escape/T11938.stderr
- + testsuite/tests/ghc-api/settings-escape/all.T
- + testsuite/tests/ghc-api/settings-escape/ghc-install-folder/lib/settings
- + testsuite/tests/ghc-api/settings-escape/ghc-install-folder/mingw/.gitkeep
Changes:
=====================================
compiler/GHC/Iface/Type.hs
=====================================
@@ -361,12 +361,51 @@ data IfaceTyConInfo -- Used only to guide pretty-printing
, ifaceTyConSort :: IfaceTyConSort }
deriving (Eq)
--- This smart constructor allows sharing of the two most common
--- cases. See #19194
+-- | This smart constructor allows sharing of the two most common
+-- cases. See Note [Sharing IfaceTyConInfo]
mkIfaceTyConInfo :: PromotionFlag -> IfaceTyConSort -> IfaceTyConInfo
-mkIfaceTyConInfo IsPromoted IfaceNormalTyCon = IfaceTyConInfo IsPromoted IfaceNormalTyCon
-mkIfaceTyConInfo NotPromoted IfaceNormalTyCon = IfaceTyConInfo NotPromoted IfaceNormalTyCon
-mkIfaceTyConInfo prom sort = IfaceTyConInfo prom sort
+mkIfaceTyConInfo IsPromoted IfaceNormalTyCon = promotedNormalTyConInfo
+mkIfaceTyConInfo NotPromoted IfaceNormalTyCon = notPromotedNormalTyConInfo
+mkIfaceTyConInfo prom sort = IfaceTyConInfo prom sort
+
+{-# NOINLINE promotedNormalTyConInfo #-}
+-- | See Note [Sharing IfaceTyConInfo]
+promotedNormalTyConInfo :: IfaceTyConInfo
+promotedNormalTyConInfo = IfaceTyConInfo IsPromoted IfaceNormalTyCon
+
+{-# NOINLINE notPromotedNormalTyConInfo #-}
+-- | See Note [Sharing IfaceTyConInfo]
+notPromotedNormalTyConInfo :: IfaceTyConInfo
+notPromotedNormalTyConInfo = IfaceTyConInfo NotPromoted IfaceNormalTyCon
+
+{-
+Note [Sharing IfaceTyConInfo]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+'IfaceTyConInfo' occurs an awful lot in 'ModIface', see #19194 for an example.
+But almost all of them are
+
+ IfaceTyConInfo IsPromoted IfaceNormalTyCon
+ IfaceTyConInfo NotPromoted IfaceNormalTyCon.
+
+The smart constructor `mkIfaceTyConInfo` arranges to share these instances,
+thus:
+
+ promotedNormalTyConInfo = IfaceTyConInfo IsPromoted IfaceNormalTyCon
+ notPromotedNormalTyConInfo = IfaceTyConInfo NotPromoted IfaceNormalTyCon
+
+ mkIfaceTyConInfo IsPromoted IfaceNormalTyCon = promotedNormalTyConInfo
+ mkIfaceTyConInfo NotPromoted IfaceNormalTyCon = notPromotedNormalTyConInfo
+ mkIfaceTyConInfo prom sort = IfaceTyConInfo prom sort
+
+But ALAS, the (nested) CPR transform can lose this sharing, completely
+negating the effect of `mkIfaceTyConInfo`: see #24530 and #19326.
+
+Sticking-plaster solution: add a NOINLINE pragma to those top-level constants.
+When we fix the CPR bug we can remove the NOINLINE pragmas.
+
+This one change leads to an 15% reduction in residency for GHC when embedding
+'mi_extra_decls': see !12222.
+-}
data IfaceMCoercion
= IfaceMRefl
=====================================
compiler/GHC/Settings/IO.hs
=====================================
@@ -16,9 +16,11 @@ import GHC.Utils.CliOption
import GHC.Utils.Fingerprint
import GHC.Platform
import GHC.Utils.Panic
+import GHC.ResponseFile
import GHC.Settings
import GHC.SysTools.BaseDir
+import Data.Char
import Control.Monad.Trans.Except
import Control.Monad.IO.Class
import qualified Data.Map as Map
@@ -72,9 +74,13 @@ initSettings top_dir = do
-- just partially applying those functions and throwing 'Left's; they're
-- written in a very portable style to keep ghc-boot light.
let getSetting key = either pgmError pure $
- getRawFilePathSetting top_dir settingsFile mySettings key
+ -- Escape the 'top_dir', to make sure we don't accidentally introduce an
+ -- unescaped space
+ getRawFilePathSetting (escapeArg top_dir) settingsFile mySettings key
getToolSetting :: String -> ExceptT SettingsError m String
- getToolSetting key = expandToolDir useInplaceMinGW mtool_dir <$> getSetting key
+ -- Escape the 'mtool_dir', to make sure we don't accidentally introduce
+ -- an unescaped space
+ getToolSetting key = expandToolDir useInplaceMinGW (fmap escapeArg mtool_dir) <$> getSetting key
targetPlatformString <- getSetting "target platform string"
cc_prog <- getToolSetting "C compiler command"
cxx_prog <- getToolSetting "C++ compiler command"
@@ -91,10 +97,10 @@ initSettings top_dir = do
let unreg_cc_args = if platformUnregisterised platform
then ["-DNO_REGS", "-DUSE_MINIINTERPRETER"]
else []
- cpp_args = map Option (words cpp_args_str)
- hs_cpp_args = map Option (words hs_cpp_args_str)
- cc_args = words cc_args_str ++ unreg_cc_args
- cxx_args = words cxx_args_str
+ cpp_args = map Option (unescapeArgs cpp_args_str)
+ hs_cpp_args = map Option (unescapeArgs hs_cpp_args_str)
+ cc_args = unescapeArgs cc_args_str ++ unreg_cc_args
+ cxx_args = unescapeArgs cxx_args_str
-- The extra flags we need to pass gcc when we invoke it to compile .hc code.
--
@@ -135,12 +141,12 @@ initSettings top_dir = do
let as_prog = cc_prog
as_args = map Option cc_args
ld_prog = cc_prog
- ld_args = map Option (cc_args ++ words cc_link_args_str)
+ ld_args = map Option (cc_args ++ unescapeArgs cc_link_args_str)
ld_r_prog <- getToolSetting "Merge objects command"
ld_r_args <- getToolSetting "Merge objects flags"
let ld_r
| null ld_r_prog = Nothing
- | otherwise = Just (ld_r_prog, map Option $ words ld_r_args)
+ | otherwise = Just (ld_r_prog, map Option $ unescapeArgs ld_r_args)
llvmTarget <- getSetting "LLVM target"
@@ -261,3 +267,19 @@ getTargetPlatform settingsFile settings = do
, platformHasLibm = targetHasLibm
, platform_constants = Nothing -- will be filled later when loading (or building) the RTS unit
}
+
+-- ----------------------------------------------------------------------------
+-- Escape Args helpers
+-- ----------------------------------------------------------------------------
+
+-- | Just like 'GHC.ResponseFile.escapeArg', but it is not exposed from base.
+escapeArg :: String -> String
+escapeArg = reverse . foldl' escape []
+
+escape :: String -> Char -> String
+escape cs c
+ | isSpace c
+ || '\\' == c
+ || '\'' == c
+ || '"' == c = c:'\\':cs -- n.b., our caller must reverse the result
+ | otherwise = c:cs
=====================================
hadrian/src/Rules/Generate.hs
=====================================
@@ -6,6 +6,7 @@ module Rules.Generate (
) where
import Development.Shake.FilePath
+import Data.Char (isSpace)
import qualified Data.Set as Set
import Base
import qualified Context
@@ -416,7 +417,7 @@ generateSettings = do
, ("Use interpreter", expr $ yesNo <$> ghcWithInterpreter)
, ("Support SMP", expr $ yesNo <$> targetSupportsSMP)
- , ("RTS ways", unwords . map show . Set.toList <$> getRtsWays)
+ , ("RTS ways", escapeArgs . map show . Set.toList <$> getRtsWays)
, ("Tables next to code", queryTarget (yesNo . tgtTablesNextToCode))
, ("Leading underscore", queryTarget (yesNo . tgtSymbolsHaveLeadingUnderscore))
, ("Use LibFFI", expr $ yesNo <$> useLibffiForAdjustors)
@@ -431,23 +432,23 @@ generateSettings = do
++ ["]"]
where
ccPath = prgPath . ccProgram . tgtCCompiler
- ccFlags = unwords . prgFlags . ccProgram . tgtCCompiler
+ ccFlags = escapeArgs . prgFlags . ccProgram . tgtCCompiler
cxxPath = prgPath . cxxProgram . tgtCxxCompiler
- cxxFlags = unwords . prgFlags . cxxProgram . tgtCxxCompiler
- clinkFlags = unwords . prgFlags . ccLinkProgram . tgtCCompilerLink
+ cxxFlags = escapeArgs . prgFlags . cxxProgram . tgtCxxCompiler
+ clinkFlags = escapeArgs . prgFlags . ccLinkProgram . tgtCCompilerLink
linkSupportsNoPie = yesNo . ccLinkSupportsNoPie . tgtCCompilerLink
cppPath = prgPath . cppProgram . tgtCPreprocessor
- cppFlags = unwords . prgFlags . cppProgram . tgtCPreprocessor
+ cppFlags = escapeArgs . prgFlags . cppProgram . tgtCPreprocessor
hsCppPath = prgPath . hsCppProgram . tgtHsCPreprocessor
- hsCppFlags = unwords . prgFlags . hsCppProgram . tgtHsCPreprocessor
+ hsCppFlags = escapeArgs . prgFlags . hsCppProgram . tgtHsCPreprocessor
mergeObjsPath = maybe "" (prgPath . mergeObjsProgram) . tgtMergeObjs
- mergeObjsFlags = maybe "" (unwords . prgFlags . mergeObjsProgram) . tgtMergeObjs
+ mergeObjsFlags = maybe "" (escapeArgs . prgFlags . mergeObjsProgram) . tgtMergeObjs
linkSupportsSingleModule = yesNo . ccLinkSupportsSingleModule . tgtCCompilerLink
linkSupportsFilelist = yesNo . ccLinkSupportsFilelist . tgtCCompilerLink
linkSupportsCompactUnwind = yesNo . ccLinkSupportsCompactUnwind . tgtCCompilerLink
linkIsGnu = yesNo . ccLinkIsGnu . tgtCCompilerLink
arPath = prgPath . arMkArchive . tgtAr
- arFlags = unwords . prgFlags . arMkArchive . tgtAr
+ arFlags = escapeArgs . prgFlags . arMkArchive . tgtAr
arSupportsAtFile' = yesNo . arSupportsAtFile . tgtAr
arSupportsDashL' = yesNo . arSupportsDashL . tgtAr
ranlibPath = maybe "" (prgPath . ranlibProgram) . tgtRanlib
@@ -571,3 +572,19 @@ generatePlatformHostHs = do
, "hostPlatformArchOS :: ArchOS"
, "hostPlatformArchOS = ArchOS hostPlatformArch hostPlatformOS"
]
+
+-- | Just like 'GHC.ResponseFile.escapeArgs', but use spaces instead of newlines
+-- for splitting elements.
+escapeArgs :: [String] -> String
+escapeArgs = unwords . map escapeArg
+
+escapeArg :: String -> String
+escapeArg = reverse . foldl' escape []
+
+escape :: String -> Char -> String
+escape cs c
+ | isSpace c
+ || '\\' == c
+ || '\'' == c
+ || '"' == c = c:'\\':cs -- n.b., our caller must reverse the result
+ | otherwise = c:cs
=====================================
test.hs
=====================================
@@ -0,0 +1,14 @@
+import Data.Char
+import Data.Foldable
+-- | Just like 'GHC.ResponseFile.escapeArg', but it is not exposed from base.
+escapeArg :: String -> String
+escapeArg = reverse . foldl' escape []
+
+escape :: String -> Char -> String
+escape cs c
+ | isSpace c
+ || '\\' == c
+ || '\'' == c
+ || '"' == c = c:'\\':cs -- n.b., our caller must reverse the result
+ | otherwise = c:cs
+
=====================================
testsuite/tests/ghc-api/settings-escape/T11938.hs
=====================================
@@ -0,0 +1,73 @@
+
+import GHC.Settings
+import GHC.Settings.IO
+import GHC.Utils.CliOption (Option, showOpt)
+
+import Control.Monad.Trans.Except (runExceptT)
+import Data.Maybe (fromJust)
+import System.Directory (makeAbsolute)
+import System.IO (hPutStrLn, stderr)
+import System.Exit (exitWith, ExitCode(ExitFailure))
+
+-- Precondition: this test case must be executed in a directory with a space.
+main :: IO ()
+main = do
+ topDir <- makeAbsolute "./ghc-install-folder/lib"
+ settingsm <- runExceptT $ initSettings topDir
+
+ case settingsm of
+ Left (SettingsError_MissingData msg) -> do
+ hPutStrLn stderr $ "WARNING: " ++ show msg
+ hPutStrLn stderr $ "dont know target platform"
+ exitWith $ ExitFailure 1
+ Left (SettingsError_BadData msg) -> do
+ hPutStrLn stderr msg
+ exitWith $ ExitFailure 1
+ Right settings -> do
+ let
+ recordSetting :: String -> (Settings -> [String]) -> IO ()
+ recordSetting label selector = do
+ let opts = selector settings
+ -- At least one of the options must contain a space
+ containsSpaces = any (' ' `elem`) opts
+ hPutStrLn stderr $ "=== Number of '" <> label <> "' options: " ++ show (length opts)
+ hPutStrLn stderr $ " Contains spaces: " ++ show containsSpaces
+
+ recordFpSetting :: String -> (Settings -> String) -> IO ()
+ recordFpSetting label selector = do
+ let fp = selector settings
+ containsOnlyEscapedSpaces ('\\':' ':xs) = containsOnlyEscapedSpaces xs
+ containsOnlyEscapedSpaces (' ':_) = False
+ containsOnlyEscapedSpaces [] = True
+ containsOnlyEscapedSpaces (_:xs) = containsOnlyEscapedSpaces xs
+
+ -- Filepath may only contain escaped spaces
+ containsSpaces = containsOnlyEscapedSpaces fp
+ hPutStrLn stderr $ "=== FilePath '" <> label <> "' contains only escaped spaces: " ++ show containsSpaces
+
+ -- Assertions
+ -- Assumption: this test case is executed in a directory with a space.
+
+ -- Setting 'Haskell CPP flags' contains '$topdir' and '$tooldir' references.
+ -- Resolving those while containing spaces, should not introduce more options.
+ -- '$tooldir' will only be expanded in windows, while '$topdir' is always expanded.
+ recordSetting "Haskell CPP flags" (map showOpt . snd . toolSettings_pgm_P . sToolSettings)
+ -- Setting 'C compiler flags' contains strings with spaces.
+ -- GHC should not split these by word.
+ recordSetting "C compiler flags" (toolSettings_opt_c . sToolSettings)
+ -- Setting 'C compiler link flags' contains strings with spaces.
+ -- GHC should not split these by word.
+ recordSetting "C compiler link flags" (map showOpt . snd . toolSettings_pgm_l . sToolSettings)
+ -- Setting 'C++ compiler flags' contains strings with spaces.
+ -- GHC should not split these by word.
+ recordSetting "C++ compiler flags" (toolSettings_opt_cxx . sToolSettings)
+ -- Setting 'CPP Flags' contains strings with spaces.
+ -- GHC should not split these by word.
+ recordSetting "CPP Flags" (map showOpt . snd . toolSettings_pgm_cpp . sToolSettings)
+ -- Setting 'Merge objects flags' contains strings with spaces.
+ -- GHC should not split these by word.
+ -- We know in this case, that 'toolSettings_pgm_lm' is 'Just'
+ recordSetting "Merge objects flags" (map showOpt . snd . fromJust . toolSettings_pgm_lm . sToolSettings)
+ -- Setting 'unlit command' contains '$topdir' reference.
+ -- Resolving those while containing spaces, should be escaped correctly.
+ recordFpSetting "unlit command" (toolSettings_pgm_L . sToolSettings)
=====================================
testsuite/tests/ghc-api/settings-escape/T11938.stderr
=====================================
@@ -0,0 +1,13 @@
+=== Number of 'Haskell CPP flags' options: 5
+ Contains spaces: True
+=== Number of 'C compiler flags' options: 3
+ Contains spaces: True
+=== Number of 'C compiler link flags' options: 7
+ Contains spaces: True
+=== Number of 'C++ compiler flags' options: 2
+ Contains spaces: True
+=== Number of 'CPP Flags' options: 3
+ Contains spaces: True
+=== Number of 'Merge objects flags' options: 3
+ Contains spaces: True
+=== FilePath 'unlit command' contains only escaped spaces: True
=====================================
testsuite/tests/ghc-api/settings-escape/all.T
=====================================
@@ -0,0 +1 @@
+test('T11938', [normal, extra_files(['ghc-install-folder/'])], compile_and_run, ['-package ghc -package directory -package transformers'])
=====================================
testsuite/tests/ghc-api/settings-escape/ghc-install-folder/lib/settings
=====================================
@@ -0,0 +1,51 @@
+[("C compiler command", "/nix/store/lcf37pgp3rgww67v9x2990hbfwx96c1w-gcc-wrapper-12.2.0/bin/cc")
+,("C compiler flags", "-O2 \"-some option\" -some\\ other")
+,("C++ compiler command", "/nix/store/lcf37pgp3rgww67v9x2990hbfwx96c1w-gcc-wrapper-12.2.0/bin/g++")
+,("C++ compiler flags", "\"-some option\" -some\\ other")
+,("C compiler link flags", "-fuse-ld=gold -Wl,--no-as-needed \"-some option\" -some\\ other")
+,("C compiler supports -no-pie", "YES")
+,("CPP command", "/nix/store/lcf37pgp3rgww67v9x2990hbfwx96c1w-gcc-wrapper-12.2.0/bin/cc")
+,("CPP flags", "-E \"-some option\" -some\\ other")
+,("Haskell CPP command", "/nix/store/lcf37pgp3rgww67v9x2990hbfwx96c1w-gcc-wrapper-12.2.0/bin/cc")
+,("Haskell CPP flags", "-E -undef -traditional -I$topdir/ -I$tooldir/")
+,("ld supports compact unwind", "NO")
+,("ld supports filelist", "NO")
+,("ld supports single module", "NO")
+,("ld is GNU ld", "YES")
+,("Merge objects command", "/nix/store/lcf37pgp3rgww67v9x2990hbfwx96c1w-gcc-wrapper-12.2.0/bin/ld.gold")
+,("Merge objects flags", "-r \"-some option\" -some\\ other")
+,("Merge objects supports response files", "YES")
+,("ar command", "/nix/store/lcf37pgp3rgww67v9x2990hbfwx96c1w-gcc-wrapper-12.2.0/bin/ar")
+,("ar flags", "q")
+,("ar supports at file", "YES")
+,("ar supports -L", "NO")
+,("ranlib command", "/nix/store/lcf37pgp3rgww67v9x2990hbfwx96c1w-gcc-wrapper-12.2.0/bin/ranlib")
+,("otool command", "otool")
+,("install_name_tool command", "install_name_tool")
+,("touch command", "touch")
+,("windres command", "/bin/false")
+,("unlit command", "$topdir/../bin/unlit")
+,("cross compiling", "NO")
+,("target platform string", "x86_64-unknown-linux")
+,("target os", "OSLinux")
+,("target arch", "ArchX86_64")
+,("target word size", "8")
+,("target word big endian", "NO")
+,("target has GNU nonexec stack", "YES")
+,("target has .ident directive", "YES")
+,("target has subsections via symbols", "NO")
+,("target has libm", "YES")
+,("Unregisterised", "NO")
+,("LLVM target", "x86_64-unknown-linux")
+,("LLVM llc command", "llc")
+,("LLVM opt command", "opt")
+,("LLVM llvm-as command", "clang")
+,("Use inplace MinGW toolchain", "NO")
+,("Use interpreter", "YES")
+,("Support SMP", "YES")
+,("RTS ways", "v thr thr_debug thr_debug_dyn thr_dyn debug debug_dyn dyn")
+,("Tables next to code", "YES")
+,("Leading underscore", "NO")
+,("Use LibFFI", "NO")
+,("RTS expects libdw", "YES")
+]
=====================================
testsuite/tests/ghc-api/settings-escape/ghc-install-folder/mingw/.gitkeep
=====================================
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/e0a1d5202af25a49398f266cc011ac281ff8f914...9848d58006f7361b847e4991d899fdc5a06d75ca
--
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/e0a1d5202af25a49398f266cc011ac281ff8f914...9848d58006f7361b847e4991d899fdc5a06d75ca
You're receiving this email because of your account on gitlab.haskell.org.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20240314/9be5c5cb/attachment-0001.html>
More information about the ghc-commits
mailing list