[Git][ghc/ghc][wip/local-symbols-2] 2 commits: codeGen: Produce local symbols for module-internal functions

Ben Gamari gitlab at gitlab.haskell.org
Mon Oct 26 21:22:05 UTC 2020



Ben Gamari pushed to branch wip/local-symbols-2 at Glasgow Haskell Compiler / GHC


Commits:
49c69308 by Ben Gamari at 2020-10-26T17:21:57-04:00
codeGen: Produce local symbols for module-internal functions

It turns out that some important native debugging/profiling tools (e.g.
perf) rely only on symbol tables for function name resolution (as
opposed to using DWARF DIEs). However, previously GHC would emit
temporary symbols (e.g. `.La42b`) to identify module-internal
entities. Such symbols are dropped during linking and therefore not
visible to runtime tools (in addition to having rather un-helpful unique
names). For instance, `perf report` would often end up attributing all
cost to the libc `frame_dummy` symbol since Haskell code was no covered
by any proper symbol (see #17605).

We now rather follow the model of C compilers and emit
descriptively-named local symbols for module internal things. Since this
will increase object file size this behavior can be disabled with the
`-fno-expose-all-symbols` flag.

With this `perf record` can finally be used against Haskell executables.
Even more, with `-g3` `perf annotate` provides inline source code.

- - - - -
6a977ea5 by Ben Gamari at 2020-10-26T17:21:57-04:00
Enable -fexpose-all-symbols when debug level is set

- - - - -


9 changed files:

- compiler/GHC/Cmm/CLabel.hs
- compiler/GHC/CmmToAsm.hs
- compiler/GHC/CmmToAsm/Config.hs
- compiler/GHC/CmmToAsm/Monad.hs
- compiler/GHC/CmmToAsm/X86/Ppr.hs
- compiler/GHC/Driver/Flags.hs
- compiler/GHC/Driver/Session.hs
- docs/users_guide/debug-info.rst
- docs/users_guide/phases.rst


Changes:

=====================================
compiler/GHC/Cmm/CLabel.hs
=====================================
@@ -118,6 +118,7 @@ module GHC.Cmm.CLabel (
         LabelStyle (..),
         pprDebugCLabel,
         pprCLabel,
+        ppInternalProcLabel,
 
         -- * Others
         dynamicLinkerLabelInfo,
@@ -1360,6 +1361,39 @@ pprCLabel platform sty lbl =
    CmmLabel _ _ fs CmmRet      -> maybe_underscore $ ftext fs <> text "_ret"
    CmmLabel _ _ fs CmmClosure  -> maybe_underscore $ ftext fs <> text "_closure"
 
+-- Note [Internal proc labels]
+-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~
+--
+-- Some tools (e.g. the `perf` utility on Linux) rely on the symbol table
+-- for resolution of function names. To help these tools we provide the
+-- (enabled by default) -fexpose-all-symbols flag which causes GHC to produce
+-- symbols even for symbols with are internal to a module (although such
+-- symbols will have only local linkage).
+--
+-- Note that these labels are *not* referred to by code. They are strictly for
+-- diagnostics purposes.
+--
+-- To avoid confusion, it is desireable to add a module-qualifier to the
+-- symbol name. However, the Name type's Internal constructor doesn't carry
+-- knowledge of the current Module. Consequently, we have to pass this around
+-- explicitly.
+
+-- | Generate a label for a procedure internal to a module (if
+-- 'Opt_ExposeAllSymbols' is enabled).
+-- See Note [Internal proc labels].
+ppInternalProcLabel :: Module     -- ^ the current module
+                    -> CLabel
+                    -> Maybe SDoc -- ^ the internal proc label
+ppInternalProcLabel this_mod (IdLabel nm _ flavour)
+  | isInternalName nm
+  = Just
+     $ text "_" <> ppr this_mod
+    <> char '_'
+    <> ztext (zEncodeFS (occNameFS (occName nm)))
+    <> char '_'
+    <> pprUniqueAlways (getUnique nm)
+    <> ppIdFlavor flavour
+ppInternalProcLabel _ _ = Nothing
 
 ppIdFlavor :: IdLabelInfo -> SDoc
 ppIdFlavor x = pp_cSEP <> case x of


=====================================
compiler/GHC/CmmToAsm.hs
=====================================
@@ -1192,5 +1192,6 @@ initNCGConfig dflags this_mod = NCGConfig
    , ncgDwarfEnabled        = debugLevel dflags > 0
    , ncgDwarfUnwindings     = debugLevel dflags >= 1
    , ncgDwarfStripBlockInfo = debugLevel dflags < 2 -- We strip out block information when running with -g0 or -g1.
+   , ncgExposeInternalSymbols = gopt Opt_ExposeInternalSymbols dflags
    }
 


=====================================
compiler/GHC/CmmToAsm/Config.hs
=====================================
@@ -18,8 +18,7 @@ import GHC.Utils.Outputable
 data NCGConfig = NCGConfig
    { ncgPlatform              :: !Platform        -- ^ Target platform
    , ncgAsmContext            :: !SDocContext     -- ^ Context for ASM code generation
-   , ncgThisModule            :: !Module          -- ^ The name of the module we are currently compiling (for generating debug information)
-                                                  -- See Note [Internal proc labels] in CLabel.
+   , ncgThisModule            :: !Module          -- ^ The name of the module we are currently compiling
    , ncgProcAlignment         :: !(Maybe Int)     -- ^ Mandatory proc alignment
    , ncgExternalDynamicRefs   :: !Bool            -- ^ Generate code to link against dynamic libraries
    , ncgPIC                   :: !Bool            -- ^ Enable Position-Independent Code
@@ -40,6 +39,7 @@ data NCGConfig = NCGConfig
    , ncgDwarfEnabled          :: !Bool            -- ^ Enable Dwarf generation
    , ncgDwarfUnwindings       :: !Bool            -- ^ Enable unwindings
    , ncgDwarfStripBlockInfo   :: !Bool            -- ^ Strip out block information from generated Dwarf
+   , ncgExposeInternalSymbols :: !Bool            -- ^ Expose symbol table entries for internal symbols
    }
 
 -- | Return Word size


=====================================
compiler/GHC/CmmToAsm/Monad.hs
=====================================
@@ -80,6 +80,8 @@ data NcgImpl statics instr jumpDest = NcgImpl {
     canShortcut               :: instr -> Maybe jumpDest,
     shortcutStatics           :: (BlockId -> Maybe jumpDest) -> statics -> statics,
     shortcutJump              :: (BlockId -> Maybe jumpDest) -> instr -> instr,
+    -- | 'Module' is only for printing internal labels. See Note [Internal proc
+    -- labels] in CLabel.
     pprNatCmmDecl             :: NatCmmDecl statics instr -> SDoc,
     maxSpillSlots             :: Int,
     allocatableRegs           :: [RealReg],


=====================================
compiler/GHC/CmmToAsm/X86/Ppr.hs
=====================================
@@ -90,6 +90,7 @@ pprNatCmmDecl config proc@(CmmProc top_info lbl _ (ListGraph blocks)) =
         -- special case for code without info table:
         pprSectionAlign config (Section Text lbl) $$
         pprProcAlignment config $$
+        pprProcLabel config lbl $$
         pprLabel platform lbl $$ -- blocks guaranteed not null, so label needed
         vcat (map (pprBasicBlock config top_info) blocks) $$
         (if ncgDwarfEnabled config
@@ -99,6 +100,7 @@ pprNatCmmDecl config proc@(CmmProc top_info lbl _ (ListGraph blocks)) =
     Just (CmmStaticsRaw info_lbl _) ->
       pprSectionAlign config (Section Text info_lbl) $$
       pprProcAlignment config $$
+      pprProcLabel config lbl $$
       (if platformHasSubsectionsViaSymbols platform
           then pdoc platform (mkDeadStripPreventer info_lbl) <> char ':'
           else empty) $$
@@ -114,6 +116,15 @@ pprNatCmmDecl config proc@(CmmProc top_info lbl _ (ListGraph blocks)) =
        else empty) $$
       pprSizeDecl platform info_lbl
 
+-- | Output an internal proc label. See Note [Internal proc labels] in CLabel.
+pprProcLabel :: NCGConfig -> CLabel -> SDoc
+pprProcLabel config lbl
+  | ncgExposeInternalSymbols config
+  , Just lbl' <- ppInternalProcLabel (ncgThisModule config) lbl
+  = lbl' <> char ':'
+  | otherwise
+  = empty
+
 -- | Output the ELF .size directive.
 pprSizeDecl :: Platform -> CLabel -> SDoc
 pprSizeDecl platform lbl


=====================================
compiler/GHC/Driver/Flags.hs
=====================================
@@ -274,6 +274,7 @@ data GeneralFlag
    -- forwards all -L flags to the collect2 command without using a
    -- response file and as such breaking apart.
    | Opt_SingleLibFolder
+   | Opt_ExposeInternalSymbols
    | Opt_KeepCAFs
    | Opt_KeepGoing
    | Opt_ByteCode


=====================================
compiler/GHC/Driver/Session.hs
=====================================
@@ -376,7 +376,6 @@ import qualified GHC.LanguageExtensions as LangExt
 -- -----------------------------------------------------------------------------
 -- DynFlags
 
-
 -- | Used to differentiate the scope an include needs to apply to.
 -- We have to split the include paths to avoid accidentally forcing recursive
 -- includes since -I overrides the system search paths. See #14312.
@@ -3404,6 +3403,7 @@ fFlagsDeps = [
   flagSpec "error-spans"                      Opt_ErrorSpans,
   flagSpec "excess-precision"                 Opt_ExcessPrecision,
   flagSpec "expose-all-unfoldings"            Opt_ExposeAllUnfoldings,
+  flagSpec "expose-internal-symbols"          Opt_ExposeInternalSymbols,
   flagSpec "external-dynamic-refs"            Opt_ExternalDynamicRefs,
   flagSpec "external-interpreter"             Opt_ExternalInterpreter,
   flagSpec "flat-cache"                       Opt_FlatCache,
@@ -4392,7 +4392,13 @@ setVerbosity :: Maybe Int -> DynP ()
 setVerbosity mb_n = upd (\dfs -> dfs{ verbosity = mb_n `orElse` 3 })
 
 setDebugLevel :: Maybe Int -> DynP ()
-setDebugLevel mb_n = upd (\dfs -> dfs{ debugLevel = mb_n `orElse` 2 })
+setDebugLevel mb_n =
+  upd (\dfs -> exposeSyms $ dfs{ debugLevel = n })
+  where
+    n = mb_n `orElse` 2
+    exposeSyms
+      | n > 0     = setGeneralFlag' Opt_ExposeInternalSymbols
+      | otherwise = id
 
 data PkgDbRef
   = GlobalPkgDb


=====================================
docs/users_guide/debug-info.rst
=====================================
@@ -14,6 +14,7 @@ useable by most UNIX debugging tools.
     :category: debugging
 
     :since: 7.10, numeric levels since 8.0
+    :implies: :ghc-flag:`-fexpose-all-symbols`
 
     Emit debug information in object code. Currently only DWARF debug
     information is supported on x86-64 and i386. Currently debug levels 0


=====================================
docs/users_guide/phases.rst
=====================================
@@ -720,6 +720,20 @@ Options affecting code generation
     all target platforms. See the :ghc-flag:`--print-object-splitting-supported`
     flag to check whether your GHC supports object splitting.
 
+.. ghc-flag:: -fexpose-internal-symbols
+    :shortdesc: Produce symbols for all functions, including internal functions.
+    :type: dynamic
+    :category: codegen
+
+    Request that GHC emits verbose symbol tables which include local symbols
+    for module-internal functions. These can be useful for tools like
+    :ref:`perf <https://perf.wiki.kernel.org/>` but increase object file sizes.
+    This is implied by :ghc-flag:`-g2` and above.
+
+    :ghc-flag:`-fno-expose-internal-symbols` suppresses all non-global symbol
+    table entries, resulting in smaller object file sizes at the expense of
+    debuggability.
+
 .. _options-linker:
 
 Options affecting linking



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/566ebbd9ddd7695ce4cb1939d917747a769bc32b...6a977ea5f9ac560568a9a3e5bfa90c0936b25cb9

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/566ebbd9ddd7695ce4cb1939d917747a769bc32b...6a977ea5f9ac560568a9a3e5bfa90c0936b25cb9
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20201026/7a2b1a12/attachment-0001.html>


More information about the ghc-commits mailing list