first stab at -ffunction-sections

William Lee Irwin III
Mon, 18 Nov 2002 16:12:55 -0800

On Mon, Nov 18, 2002 at 04:57:50PM -0000, Simon Marlow wrote:
> You should be aware that -split-objs (the trick we use to build our
> libraries in lots of little bits) gets most of the benefit that you'd
> get from using -ffunction-sections.  You might get slightly more
> fine-grained discarding of code with -ffunction-sections, but the effect
> won't be dramatic (I'm guessing).  Also there's the issues of telling
> the garbage collector and the mangler about it.
> However, it would be nice to be able to use
> -ffunction-sections/--gc-sections instead of -split-objs.  It's been at
> the back of my mind to have a go at this someday...

I noticed a lot of not-obviously-used stuff brought in from various
libraries and wanted to nuke some of the unneeded things. Step 1 was
trying to compile the libraries with the option, which didn't quite
fly... it looks like ghc-asm is the primary sufferer, and I'm not sure
the compiler option is needed...

    $T_DOT_WORD     = '\.(long|value|byte|zero)';
    $T_DOT_GLOBAL   = '\.globl';
    $T_HDR_literal  = "\.section\t\.rodata\n"; # or just use .text??? (WDP 95/11
    $T_HDR_misc     = "\.text\n\t\.align 4\n";
    $T_HDR_data     = "\.data\n\t\.align 4\n"; # ToDo: change align??
    $T_HDR_consist  = "\.text\n";
    $T_HDR_closure  = "\.data\n\t\.align 4\n"; # ToDo: change align?
    $T_HDR_srt      = "\.text\n\t\.align 4\n"; # ToDo: change align?
    $T_HDR_info     = "\.text\n\t\.align 4\n"; # NB: requires padding
    $T_HDR_entry    = "\.text\n"; # no .align so we're right next to _info (argu
ably wrong...?)
    $T_HDR_fast     = "\.text\n\t\.align 4\n";
    $T_HDR_vector   = "\.text\n\t\.align 4\n"; # NB: requires padding
    $T_HDR_direct   = "\.text\n\t\.align 4\n";
    $T_create_word  = "\t.word";

So basically it's tagging various items directly with the section they
go to as it is, so a compiler option may very well be superfluous; just
hack ghc-asm. Adding some extra goo in mangle_asm() to get the section
name to be related to the symbol doesn't look too bad.

-split-objs I didn't really realize was there. I see (tracing through ghc5,
whatever debian's latest shipping version is):

  ,  ( "split-objs"     , NoArg (if can_split
                                    then do writeIORef v_Split_object_files True
                                            add v_Opt_C "-fglobalise-toplev-name
                                    else hPutStrLn stderr
                                            "warning: don't know how to  split \
                                            \object files on this architecture"
                                ) )

opt_EnsureSplittableC           = lookUp  FSLIT("-fglobalise-toplev-names")

genConInfo comp_info data_con
  =     -- Order of things is to reduce forward references
    mkAbstractCs [if opt_EnsureSplittableC then CSplitMarker else AbsCNop,

maybeExternaliseId id
  | opt_EnsureSplittableC,      -- Externalise the name for -split-objs
    isInternalName name
  = moduleName                           `thenFC` \ mod ->
    returnFC (setIdName id (mkExternalName uniq mod new_occ (nameSrcLoc name)))

  | opt_EnsureSplittableC = CSplitMarker
  | otherwise             = AbsCNop

 gencode CSplitMarker
   | opt_EnsureSplittableC = returnUs (\xs -> StLabel mkSplitMarkerLabel : xs)
   | otherwise             = returnUs id

mkSplitMarkerLabel              = RtsLabel (Rts_Code "__stg_split_marker")

Then in ghc/driver/split/ghc-split.lprl:287 (there's actually one per arch):

    # strip the marker

    $str =~ s/(\.text\n\t\.align .(,0x90)?\n)\.globl\s+.*_?__stg_split_marker.*\
    $str =~ s/(\t\.align .(,0x90)?\n)\.globl\s+.*_?__stg_split_marker.*\n/$1/;  


So to me it looks feasible to figure out who's fooling with these
things, though it's probably not necessary to do any of this within the
compiler except for whatever might circumvent ghc-asm, if anything.

At any rate, I am finding the amount of unused code/data linked into
the generated executables significant... for instance, in a non-
concurrent program:

080a1c64 D MVar_modifyMVarzu_closure
0805aeb8 T MVar_modifyMVarzu_entry
0805aece T MVar_modifyMVarzu_fast3
0805aeb8 T MVar_modifyMVarzu_info

... and as it's a 9-line script to mangle patches, it's certainly not
using this:

0805b140 T __stginit_PosixDB

The idea with -ffunction-sections or brewing up an equivalent is to
build the libraries with it so when the final executable is linked, it
imports only the code and statically-allocated data it uses from them.