[GHC] #12891: Automate symbols inclusion in RtsSymbols.c from Rts.h

Tue Apr 3 22:03:31 UTC 2018

#12891: Automate symbols inclusion in RtsSymbols.c from Rts.h
-------------------------------------+-------------------------------------
        Reporter:  Phyx-             |                Owner:  bgamari
            Type:  task              |               Status:  new
        Priority:  normal            |            Milestone:  8.6.1
       Component:  Build System      |              Version:  8.0.1
      Resolution:                    |             Keywords:  newcomer
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  None/Unknown      |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:  #12846            |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by Phyx-):

 Hi,

 Thanks for looking into this again.

 So to give some context, not all symbols in set `B` are actually
 important. `rtsSyms` exists to work around the fact that the runtime
 linker itself runs inside a Haskell process: we cannot have two
 versions of the RTS loaded in one process.

 This means we cannot link against `libRTS` to provide the symbols that a
 Haskell program requires when loading Haskell libraries. Instead,
 `rtsSyms` provides a set of symbols that are resolved from the running
 Haskell program, effectively giving the code just loaded access to the
 running RTS.
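
 A minimal sketch of the idea above, not the actual contents of
 `RtsSymbols.c`: the running process keeps a table that maps symbol names
 to addresses it already has, and the runtime linker resolves references
 in freshly loaded code against that table. The names `SymbolEntry`,
 `demo_syms`, `lookup_symbol` and the two `demo_rts_*` functions are
 hypothetical and exist only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Generic function-pointer type used to store any entry point. */
typedef void (*GenericFn)(void);

/* Hypothetical table entry: a symbol name and its address in this process. */
typedef struct {
    const char *name;
    GenericFn   addr;
} SymbolEntry;

/* Stand-ins for functions that live in the already-running runtime. */
void demo_rts_alloc(void) {}
void demo_rts_collect(void) {}

/* Symbols the running process offers to loaded code.
   The table is terminated by an entry whose name is NULL. */
const SymbolEntry demo_syms[] = {
    { "demo_rts_alloc",   demo_rts_alloc },
    { "demo_rts_collect", demo_rts_collect },
    { NULL, NULL }
};

/* Resolve a name against the table; returns NULL when the name is absent. */
GenericFn lookup_symbol(const SymbolEntry *table, const char *name)
{
    for (const SymbolEntry *e = table; e->name != NULL; e++) {
        if (strcmp(e->name, name) == 0) {
            return e->addr;
        }
    }
    return NULL;
}
```

 A name that is not in the table (for instance a standard library function
 the user may want to supply themselves) simply fails to resolve here and
 has to come from somewhere else.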

 Now why don't we just export all symbols from the running process? Because
 we don't want to force the user to use the same implementation of
 standard functions that we have chosen and, more importantly, we don't
 want to conflict if the user does specify another implementation. We
 actually do have this specific case: those are the symbols exported by the
 `SymI_HasProto_deprecated` macro, but these are a manually curated set. So
 essentially set `B` should be all `SymI` values excluding the
 `SymI_HasProto_deprecated` ones.
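
 The rule stated above ("all `SymI` values excluding the deprecated ones")
 amounts to a simple filter. The sketch below assumes each symbol has
 already been tagged with how it was declared; the types `SymKind` and
 `TaggedSym` and the function `select_set_b` are hypothetical, and how the
 tags would be extracted from the RTS sources is not shown.

```c
#include <stddef.h>

/* Hypothetical tag: was the symbol declared via the deprecated macro? */
typedef enum { SYM_REGULAR, SYM_DEPRECATED } SymKind;

typedef struct {
    const char *name;
    SymKind     kind;
} TaggedSym;

/* Copy the names of all non-deprecated entries into out, preserving order.
   out must have room for n pointers. Returns the number of names kept. */
size_t select_set_b(const TaggedSym *all, size_t n, const char **out)
{
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (all[i].kind != SYM_DEPRECATED) {
            out[kept++] = all[i].name;
        }
    }
    return kept;
}
```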

 In this particular case it doesn't matter much, but it is useful to know
 what the symbols do. Now why only `SymI`? Because in certain cases, like
 when the code is dynamically linked, the symbols can simply be obtained
 from the dynamically loaded shared libraries. In that case the runtime
 linker is not the one providing the symbols, but we do know that they are
 there.

 > Should set "C" exist? As long as it does, a completely algorithmic
 > solution will never be possible, because a developer will still be
 > required to make a decision about the contents of set "C". In other words,
 > the current problem of a developer forgetting to add a symbol to set "B"
 > (the rtsSyms[] array in rts/RtsSymbols.c), is replaced by a new problem of
 > a developer deciding what symbols belong in set "C". My fix would just
 > shift the problem to new location (along with the addition of about 1000
 > lines of new code and documentation in GHC's code base).

 Can you give a few examples of which symbols are in set C? One way to
 reduce set C may be to check the symbols defined in the RTS and restrict
 C to only those symbols that are actually defined in one of the RTS
 libraries. The RTS libraries are ABI compatible (mostly; I think only
 profiling isn't, but I don't remember off the top of my head).
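
 The suggestion above is an intersection: keep a candidate from set C only
 if it also appears in the list of symbols an RTS library defines. A
 minimal sketch, assuming both lists are already available as arrays of
 names; `is_defined` and `restrict_to_defined` are hypothetical helpers,
 and obtaining the list of defined symbols (for example with a tool such as
 `nm`) is outside the sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if name occurs in the list of symbols defined by the library. */
bool is_defined(const char *name,
                const char *const *defined, size_t n_defined)
{
    for (size_t i = 0; i < n_defined; i++) {
        if (strcmp(name, defined[i]) == 0) {
            return true;
        }
    }
    return false;
}

/* Keep only the candidates that the library actually defines.
   out must have room for n_candidates pointers. Returns the number kept;
   the order of the candidates is preserved. */
size_t restrict_to_defined(const char *const *candidates, size_t n_candidates,
                           const char *const *defined, size_t n_defined,
                           const char **out)
{
    size_t kept = 0;
    for (size_t i = 0; i < n_candidates; i++) {
        if (is_defined(candidates[i], defined, n_defined)) {
            out[kept++] = candidates[i];
        }
    }
    return kept;
}
```

 The linear scan is quadratic overall, which should be acceptable for a
 one-off build step over a few thousand names; a sorted list with binary
 search would be the obvious refinement.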

 > The only fix I can think of is to eliminate set "C" by requiring any
 > symbol appearing in set "A" to also appear in set "B". Is this feasible?

 Probably not: aside from the runtime cost it carries, it also increases
 the risk of symbol collisions with user libraries. But again, this really
 depends on what's in C.

 I guess the way forward really depends on what's in C.

-- 
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/12891#comment:11>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler