[GHC] #11293: Compiler plugins don't work with profiling

Tue Nov 29 07:23:10 UTC 2016

#11293: Compiler plugins don't work with profiling
-------------------------------------+-------------------------------------
        Reporter:  ezyang            |                Owner:
            Type:  bug               |               Status:  new
        Priority:  normal            |            Milestone:
       Component:  Profiling         |              Version:  7.11
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  None/Unknown      |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by ezyang):

 I suggest just tackling the profiling problem as a way of avoiding the
 "biting off more than you can chew" problem.  While it's reasonable to try
 to anticipate things the cross-compiling version will need, I also suspect
 that you'll have to rewrite a lot of code anyway in the cross-compiling
 case. So if you anticipate too much, you'll find that you tied yourself up
 in knots for code that you didn't actually need. This is open source; as
 long as you're persistent, we can take the time to rewrite several times
 ;)

 Let's talk about the question of separate or together package databases.
 Common sense seems to dictate two contradictory positions:

 1. Profiling versions of a library "ought" to be in the same package
 database as the vanilla versions (or, at least, that's how we do it today)
 2. Different cross-compiler targets "ought" to be in different package
 databases

 The crux of the matter is what having separate package databases buys you:
 you can have multiple copies of the package with the *same* IPID, but
 compiled differently. Sometimes this occurs in the single package database
 today (think about the profiling and non-profiling versions of a library),
 but these have to be awkwardly squeezed into the same package database
 entry. "Fortunately", the package database entry looks no different
 whether or not profiling libraries are present, so distros that want to
 bundle profiling libraries separately from the real ones just drop in some
 appropriately named p_hi and _p.a files. But it is also a pain for
 cabal-install Nix-style local builds, which don't really want to be
 retroactively stuffing profiling libraries into preexisting installations.
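
 To make the "squeezed into the same entry" point concrete, here is a toy
 model of a package database keyed by IPID. It is only a sketch: the types
 `IPID`, `Way` and `PackageDb`, the function names, and the IPID string used
 in testing are all made up for illustration and are not GHC's or Cabal's
 real data structures.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical stand-ins; these are not GHC's or Cabal's real types.
newtype IPID = IPID String deriving (Eq, Ord, Show)

data Way = Vanilla | Profiling deriving (Eq, Show)

-- One entry per IPID; the entry only records which ways were installed.
newtype PackageDb = PackageDb (Map.Map IPID [Way]) deriving (Eq, Show)

emptyDb :: PackageDb
emptyDb = PackageDb Map.empty

-- Registering a second build under an existing IPID cannot create a second
-- entry; it can only be folded into the entry that is already there.
register :: IPID -> Way -> PackageDb -> PackageDb
register ipid way (PackageDb m) =
  PackageDb (Map.insertWith merge ipid [way] m)
  where
    -- insertWith passes the new value first and the old value second.
    merge new old = old ++ filter (`notElem` old) new

entryCount :: PackageDb -> Int
entryCount (PackageDb m) = Map.size m

waysOf :: IPID -> PackageDb -> [Way]
waysOf ipid (PackageDb m) = Map.findWithDefault [] ipid m
```

 Registering the vanilla and then the profiling build of one IPID leaves a
 single entry carrying both ways, which is the retroactive stuffing that
 Nix-style builds would rather avoid.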

 So, *why* would you want to ensure that the profiling and non-profiling
 versions of a library get the same symbol names (via the IPID)? In a world
 where Template Haskell obeys strict staging, I think the answer is, never.
 But when we blur the lines between compile time and run time (as is
 encouraged by our current TH model, where there is no difference between a
 compile time and a runtime dependency), there are situations where you
 have a Name from the compile-time world, and you want to lift it (well,
 the type class is called that, but IMO it goes the wrong way!) to the
 run-time world. And now the outrageous
 coincidence of compile-time and run-time entities (e.g., profiling and
 non-profiling) being named the same way is very useful, because you can
 just slide it from one realm to the next and still have it mean something.
 (Of course, if a preprocessor macro caused a Name to stop existing, then
 you did something bad and GHC will probably panic. So this kind of punning
 is not very safe--empirically, it seems to have been OK for profiling
 because no one ever CPPs for profiling.)

 So, if you are going to insist on a strict staging separation (I think,
 from GHC's perspective, everything can be implemented correctly this way;
 it's just a matter of user code not using NameG and Typeable in naughty
 ways), then I don't see any reason to have multiple package databases. If
 I build a profiling version of a library, it should get its own entry in the
 package database, with a different IPID than its vanilla version (even if
 it's built with the same things), and its own set of dependencies (on the
 profiling versions of its deps.) And if the IPIDs are different, they can
 totally coexist in the same package database. Now you can eagerly error if
 a profiling library is not present (rather than wait for the attempt to
 load p_hi to fail). You can even eliminate the p_ prefix for profiling
 objects/interface files.
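
 The alternative can be sketched the same way: every way gets its own IPID
 and its own dependency list, so one database holds both, and a missing
 profiling dependency is reported up front. Everything here is an
 assumption for illustration: the `-prof` suffix, the `Entry` type, the
 `missing` function and the package names in `exampleDb` are invented and
 are not how GHC or Cabal actually name things.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical model; IPIDs are plain strings here.
type IPID = String

newtype Entry = Entry { entryDeps :: [IPID] } deriving (Eq, Show)

type PackageDb = Map.Map IPID Entry

-- Made-up convention: the profiling build's IPID is the vanilla one
-- with a "-prof" suffix.
profIPID :: IPID -> IPID
profIPID = (++ "-prof")

-- Walk the dependency closure of a root package and return every IPID
-- that has no entry in the database, in the order first encountered.
missing :: PackageDb -> IPID -> [IPID]
missing db root = go [root] [] []
  where
    go [] _ acc = reverse acc
    go (p : ps) seen acc
      | p `elem` seen = go ps seen acc
      | otherwise =
          case Map.lookup p db of
            Nothing -> go ps (p : seen) (p : acc)
            Just e  -> go (entryDeps e ++ ps) (p : seen) acc

-- Invented example: text was never built for profiling.
exampleDb :: PackageDb
exampleDb = Map.fromList
  [ ("base-4.9",      Entry [])
  , ("base-4.9-prof", Entry [])
  , ("text-1.2",      Entry ["base-4.9"])
  , ("app-0.1",       Entry ["text-1.2"])
  , ("app-0.1-prof",  Entry ["text-1.2-prof"])
  ]
```

 With this shape, `missing exampleDb (profIPID "app-0.1")` names the absent
 profiling build of text before any interface file is opened, instead of
 failing later when a p_hi file cannot be found.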

 Some "soft" advice on this patchset. I think it is wise to reduce churn
 when making a big patch like this. It makes the patch more likely to
 succeed. So I think there are two reasonable things you can do with this
 analysis:

 1. You can start by refactoring profiling so that it lives in a separate
 database, eliminating p_hi files. This is a lot of churn: Cabal, build
 systems, distro packaging, etc. will all have to update--hopefully for the
 better. While you are doing this, you will run into the blocking problem
 of handling plugins and TH, since they are trying to load code from the
 profiling database which will no longer exist.  So then you'll solve this
 ticket, by adding a second EPS/HPT, etc, as described above.

 2. OR, you could just solve the blocking problem first, introducing a new
 EPS/HPT, but assuming everything is in the same database, and then solve
 that problem later (or not at all), perhaps when you need cross compiling
 to work.

 P.S. You wonder if you will need two databases. Let me flip around your
 question: rather than databases, does it ever make sense to have more
 than two EPSes in GHC? The only such case is GHC compiling code for
 two targets *at the same time.* I can't imagine this would ever be relevant
 for cross-compilation, but rather mysteriously GHC does try to generate
 dynamic and non-dynamic objects at the same time using `-dynamic-too`.
 Still, we get along just fine with one EPS even in this case, so I think
 two is enough. Obviously you can maintain more databases on disk, but GHC
 will only ever look at two at a time.
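
 For what "a second EPS" might look like in the abstract, here is a
 deliberately tiny sketch: one cache of loaded interfaces per stage, chosen
 by whether the code is needed at compile time (plugins, TH) or at run time.
 The names `Stage`, `Session`, `hostEPS`, `targetEPS`, `loadIface` and
 `lookupIface` are hypothetical, and the real EPS holds far more than a
 module-to-flavour map.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical, heavily simplified stand-ins for GHC's real session state.
data Stage = CompileTime | RunTime deriving (Eq, Show)

-- An "EPS" here is just a cache from module name to the flavour
-- (e.g. "vanilla" or "profiling") its interface was loaded as.
type EPS = Map.Map String String

data Session = Session
  { hostEPS   :: EPS  -- interfaces for code GHC itself runs (plugins, TH)
  , targetEPS :: EPS  -- interfaces for the code being compiled
  } deriving (Eq, Show)

emptySession :: Session
emptySession = Session Map.empty Map.empty

-- Record a loaded interface in the cache belonging to the given stage.
loadIface :: Stage -> String -> String -> Session -> Session
loadIface CompileTime m flavour s =
  s { hostEPS = Map.insert m flavour (hostEPS s) }
loadIface RunTime m flavour s =
  s { targetEPS = Map.insert m flavour (targetEPS s) }

-- Look a module up in the cache for the given stage only.
lookupIface :: Stage -> String -> Session -> Maybe String
lookupIface CompileTime m = Map.lookup m . hostEPS
lookupIface RunTime m     = Map.lookup m . targetEPS
```

 The same module can then be present in both caches in different flavours
 without either lookup seeing the other's copy, and nothing in this shape
 calls for a third cache.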

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/11293#comment:3>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler