[GHC] #14697: Redundant computation in fingerprintDynFlags when compiling many modules

GHC ghc-devs at haskell.org
Mon Jan 22 13:14:17 UTC 2018


#14697: Redundant computation in fingerprintDynFlags when compiling many modules
-------------------------------------+-------------------------------------
        Reporter:  niteria           |                Owner:  (none)
            Type:  bug               |               Status:  new
        Priority:  normal            |            Milestone:
       Component:  Compiler          |              Version:
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
 Type of failure:  Compile-time      |  Unknown/Multiple
  performance bug                    |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------
Changes (by niteria):

 * cc: simonmar (added)


Old description:

> I profiled a build of a production code base with thousands of modules
> and computing `fingerprintDynFlags` is `7%` of time and `14%` of
> allocations.
>
> Here's a synthetic test case inspired by what I observed:
> {{{
> SIZE=1000
>
> for i in $(seq -w 1 $SIZE); do
>   echo "module A$i where" > A$i.hs
>   echo "data A$i = A$i" >> A$i.hs
> done
> }}}
>
> This generates a 1000 modules each with one datatype.
> Compiling them with:
> {{{
> inplace/bin/ghc-stage2 A*.hs -optP-D__F{1..10000}__
> }}}
> results in `fingerprintDynFlags` being the top cost centre in the
> profile.
> AFAICT there's only one module dependent piece that goes into computing
> `fingerprintDynFlags` and the rest is the same between those 1000
> modules.
>
> Now, why would I have so many preprocessor flags?
> This is how the Buck build system currently works. If a Haskell library
> depends on a C++ library then the GHC invocation gets the C++ library's
> directory as include path (`-optP-I -optP-I some/library/path`). This can
> grow quite big.

New description:

 I profiled a build of a production code base with thousands of modules:
 computing `fingerprintDynFlags` accounts for `7%` of the time and `14%` of
 the allocations.

 Here's a synthetic test case inspired by what I observed:
 {{{
 SIZE=1000

 for i in $(seq -w 1 $SIZE); do
   echo "module A$i where" > A$i.hs
   echo "data A$i = A$i" >> A$i.hs
 done
 }}}

 This generates 1000 modules, each with one datatype.
 Compiling them with:
 {{{
 inplace/bin/ghc-stage2 A*.hs -optP-D__F{1..10000}__
 }}}
 results in `fingerprintDynFlags` being the top cost centre in the profile.
 AFAICT there's only one module-dependent piece that goes into computing
 `fingerprintDynFlags`; the rest is the same across those 1000 modules, so
 it is recomputed redundantly for each of them.
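
 To check how the cost scales with the number of preprocessor flags, the
 flag list can be generated by a small helper instead of a fixed brace
 expansion. This is only a sketch: `gen_flags` is a hypothetical helper
 name, not part of GHC or of the original report, and the timing loop in
 the comments assumes the same `inplace/bin/ghc-stage2` binary as above.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not part of GHC): print N distinct
# preprocessor defines, one per line, in the same shape as the
# -optP-D__F{1..10000}__ expansion used above.
gen_flags() {
  n=$1
  i=1
  while [ "$i" -le "$n" ]; do
    printf '%s\n' "-optP-D__F${i}__"
    i=$((i + 1))
  done
}

# Possible use: rebuild the same generated modules with a growing number
# of flags and compare the timings (or the profiles).
# for n in 0 100 1000 10000; do
#   time inplace/bin/ghc-stage2 -fforce-recomp A*.hs $(gen_flags "$n")
# done
```

 If the time spent in `fingerprintDynFlags` grows with both the number of
 modules and the number of flags, that would be consistent with the
 flag-dependent part being recomputed for every module.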

 Now, why would I have so many preprocessor flags?
 This is how the Buck build system currently works. If a Haskell library
 depends on a C++ library, then the GHC invocation gets the C++ library's
 directory as an include path (`-optP -I -optP some/library/path`). With
 many such dependencies, the flag list can grow quite big.

--

-- 
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/14697#comment:1>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

