Hadrian Transitive Dependencies

Simon Peyton Jones simonpj at microsoft.com
Wed Mar 27 15:06:25 UTC 2019


The underlying thing I don’t understand is this:


  *   I assume that the shared cloud cache (SCC) maps input files + command line to outputs.



  *   To have a SCC we really have to list all the input files that a compilation step consults – and that may be many more than the direct imports of the module.   If we compile M which imports A, we will read A.hi; but then we may now have to read B.hi and so on.  Let’s say that M has a deep dependency on B.hi.



Listing all the inputs is a soundness issue: if the SCC is simply a cached map from those inputs to outputs, then missing out an input would be fatal (i.e. unsound).  It’s sound to list too many inputs, but doing so will reduce the usefulness of the cache.
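
To make the soundness point concrete, here is a minimal sketch of such a cache as a pure map. It is a toy model, not Shake's implementation: the names (`Key`, `mkKey`, `compileM`, `cachedCompile`) are hypothetical, and file contents stand in for content hashes.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical model: file contents stand in for content hashes.
type Path       = FilePath
type Contents   = String
type FileSystem = Map.Map Path Contents

-- A cache key: the command line plus the contents of every DECLARED input.
data Key = Key
  { keyCommand :: String
  , keyInputs  :: [(Path, Contents)]
  } deriving (Eq, Ord, Show)

type Cache = Map.Map Key Contents

-- Build the key from the declared inputs only.
mkKey :: String -> [Path] -> FileSystem -> Key
mkKey cmd declared fs =
  Key cmd [ (p, Map.findWithDefault "" p fs) | p <- declared ]

-- A toy "compiler" for M: its output really depends on both A.hi and B.hi.
compileM :: FileSystem -> Contents
compileM fs = concatMap (\p -> Map.findWithDefault "" p fs) ["A.hi", "B.hi"]

-- Look up the cache; on a miss, run the compiler and store the result.
cachedCompile :: [Path] -> FileSystem -> Cache -> (Contents, Cache)
cachedCompile declared fs cache =
  case Map.lookup key cache of
    Just out -> (out, cache)
    Nothing  -> let out = compileM fs in (out, Map.insert key out cache)
  where
    key = mkKey "ghc -c M.hs" declared fs
```

If only `A.hi` is declared, a change to `B.hi` leaves the key unchanged and the cache returns the old output; declaring both files gives a different key and forces a fresh compile.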


  *   In contrast, to get a correct Hadrian build, it suffices to list all the direct imports (= shallow dependencies) of the thing being compiled.  We’ll bring those up to date and, by implication, all the things it depends on will now also be up to date.

Listing only the direct imports is much less onerous; that’s what ghc -M does.


  *   Early cutoff has something in common with the SCC.   E.g. if we compile A and produce an identical A.hi, we still need to recompile M because B.hi has changed.   GHC already accommodates this by putting B.hi’s fingerprint in A.hi, so if B.hi changes then so will A.hi.

So maybe we don’t need to record those transitive dependencies in the SCC?
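
A minimal sketch of that fingerprint argument, under the assumption that an interface's fingerprint covers the fingerprints of the interfaces it depends on. The types and names here are hypothetical and do not reflect GHC's actual interface format; the rendered text stands in for a real hash.

```haskell
-- Hypothetical model of interface fingerprints. The rendered text stands in
-- for a real hash, so distinct contents always give distinct fingerprints.
type Fingerprint = String

data Interface = Interface
  { ifaceExports :: String        -- what the module itself exports
  , ifaceDeps    :: [Fingerprint] -- fingerprints of interfaces it depends on
  }

-- The fingerprint covers the module's own exports AND its dependencies'
-- fingerprints, so a change deeper down propagates upwards.
fingerprint :: Interface -> Fingerprint
fingerprint iface = show (ifaceExports iface, ifaceDeps iface)

-- A.hi for an unchanged module A that depends on some version of B.hi.
interfaceA :: Interface -> Interface
interfaceA b = Interface "g :: Int -> Int" [fingerprint b]
```

In this model, changing B.hi changes the fingerprint of A.hi even though A's own exports are identical, so a cache keyed only on A.hi would still notice the change. Whether that holds in practice depends on A.hi's fingerprint really covering everything M reads from B.hi.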

It would be extremely onerous to write Hadrian code to make all deep dependencies explicit.  Do we really need to?

Simon

From: ghc-devs <ghc-devs-bounces at haskell.org> On Behalf Of David Eichmann
Sent: 27 March 2019 11:54
To: Neil Mitchell <ndmitchell at gmail.com>; Andrey Mokhov <andrey.mokhov at newcastle.ac.uk>; GHC developers <ghc-devs at haskell.org>
Subject: Hadrian Transitive Dependencies


Hello Shake/Hadrian contributors and the like,

Recently I've been putting Hadrian's fsatrace linting feature to good use, tracking down missing dependencies in Hadrian. Ultimately, we want to use Shake's cloud build / shared cache feature and ensure it works across CI builds. Unfortunately the feature isn't working smoothly with Hadrian: see #16295<https://gitlab.haskell.org/ghc/ghc/issues/16295>. Getting it to work is very desirable, as it would improve CI build times. It is my understanding that in order to get caching to work:
1. All accessed files must be declared with `need` AND
2. All created files must be declared with `produces` (or be the target of the build rule)

Is my understanding correct? Or is there a weaker condition (perhaps only 2 is necessary)?
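
For illustration, a rule satisfying both conditions might look roughly like the sketch below. This is not Hadrian's actual rule: the `_build` layout, the share directory, and the single-source compile are assumptions made for the example. It uses `need`, `produces`, `shakeShare`, and `shakeLint` from `Development.Shake`.

```haskell
import Development.Shake
import Development.Shake.FilePath

main :: IO ()
main = shakeArgs opts $ do
  want ["_build/Main.o"]

  "_build//*.o" %> \out -> do
    -- Hypothetical layout: _build/Foo.o is compiled from Foo.hs.
    let src = dropDirectory1 out -<.> "hs"
        hi  = out -<.> "hi"
    -- Condition 1: declare every file the command reads.
    need [src]
    cmd_ "ghc" ["-c", src, "-o", out, "-ohi", hi]
    -- Condition 2: declare every file written besides the rule's target.
    produces [hi]
  where
    opts = shakeOptions
      { shakeShare = Just "/tmp/shake-share" -- assumed shared cache directory
      , shakeLint  = Just LintFSATrace       -- report undeclared file accesses
      }
```

With only `need [src]`, any interface files read by `ghc` would show up as lint errors of exactly the kind described below.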

If I'm correct, this amounts to fixing all fsatrace lint errors. See here<https://gitlab.haskell.org/ghc/ghc/issues/16400#note_188901> for a breakdown of lint errors / missing dependencies. A large portion of these are Haskell interface files (i.e. *.hi files). Before building a Haskell object file, dependencies are discovered via `ghc` using the `-M -include-pkg-deps` options. Unfortunately, Shake's fsatrace linting complains about other *.hi files being accessed! For example when building `stage1/libraries/mtl/build/Control/Monad/RWS/Class.o` we get the following dependencies from ghc:

_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : libraries/mtl/Control/Monad/RWS/Class.hs

_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/Prelude.hi

_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/Data/Monoid.hi

_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/RWS/Strict.hi

_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/RWS/Lazy.hi

_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Identity.hi

_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Maybe.hi

_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Except.hi

_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Error.hi

_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/lib/../lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Class.hi

_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/libraries/mtl/build/Control/Monad/Writer/Class.hi

_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/libraries/mtl/build/Control/Monad/State/Class.hi

_build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o : _build/stage1/libraries/mtl/build/Control/Monad/Reader/Class.hi

And shake complains of the following missing deps:

_build/stage0/bin/ghc -Wall -hisuf hi -osuf o -hcsuf hc -static -hide-all-packages -no-user-package-db '-package-db _build/stage1/lib/package.conf.d' '-this-unit-id mtl-2.2.2' '-package-id base-4.13.0.0' '-package-id transformers-0.5.5.0' -i -i_build/stage1/libraries/mtl/build -i_build/stage1/libraries/mtl/build/autogen -ilibraries/mtl/. -Iincludes -I_build/generated -I_build/stage1/libraries/mtl/build -I/home/david/ghc/_build/stage1/lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/include -I/home/david/ghc/_build/stage1/lib/x86_64-linux-ghc-8.9.20190325/integer-gmp-1.0.2.0/include -I/home/david/ghc/_build/stage1/lib/x86_64-linux-ghc-8.9.20190325/rts-1.0/include -I_build/generated -optc-I_build/generated -optP-include -optP_build/stage1/libraries/mtl/build/autogen/cabal_macros.h -outputdir _build/stage1/libraries/mtl/build -Wnoncanonical-monad-instances -optc-Werror=unused-but-set-variable -optc-Wno-error=inline -c libraries/mtl/Control/Monad/RWS/Class.hs -o _build/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o -O2 -H32m -Wall -fno-warn-unused-imports -fno-warn-warnings-deprecations -Wcompat -Wnoncanonical-monad-instances -Wnoncanonical-monadfail-instances -XHaskell2010 -XSafe -ghcversion-file=/home/david/MEGA/File_Dump/Well-Typed/GHC/_nosync_git/ghc/_build/generated/ghcversion.h -Wno-deprecated-flags

Lint checking error - _build/HEAD_default/stage1/libraries/mtl/build/Control/Monad/RWS/Class.o - 22 values were used but not depended upon:

  Used:  _build/HEAD_default/stage0/lib/settings

  Used:  _build/HEAD_default/stage0/lib/platformConstants

  Used:  _build/HEAD_default/stage0/lib/llvm-targets

  Used:  _build/HEAD_default/stage0/lib/llvm-passes

  Used:  _build/HEAD_default/stage0/lib/package.conf.d/package.cache

  Used:  _build/HEAD_default/stage1/lib/package.conf.d/package.cache

  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/GHC/Float.hi

  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/GHC/Base.hi

  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/ghc-prim-0.5.3/GHC/Types.hi

  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/GHC/Maybe.hi

  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Writer/Lazy.hi

  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Writer/Strict.hi

  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/State/Lazy.hi

  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/State/Strict.hi

  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Reader.hi

  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/List.hi

  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/transformers-0.5.5.0/Control/Monad/Trans/Cont.hi

  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/ghc-prim-0.5.3/GHC/Tuple.hi

  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/GHC/IO/Exception.hi

  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/integer-gmp-1.0.2.0/GHC/Integer/Type.hi

  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/Data/Either.hi

  Used:  _build/HEAD_default/stage1/lib/x86_64-linux-ghc-8.9.20190325/base-4.13.0.0/GHC/Natural.hi

GHC is reading *.hi files that are not reported as dependencies by `ghc -M -include-pkg-deps`. This is because they are not direct, but transitive dependencies! How do we fix these lint errors (again with the goal of using Shake's shared cache feature)? Some ideas:
* Wildly over-approximate dependencies. This may be easier to implement but may cause unneeded recompilation (when a false dependency changes). Either:
    * `need` all dependent packages' interface files recursively as well as transitive dependencies reported by `ghc -M -include-pkg-deps` within the current package, OR
    * `need` all transitive dependencies reported by `ghc -M -include-pkg-deps`. This will likely result in fewer dependencies but requires a bit more work in recovering dependent packages' dependency graphs.
* Perhaps transitive dependencies are not important for shared caching to work. Change Shake's linting feature to allow (untracked?) transitive dependencies to be accessed.
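
As a rough sketch of the second over-approximation, the transitive set can be recovered from the direct-import graph by a simple traversal. The graph representation and the names `ImportGraph` and `transitiveDeps` are hypothetical; in practice the graph would have to be assembled from the `ghc -M` output of each package involved.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

type Module      = String
-- Direct imports only, as reported per module (hypothetical representation).
type ImportGraph = Map.Map Module [Module]

-- All modules reachable from the root through one or more imports.
transitiveDeps :: ImportGraph -> Module -> Set.Set Module
transitiveDeps graph root = go Set.empty (directDeps root)
  where
    directDeps m = Map.findWithDefault [] m graph

    go seen [] = seen
    go seen (m : rest)
      | m `Set.member` seen = go seen rest -- already visited
      | otherwise           = go (Set.insert m seen) (directDeps m ++ rest)
```

A rule could then `need` the interface file of every module in the resulting set, at the cost of depending on interfaces that GHC may never actually read.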

Feedback would be greatly appreciated.

David Eichmann



--

David Eichmann, Haskell Consultant

Well-Typed LLP, http://www.well-typed.com



Registered in England & Wales, OC335890

118 Wymering Mansions, Wymering Road, London W9 2NF, England