Renaming InstalledPackageId

Edward Z. Yang ezyang at mit.edu
Fri Sep 18 03:33:23 UTC 2015


Hello friends,

During discussions with many people about Nix-like Cabal, it has emerged
that InstalledPackageId is a /really/ bad name.  Consider: the commonly
accepted definition of an InstalledPackageId in Nix is that it is
morally a hash of all the inputs to compilation: the source code, the
dependencies of the package, and the build configuration.  However, a
Cabal package can have *multiple* components (e.g. the library, the test
suite, etc), each of which has its own 'build-depends' field. The
concept of the "dependencies of a package" is simply not well-defined!
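
To make the ambiguity concrete, here is a rough Haskell sketch.  The
types and names below (Component, Package, mergedDepends,
libraryDepends, the package "foo") are made up for illustration and are
not Cabal's actual data structures.  It shows two equally plausible
readings of "the dependencies of a package" giving different answers for
the same package:

```haskell
import Data.List (nub, sort)

-- Hypothetical model of a package with several components.
-- These are NOT Cabal's real types.
data Component = Component
  { componentName :: String
  , buildDepends  :: [String]
  } deriving (Show, Eq)

data Package = Package
  { packageName :: String
  , components  :: [Component]
  } deriving (Show, Eq)

-- Reading 1: the union of every component's build-depends.
mergedDepends :: Package -> [String]
mergedDepends = sort . nub . concatMap buildDepends . components

-- Reading 2: only the build-depends of the library component.
libraryDepends :: Package -> [String]
libraryDepends =
  sort . nub . concatMap buildDepends
       . filter ((== "library") . componentName) . components

-- A made-up package whose test suite needs more than its library does.
example :: Package
example = Package "foo"
  [ Component "library"    ["base", "bytestring"]
  , Component "test-suite" ["base", "foo", "tasty"]
  ]

main :: IO ()
main = do
  putStrLn ("merged:  " ++ show (mergedDepends example))
  putStrLn ("library: " ++ show (libraryDepends example))
```

Per component, the answer is unambiguous; per package, it depends on
which reading you pick.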

The "simplification" that Cabal has adopted for a long time is to say
that the installed package ID always refers to the library component of
a package. [1]  But the name InstalledPackageId has caused countless
misunderstandings about how dependency resolution is done, even in Cabal's
code. [2]

I propose that we change the name of InstalledPackageId.  The new
name should have the following properties:

1. It should not say anything about packages, because it is not
well-defined for a package, e.g. it should be something like
"ComponentId".

2. It should not say anything about installation, because it is
well-defined even before a package is built.

3. It should somehow communicate that it is a hash of the transitive
source code (i.e. including dependencies) as well as build parameters.
SPJ likes "SourceHash" because it's evocative of this (though I don't
like it as much); there may also be a Nix-y term like "Derivation" we
can use here.

My proposed new name is "ComponentBuildHash", however I am open to other
suggestions.  I might also be convinced by "InstalledComponentId" (which
runs afoul of (2) but is fairly similar to the old name, and gains points
for familiarity.)  However, I would like to hear your comments: have a
better name? Think this is unnecessary? Please let me know.

Edward

P.S. With Backpack, the ComponentBuildHash won't even be the primary key
into the installed package (to be renamed to a component/unit) database,
because a single ComponentBuildHash can be rebuilt multiple times with
different instantiations of its holes.  So GHC will have some
identifier, which we will probably continue to call the 'UnitKey', which
is the ComponentBuildHash (entirely Cabal generated) plus extra
information about how holes are instantiated (entirely GHC generated).
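
A minimal sketch of that split, assuming string-valued placeholders
throughout: the record fields, the HoleInstantiation type, the hash
"abc123" and the "package:Module" strings are all invented for
illustration and do not reflect GHC's or Cabal's actual representation.

```haskell
-- Hypothetical: the Cabal-generated part of the key.
newtype ComponentBuildHash = ComponentBuildHash String
  deriving (Eq, Ord, Show)

-- Hypothetical: each hole name paired with the module that fills it.
type HoleInstantiation = [(String, String)]

-- Hypothetical: the build hash plus how the holes are instantiated.
data UnitKey = UnitKey
  { unitBuildHash :: ComponentBuildHash  -- entirely Cabal generated
  , unitHoles     :: HoleInstantiation   -- entirely GHC generated
  } deriving (Eq, Ord, Show)

main :: IO ()
main = do
  let h  = ComponentBuildHash "abc123"   -- placeholder hash
      k1 = UnitKey h [("Str", "bytestring:Data.ByteString")]
      k2 = UnitKey h [("Str", "text:Data.Text")]
  -- Same component build hash ...
  print (unitBuildHash k1 == unitBuildHash k2)
  -- ... but two distinct entries in the database.
  print (k1 == k2)
```

This is why the build hash alone cannot be the primary key: both
instantiations share it.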

[1] Except when it doesn't: cabal-install currently merges all the dependencies
of all components that are being built for a package together and treats
that as the sum total dependencies of the package.  This causes problems
when the test suite for text depends on a testing library which in turn
depends on text; see https://github.com/haskell/cabal/issues/960
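
As a rough sketch of why merging hurts here (the graph encoding, the
hasCycle helper and the library name "testlib" are all hypothetical, and
this is not how cabal-install's solver is implemented): the same
dependencies form a cycle when recorded per package, but not when
recorded per component.

```haskell
import Data.Maybe (fromMaybe)

-- A dependency graph as an association list: node -> direct dependencies.
type Graph = [(String, [String])]

successors :: Graph -> String -> [String]
successors g n = fromMaybe [] (lookup n g)

-- Every node reachable from the frontier (the frontier nodes included).
reachable :: Graph -> [String] -> [String] -> [String]
reachable _ seen [] = seen
reachable g seen (n : rest)
  | n `elem` seen = reachable g seen rest
  | otherwise     = reachable g (n : seen) (successors g n ++ rest)

-- A graph has a cycle if some node can reach itself via one or more edges.
hasCycle :: Graph -> Bool
hasCycle g = any (\(n, _) -> n `elem` reachable g [] (successors g n)) g

-- Per package: the test suite's dependencies are merged into "text".
packageGraph :: Graph
packageGraph =
  [ ("text",    ["base", "testlib"])
  , ("testlib", ["base", "text"])
  ]

-- Per component: the test suite is its own node.
componentGraph :: Graph
componentGraph =
  [ ("text:lib",    ["base"])
  , ("text:test",   ["base", "text:lib", "testlib:lib"])
  , ("testlib:lib", ["base", "text:lib"])
  ]

main :: IO ()
main = do
  print (hasCycle packageGraph)
  print (hasCycle componentGraph)
```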

[2] Here are some bugs caused by confusion between package dependencies
and component dependencies:
https://github.com/haskell/cabal/issues/2802 Specify components when configuring, not building
https://github.com/haskell/cabal/issues/2623 `-j` should build package components in parallel
https://github.com/haskell/cabal/issues/1893 Use per-component cabal_macros.h
https://github.com/haskell/cabal/issues/1575 Do dependency resolution on a per component basis
https://github.com/haskell/cabal/issues/1768 The "benchmark" target dependencies conflict with "executable" targets
https://github.com/haskell/cabal/issues/960 Can't build with --enable-tests in presence of circular dependencies


More information about the cabal-devel mailing list