Suggestion for resolving the Cabal/GHC dependency problems
Simon Marlow
marlowsd at gmail.com
Wed Sep 18 18:38:58 UTC 2013
On 11/09/13 17:28, Duncan Coutts wrote:
> All,
>
> I was discussing this with Yuri earlier and I had an idea that I think
> may resolve our problems.
>
> Firstly, what are the problems:
>
>   1. ghc devs and users grumble because the ghc library depends on
>      Cabal, making it hard to use the ghc lib with a later Cabal.
>   2. ghc devs grumble generally that Cabal seems quite big but they
>      only need small parts of it
>   3. Cabal devs complain that they cannot add useful dependencies
>      (like a parser with error messages) because ghc depends on
>      Cabal.
>
> Secondly, let us recall why it is that ghc does use Cabal, and where:
>
>   1. it's used by ghc-pkg to read/write the external representation
>      of installed package files (external rep is defined by Cabal
>      spec, and implemented in the Cabal lib)
>   2. it's used by ghc to read the ghc package database files/dirs.
>      These databases use the same external representation, and ghc &
>      ghc-pkg use the InstalledPackageInfo type internally
>      (InstalledPackageInfo is defined in the Cabal lib).
>   3. it's used by the ghc build system to help with building all the
>      libraries that ship with ghc. I believe that this part uses more
>      of the build system part of Cabal, not just the types and
>      external formats.
>   4. ghc comes with Cabal pre-installed so that users can run
>      Setup.hs scripts to install other packages. This was part of the
>      original Cabal design: that all compilers would use the
>      installed package info format defined by Cabal, and all
>      compilers would ship Cabal to users so the Setup.hs mechanism
>      will work.
>
> Now, as far as I know, nobody is suggesting that ghc stop shipping
> Cabal, nor that it stop using it as part of the build system.
>
> The problems all centre around use number 2, where the ghc library
> package depends on Cabal. Number 1 isn't really a problem because
> ghc-pkg is an executable.
>
> So my suggestion is quite simple, eliminate the dependency in case 2
> above, but keep it in the other three cases. Specifically:
>
>   * ghc will use a new internal type to represent info coming from
>     the ghc-pkg databases, i.e. not InstalledPackageInfo. This can be
>     smaller as ghc doesn't care about the metadata.
>   * The InstalledPackageInfo type, and ghc's current need to read
>     its external representation, is the main reason the ghc lib
>     depends on Cabal. Other dependencies should be minor and easy to
>     remove.
>   * ghc and ghc-pkg will agree on a new on-disk representation of
>     the installed package info.
>   * ghc-pkg will continue to depend on Cabal, and it will continue to
>     use the types and parsers defined by Cabal to read/write the
>     InstalledPackageInfo. It will translate from
>     InstalledPackageInfo into the on-disk representation that ghc &
>     ghc-pkg share.
>
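
[Editorial sketch, not part of the original message.] The translation step
in the last bullet above might look roughly like the following. Every name
here is hypothetical: FullPackageInfo is only a stand-in for Cabal's
InstalledPackageInfo (the real type has many more fields and richer field
types), and GhcPackageEntry is an assumed shape for the smaller record that
ghc would consume.

```haskell
-- Hypothetical stand-in for Cabal's InstalledPackageInfo. The real type
-- has many more fields and richer field types; this is NOT Cabal's API.
data FullPackageInfo = FullPackageInfo
  { fullName           :: String
  , fullVersion        :: [Int]
  , fullMaintainer     :: String     -- metadata ghc would not need
  , fullDescription    :: String     -- metadata ghc would not need
  , fullExposedModules :: [String]
  , fullImportDirs     :: [FilePath]
  , fullLibraryDirs    :: [FilePath]
  , fullDepends        :: [String]
  } deriving (Show, Eq)

-- Hypothetical smaller record for ghc's internal use.
data GhcPackageEntry = GhcPackageEntry
  { entryName           :: String
  , entryVersion        :: [Int]
  , entryExposedModules :: [String]
  , entryImportDirs     :: [FilePath]
  , entryLibraryDirs    :: [FilePath]
  , entryDepends        :: [String]
  } deriving (Show, Eq)

-- The translation ghc-pkg would perform before writing the shared
-- on-disk form: keep what the compiler needs, drop the metadata.
toGhcEntry :: FullPackageInfo -> GhcPackageEntry
toGhcEntry p = GhcPackageEntry
  { entryName           = fullName p
  , entryVersion        = fullVersion p
  , entryExposedModules = fullExposedModules p
  , entryImportDirs     = fullImportDirs p
  , entryLibraryDirs    = fullLibraryDirs p
  , entryDepends        = fullDepends p
  }
```

Under this arrangement only the code on the ghc-pkg side would ever mention
the full type, which is what would let the ghc library drop its Cabal
dependency.
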
> So what might the on-disk representation for the ghc-pkg databases look
> like? Currently they use the external format of InstalledPackageInfo
> because this is convenient using Cabal.
>
> One simple option is just to store both formats for all packages.
> Another option would be that ghc never reads package dbs where the cache
> is out of date. Then it only ever reads the cache and never has to look
> at the other files. In principle the cache should never be out of date:
> there are two options for updating the db: calling ghc-pkg, or placing
> the file in the db directly and calling ghc-pkg recache (distros often use the
> latter as it is simpler for them). In either case the db cache will be
> up to date. (In fact calling it a cache is not really correct.)

GHC currently always reads the binary cache, even if it is out of date
(I just checked). However, it still also supports the legacy format of
package databases using the Read instance of InstalledPackageInfo. I'm
not sure whether this is still used at all.

We certainly could make another type similar to InstalledPackageInfo,
derive Binary for it, and use that as the package database format. I
think you're right that it's probably easier to do this than to split
out InstalledPackageInfo from Cabal. We would need to make a small
package for this that would be shared by ghc-pkg and GHC.

Cheers,
Simon

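[Editorial sketch, not part of the original message.] A small shared type
with a Binary instance, as suggested in the reply above, could be sketched
as follows. The DbEntry record and its fields are hypothetical, and the
instance is obtained through GHC.Generics with the binary package, which is
one assumed way of "deriving" it rather than the approach the thread
settled on.

```haskell
{-# LANGUAGE DeriveGeneric #-}

import Data.Binary (Binary, decode, encode)
import qualified Data.ByteString.Lazy as BL
import GHC.Generics (Generic)

-- Hypothetical record shared by ghc-pkg (writer) and ghc (reader).
data DbEntry = DbEntry
  { dbName           :: String
  , dbVersion        :: [Int]
  , dbExposedModules :: [String]
  , dbDepends        :: [String]
  } deriving (Show, Eq, Generic)

-- Generic-derived instance: no hand-written put/get.
instance Binary DbEntry

-- What ghc-pkg would write as the package database.
serialiseDb :: [DbEntry] -> BL.ByteString
serialiseDb = encode

-- What ghc would read back. Note that decode calls error on
-- malformed input.
deserialiseDb :: BL.ByteString -> [DbEntry]
deserialiseDb = decode
```

A reader that has to cope with a corrupt or truncated database would
probably prefer decodeOrFail from the same module, which returns an Either
instead of throwing.
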
> So this is a better solution than the one previously proposed to split
> out some small part of Cabal, because in this proposal, ghc doesn't
> depend on Cabal at all, not even some smaller common lib.
>
> It's also better from the point of view of the Cabal folks because it
> does not involve splitting Cabal in unnatural ways. The Cabal folks do
> want to split the Cabal lib, but not in a way that is especially helpful
> to ghc. This suggestion is orthogonal to any Cabal lib splits.
>
> Further, if only ghc-pkg and the ghc build system depend on Cabal, then
> it is easier for Cabal to add more dependencies, since they do not have
> to be installed with ghc (due to the ghc lib depending on them). In
> particular the Cabal folks would like to use a proper parser and have
> suggested adding dependencies on parsec, mtl and transformers. If only
> ghc-pkg depends on Cabal, then these dependencies only need to be used
> at build time, and do not have to be installed (which also means they
> don't have to be kept quite so up to date).
>
>
> Note that this would not address SPJ's complaint that the start of
> building ghc involves building 60+ modules of Cabal. The ghc-cabal tool
> still uses Cabal and I am not suggesting changing that now. It's
> plausible that when the Cabal lib is split the ghc-cabal tool could
> depend on just the smaller of the two (someone would need to look at how
> much functionality from the "Simple" build system it uses). I don't see
> that this is a big priority however.
>
> Duncan