Cabal and GHC: the big picture

Oleg Grenrus oleg.grenrus at iki.fi
Tue Jul 16 11:35:50 UTC 2024


My first comment, which applies across the whole document, is:

Don't write "package (unit)". Write "unit".

Reserve "package" solely for the sense "A package is the unit of 
distribution and versioning", and use "unit" consistently for 
compilation units, and/or "component" (or more specifically "library" etc.).

The naming of the flags is a historical artifact.

The key observation is that "package is the unit of distribution" is 
nowadays only a Cabal concept. In GHC, only PackageImports and "imprecise" 
flags like "-package" (cf. "-package-id", which ought to be called 
"-unit-id") really know or care about it.

My second comment: be mindful of the difference between `cabal-install` 
and Cabal. The "3 Cabal" section is really "3 cabal-install", and e.g. 
stack does things differently.

 > Suppose version 2.3.7 of package P, called P-2.3.7, depends on package Q.

This is therefore wrong. You should write: "Suppose version 2.3.7 of 
library P, called P-2.3.7, depends on library Q".

Also, libraries can depend on executables, e.g. happy. GHC doesn't care 
about those dependencies, but Cabal (the library, which does the 
building) does.

 > Each unit has a unit-id, looking like

*may* look. The unit identifier is an arbitrary string invented by a build 
tool. It's informative, but its exact shape really doesn't matter much.

 > Q: "installed package" means the same as "unit"

Not exactly.

 > Q: "package id" means the same as "unit-id"

I think so. And I'd argue against using "package id" going forward.

 >  recompiling with no change could change the binary 
(non-determinism). Does that change the unit-id?

It doesn't. The unit-id is invented prior to compilation. Therefore at 
least *interface determinism* is important. That said, cabal-install v2 
*never* re-installs a unit into the store database, so determinism is not 
a hard requirement.
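
To illustrate "invented prior to compilation", here is a toy sketch in 
Haskell. It is not how cabal-install computes unit-ids: the BuildInputs 
type, its fields, and the toy hash are all made up for illustration. The 
only point is that the identifier is a function of the build inputs, and 
the compiled output is never consulted.

```haskell
import Data.Char (ord)
import Data.List (foldl', intercalate, sort)

-- Hypothetical description of what a build tool knows before compiling.
data BuildInputs = BuildInputs
  { biName    :: String
  , biVersion :: String
  , biDeps    :: [String]  -- unit-ids of the dependencies
  , biFlags   :: [String]  -- compiler flags
  }

-- Toy string hash (an assumption for this sketch); a real tool would use
-- a proper cryptographic hash.
toyHash :: String -> Int
toyHash = foldl' (\h c -> (h * 31 + ord c) `mod` 1000003) 7

-- The unit-id depends only on the inputs, so recompiling the same inputs
-- cannot change it, even if the produced binary differs.
inventUnitId :: BuildInputs -> String
inventUnitId bi =
  biName bi ++ "-" ++ biVersion bi ++ "-" ++ show (toyHash key)
  where
    key = intercalate "|"
            (biName bi : biVersion bi : sort (biDeps bi) ++ biFlags bi)
```

Changing the hashing scheme in such a tool would yield a different 
unit-id for identical inputs.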

 > A package database can contain many installed versions of the same 
package P, or even of a particular version of P, say P-2.4.3, compiled 
against different dependencies.

Even against the same dependencies, even with the same flags, if for 
some reason the build tool changes the way it computes the unit-id.

Also s/package/library/. Recall that non-main sublibraries exist.

 > documentation for -package does not clearly specify how the name of 
the package is mapped to a unit-id.

The important bit to remember about "-package" is that it's a legacy 
flag, not used by tools anymore.
-package-id looks up the unit exactly. -package scans to find a matching 
one, and there may be many (in which case, e.g. for several units of the 
same version, the choice is probably non-deterministic).
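
The difference can be sketched with a toy model in Haskell. This is not 
GHC's actual resolution logic; the Unit record, the two lookup functions 
and the example database are made up for illustration.

```haskell
import Data.List (find)

-- Hypothetical, simplified entry of a package database.
data Unit = Unit
  { unitId  :: String  -- opaque identifier chosen by the build tool
  , pkgName :: String
  , pkgVer  :: String
  } deriving (Eq, Show)

-- Models "-package-id": exact match on the unit-id, at most one result.
byUnitId :: String -> [Unit] -> Maybe Unit
byUnitId uid = find ((== uid) . unitId)

-- Models "-package": scan for anything matching the name (or the
-- name-version); several units may match.
byPackage :: String -> [Unit] -> [Unit]
byPackage q =
  filter (\u -> q == pkgName u || q == pkgName u ++ "-" ++ pkgVer u)

-- Two units of the same version of P, e.g. built against different
-- dependencies.
exampleDb :: [Unit]
exampleDb =
  [ Unit "P-2.4.3-abc" "P" "2.4.3"
  , Unit "P-2.4.3-def" "P" "2.4.3"
  , Unit "Q-1.0-xyz"   "Q" "1.0"
  ]
```

In this model, byUnitId "P-2.4.3-def" exampleDb identifies exactly one 
unit, while byPackage "P-2.4.3" exampleDb returns two candidates and 
something has to pick between them.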

 > This .cabal/store is not a package database.

.cabal/store/<ghc> **is** an ordinary package database.

 > Rather, cabal will invoke ghc with a long list of -package-id 
<unit-id> flags

Yes. The two are not mutually exclusive. Package database flags tell GHC 
where to look; `-package-id` flags tell it which units to use.
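
A minimal sketch of how a build tool might assemble such a command line, 
in Haskell. The function name is made up, and real tools pass many more 
flags; only the `-package-db` and `-package-id` flags themselves are 
GHC's.

```haskell
-- Databases say where to look; unit-ids say what to use.
-- ghcUnitArgs is a hypothetical helper, not part of Cabal or GHC.
ghcUnitArgs :: [FilePath] -> [String] -> [String]
ghcUnitArgs dbs unitIds =
  concatMap (\db -> ["-package-db", db]) dbs
    ++ concatMap (\uid -> ["-package-id", uid]) unitIds
```

For example, ghcUnitArgs ["/path/to/package.db"] ["Q-1.0-xyz"] (both 
arguments are placeholders) yields the four arguments -package-db 
/path/to/package.db -package-id Q-1.0-xyz.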

 > Can a package contain multiple public libraries?

Yes. Public/private doesn't matter for GHC though. Cabal enforces the 
dependency visibility, i.e. private/public is a Cabal concept. (The 
visibility is written to interface files, but it's there solely for 
Cabal to figure out what the visibility was. GHC doesn't, or at least 
shouldn't, use that info.)

 > Difference between unit-id and ABI hash?

As far as I remember, the unit-id tries to approximate the ABI hash. In 
fact, there was a request to have GHC output something like an ABI hash 
given the set of flags. Currently Cabal has an ad-hoc implementation to 
filter out flags which should not affect the ABI of a package (like 
`-fprint-explicit-foralls`).

Side note: it would have been clearer if the flag naming convention 
already suggested whether a flag affects the ABI or not. E.g. `-ddump` 
flags, or generally `-d` flags, don't, but `-f` flags do, except e.g. 
`-fprint...`, which is a kind of `-ddump`-like flag.
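
That rule of thumb can be written down as a toy predicate in Haskell. 
This is only the naming heuristic described above, not Cabal's actual 
filter; the function name is made up, and only the `-ddump` prefix is 
encoded (the broader `-d` rule is left out to keep the sketch 
conservative).

```haskell
import Data.List (isPrefixOf)

-- Guess from the flag name alone whether a flag may affect the ABI.
-- Hypothetical helper; anything unrecognised is assumed to matter.
mayAffectAbi :: String -> Bool
mayAffectAbi flag
  | "-fprint" `isPrefixOf` flag = False  -- printing only, "-ddump"-like
  | "-ddump"  `isPrefixOf` flag = False  -- debugging output only
  | otherwise                   = True   -- "-f" flags and everything else

-- Keep only the flags that should feed into a unit-id or ABI hash.
abiRelevantFlags :: [String] -> [String]
abiRelevantFlags = filter mayAffectAbi
```

For instance, abiRelevantFlags ["-O2", "-ddump-simpl", 
"-fprint-explicit-foralls"] keeps only "-O2".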

On 16.7.2024 13.20, Simon Peyton Jones wrote:
> Friends
>
> You may remember a recent thread on ghc-devs about GHC and Cabal 
> <https://mail.haskell.org/pipermail/ghc-devs/2024-July/021678.html>.  
> In it I say how I feel I lack the "big picture" of how GHC and Cabal 
> interact, and that my mental model is probably faulty.
>
> Tom Ellis took pity on me, and together we wrote this big-picture 
> overview about how GHC and Cabal interact 
> <https://docs.google.com/document/d/1mQEpV3fYz1pHi64KTnlv8gifh9ONQ-jytk5sIHqnV9U/edit?usp=sharing>.  
> Would you like to:
>
>   * Read it as a consumer.
>       o Does it tell you stuff that is useful?
>       o What else would you like to know?
>       o What is un-clear or missing?
>   * Read it as an expert.
>       o Is it accurate?
>       o Are any bits misleading?
>       o Do the links go to appropriate places?
>       o What other links or resource would be helpful.
>
> It is not intended as a replacement for the GHC user guide, nor the 
> Cabal user guide; rather it is littered with links to those guides 
> which give much fuller details. Rather, it is intended to put you 
> (well, me for one!) in a position where you can more easily make sense 
> of those documents.
>
> We'd love to have your help in improving it.
>
> Simon