Cabal and simultaneous installations of the same package

Duncan Coutts duncan.coutts at googlemail.com
Mon Mar 23 22:07:32 UTC 2015


On Mon, 2015-03-23 at 08:45 +0000, Simon Peyton Jones wrote:
> Dear Cabal developers
> 
> You'll probably have seen the thread about the Haskell Platform.
> 
> Among other things, this point arose:
> 
> |  Another thing we should fix is the (now false) impression that HP gets in
> |  the way of installing other packages and versions due to cabal hell.
> 
> People mean different things by "cabal hell", but the inability to 
> 	simultaneously install multiple versions of the same package,
> 	compiled against different dependencies
> is certainly one of them, and I think it is the one that Yitzchak is referring to here.
> 
> But for some time now GHC has allowed multiple versions of the same
> package (compiled against different dependencies) to be installed
> simultaneously.

That's technically true of existing ghc versions, though ghc-pkg does
not directly allow registering multiple instances of the same version of
a package.

As of 7.10 we can actually use ghc-pkg to register multiple instances,
using:

    ghc-pkg register --enable-multi-instance

Also since 7.10 we can have "environment" files that say what packages
to expose. There can be a per-user default one, or per-user named ones
(which can be used via the ghc command line or via env var), or local
environments in a directory.

While Cabal has always built things with consistent deps and solved the
problem of "which ByteString do I mean", the same has not been true for
GHCi. With this new environment mechanism Cabal can create the
environments, and when the user runs ghc/ghci they get the set of
packages previously set up by Cabal.

Elsewhere in this thread you say:

> What you want is for the confusing behaviour to be true of GHCi too.
> Well that’s simple enough: ensure that the set of exposed packages (ie
> the ones you say ‘import M’ for), is consistent in the same way.  The
> point is that I may need to install a bunch of packages to build a
> program.  If I’m using Cabal, none of those newly installed packages
> need be exposed; I simply need them there so I can compile my program
> (using Cabal).   But at the moment I can’t do that.

Yes, that is exactly what these environment files are intended for.

Edsko and I implemented these features (for the IHG) to enable
future Cabal versions to take the multi-instance approach fully.

> So all we need to do is to fix Cabal to allow it too, and thereby kill
> off a huge class of cabal-hell problems at one blow.

With these features now implemented in GHC we're in a position to turn
attention to Cabal to make use of them.

> But time has passed and it hasn't happened. Is this because I'm
> misunderstanding?  Or because it is harder than I think?  Or because
> there are much bigger problems?  Or because there is insufficient
> effort available?  Or what?

There are a number of parts remaining to do.

> Unless I'm way off beam, this "multiple installations of the same
> package" thing has been a huge pain forever, and the solution is
> within our grasp.  What's stopping us grasping it?

Yes, we're closer than ever. I covered more of the details in this blog
post:

http://www.well-typed.com/blog/preview/how-we-might-abolish-cabal-hell-2/

So some of the remaining parts:

Cabal needs to assign proper installed package ids, like nix does. These
should be based on the hash of all inputs to a package build. In
particular that means a hash of the content of the sources. This is easy
for tarballs but a bit harder to do accurately for unpacked build trees.
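
As a rough illustration of the hashing idea (a sketch only, not
Cabal's actual scheme): the function below derives an id from the
package name, version, compiler, source contents, dependency ids and
flags. The function name, the choice of inputs, the SHA-256 hash and
the id layout are all assumptions made for the example.

```python
import hashlib


def installed_package_id(name, version, sources, dep_ids, flags, compiler):
    """Derive an id from every input to the build (hypothetical scheme).

    sources maps relative file paths to file contents as bytes;
    dep_ids are the installed package ids of the direct dependencies.
    """
    h = hashlib.sha256()

    def feed(label, value):
        # Length-prefix each field so adjacent fields cannot run together.
        data = value if isinstance(value, bytes) else value.encode("utf-8")
        h.update(f"{label}:{len(data)}:".encode("utf-8"))
        h.update(data)

    feed("name", name)
    feed("version", version)
    feed("compiler", compiler)
    # Sort everything so the id does not depend on listing order.
    for path in sorted(sources):
        feed("path", path)
        feed("content", sources[path])
    for dep in sorted(dep_ids):
        feed("dep", dep)
    for flag in sorted(flags):
        feed("flag", flag)
    # Truncating the digest is a readability choice, not a requirement.
    return f"{name}-{version}-{h.hexdigest()[:16]}"
```

Two builds of the same version against different dependency instances
get different ids, which is what lets them sit side by side in the
package store. For an unpacked build tree the hard part is deciding
what goes into `sources`, which this sketch does not address.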

There are some new UI issues to deal with. The user interface will have
to make multiple consistent environments explicit. We need to know what
env we are in, what is in it, what other envs are available, and be able
to switch between them.

Suppose that we enforce that each environment be fully consistent. Then
when I "cabal install Q" and it cannot find a solution to install
everything in the current environment plus the one extra package such
that they all have consistent deps, then what should it do? Suppose that
Q really could be installed on its own, but cannot be installed
consistently with the other things in this env. Should it suggest that
you make a new environment? There are some details to work out here so
that we don't make a confusing UI.
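
To make the consistency rule concrete, here is a small sketch. It
treats an environment as consistent when no package name appears as
more than one instance in the transitive closure of its roots. It only
checks one candidate instance of Q, whereas a real solver would search
over candidates. All names and the data layout are hypothetical.

```python
def closure(roots, deps):
    """All instance ids reachable from roots via the deps mapping."""
    seen = set()
    stack = list(roots)
    while stack:
        iid = stack.pop()
        if iid not in seen:
            seen.add(iid)
            stack.extend(deps[iid])
    return seen


def conflicts(roots, deps, name_of):
    """Package names that occur as more than one instance in the closure."""
    by_name = {}
    for iid in closure(roots, deps):
        by_name.setdefault(name_of[iid], set()).add(iid)
    return {name: ids for name, ids in by_name.items() if len(ids) > 1}


def try_install(env_roots, new_root, deps, name_of):
    """Return ('install', roots) or ('suggest-new-env', clashing names)."""
    roots = set(env_roots) | {new_root}
    clash = conflicts(roots, deps, name_of)
    if clash:
        return ("suggest-new-env", sorted(clash))
    return ("install", sorted(roots))


# Hypothetical store: P and Q need different bytestring instances.
deps = {"P-1": ["bytestring-a"], "Q-1": ["bytestring-b"],
        "bytestring-a": [], "bytestring-b": []}
name_of = {"P-1": "P", "Q-1": "Q",
           "bytestring-a": "bytestring", "bytestring-b": "bytestring"}

print(try_install(["P-1"], "Q-1", deps, name_of))
# -> ('suggest-new-env', ['bytestring'])
print(try_install([], "Q-1", deps, name_of))
# -> ('install', ['Q-1'])
```

Q installs fine on its own but clashes with P's bytestring instance,
which is exactly the case where the UI has to decide what to tell the
user.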

There's also an unresolved issue about when we try to reuse existing
installed dependencies. One approach is to say that we make install
plans without considering what is already available in the package
store, and then only reuse existing ones if the installed package ids
happen to match up. The other approach is to say that the solver should
actively try to reuse installed instances when it can. The latter is
what we do now, to try and reduce the number of reinstalls. But when
there are dozens of versions available this is harder: we need to know
more information to know if it is safe to reuse an existing instance.
(There are examples where it's clearly not safe to reuse packages.) Or a
pragmatic approach might be to try and reuse existing installed
instances within the current environment but not actively try to reuse
other instances available in the package store.
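
Two of these options are simple enough to sketch. The first function
plans without looking at the store and then reuses only exact id
matches. The second is the pragmatic variant: prefer a candidate that
is already in the current environment and otherwise ignore the rest of
the store. Function names and the list-of-ids representation are
assumptions for the example, and the "actively reuse" approach is left
out because it needs the extra safety information mentioned above.

```python
def split_plan(plan_ids, store_ids):
    """Plan first, then reuse only instances whose ids match exactly.

    Returns (reuse, build), both in plan order.
    """
    reuse = [iid for iid in plan_ids if iid in store_ids]
    build = [iid for iid in plan_ids if iid not in store_ids]
    return reuse, build


def pick_instance(candidates, env_ids):
    """Pragmatic variant for one package.

    candidates is the solver's preference-ordered, non-empty list of
    instance ids. An instance already in the current environment wins;
    otherwise take the first choice without consulting the wider store.
    """
    for iid in candidates:
        if iid in env_ids:
            return iid
    return candidates[0]
```

With hash-based ids, `split_plan` is safe by construction: a matching
id means the same inputs, so reuse cannot change the result. The cost
is that it never steers the plan towards what is already installed.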


On a related topic, along with the London HUG we're trying to organise a
couple of infrastructure hackathons in London. The aim would be to work on
Cabal/Hackage related things. The plan is to have two hackathons 6-8
weeks apart, to get new people set up with projects at the first and to
get things merged in the second. For Cabal, this project would be my
focus, trying to get people to work on different aspects of it. I gave a
talk at the London HUG recently about getting involved with hacking on
Cabal/Hackage.

Duncan


