Plan for cabal-install

Paolo Martini paolo at nemail.it
Wed Aug 9 12:56:05 EDT 2006


Hello,

During AngloHaskell, Duncan, Shae, David and I had the chance to
think about cabal-install (formerly known as cabal-get).

The current situation:

There is a server program called cabal-server-install which checks
that a given package is valid and updates the index accordingly. Its
code can be found here:
<http://hackage.haskell.org/darcs/cabal-server-install/>.

The current index file is a list of {name, description, dependencies,
url} entries, one per package. The hierarchy of packages and the
corresponding index file can be found here:
<http://hackage.haskell.org/darcs/cabal-server-install/pub/unstable/>.

The problems:

The current hierarchy is full of outdated packages. This is easily
solved once we have made some design decisions about the format of
the hierarchy and the index file.

The current index file format is a made-up format consisting of a
fixed set of fields. This is a problem because:
  * Cabal is going to support more dependency information, like
optional dependencies, which will require a field of its own.
  * Clients such as GUI package managers need more than the little
information now available after an update command.
  * The index file contains a URL for each package, which we do not
know at the moment of producing that file on the server.

A solution to all these problems is to use a tarball of all the
`.cabal' files as the index file.

This is good for a number of reasons, among them:
  * No code is needed to generate the index. One can simply tar the
directory hierarchy, skipping the package tarballs. For now it will
only pack the `.cabal' files; in future it can just as easily pack any
other files we eventually need, like `.sig' files for signatures and
whatnot. This is a big advantage.
  * The tarball format is very simple, its headers are plain ASCII,
and a Haskell un-tar implementation is very few lines of code, which
will be needed for Windows.
  * Using the `.cabal' files for all the information means that
cabal-get is easily {backward,forward}-compatible with changes in the
`.cabal' file format, because it just relies on the Cabal parser
(e.g. optional dependencies in the near(?) future).
  * We have all the information from the beginning, after the first
`cabal-install update' command.
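
As a rough illustration of the first point above, here is a minimal
sketch, written in Python purely for brevity (cabal-install itself is
Haskell). The function names build_index and read_index, and the
name/version/name.cabal layout used in the comments, are assumptions
of this example and not part of any existing tool. It tars only the
`.cabal' files of a hierarchy and reads them back without unpacking:

```python
import io
import os
import tarfile


def build_index(root, out_path):
    """Tar every .cabal file under root, skipping package tarballs.

    Hypothetical helper: assumes a layout such as
    root/foo/1.0/foo.cabal next to root/foo/1.0/foo-1.0.tar.gz.
    """
    with tarfile.open(out_path, "w") as tar:
        for dirpath, dirnames, filenames in os.walk(root):
            dirnames.sort()  # deterministic member order
            for name in sorted(filenames):
                if name.endswith(".cabal"):
                    full = os.path.join(dirpath, name)
                    # Store paths relative to root so that unpacking
                    # the index mirrors the remote hierarchy.
                    tar.add(full, arcname=os.path.relpath(full, root))


def read_index(index_bytes):
    """Return {path inside the index: .cabal text} without unpacking."""
    result = {}
    with tarfile.open(fileobj=io.BytesIO(index_bytes), mode="r") as tar:
        for member in tar.getmembers():
            if member.isfile() and member.name.endswith(".cabal"):
                data = tar.extractfile(member).read()
                result[member.name] = data.decode("utf-8")
    return result
```

A real client would hand each `.cabal' text to the Cabal parser
instead of keeping it as a plain string.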

We are quite happy with this solution, but some other decisions need  
to be made:

Currently cabal-install saves the server list file in
configuration-dir/serv.lst and the index file in conf-dir/pkg.lst, and
downloads all the tarballs into conf-dir/packages/.

I'd like to discuss what the local hierarchy of files should look
like with this change in the index file format.

One problem is that we need to support multiple repositories, and  
still map the packages to the right URLs. The proposed solution would  
look like this:

We have a conf-dir/repositories/ directory which contains one
subdirectory per repository. Each subdirectory is named after the
string-encoded URL of its repository and contains the local mirror of
the remote hierarchy, produced by simply unpacking the index file: a
perfect local mirror, minus the tarballs. This looks sensible and
neat.
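
A minimal sketch of that layout, again in Python for illustration
only. The proposal does not say how a URL is turned into a directory
name, so percent-encoding is an assumption here, as are the function
names repo_dir and unpack_index:

```python
import io
import os
import tarfile
import urllib.parse


def repo_dir(conf_dir, repo_url):
    """Local mirror directory for one repository URL (hypothetical)."""
    # Assumption: percent-encode the URL. safe="" also encodes '/' and
    # ':', so the whole URL becomes a single path component.
    encoded = urllib.parse.quote(repo_url, safe="")
    return os.path.join(conf_dir, "repositories", encoded)


def unpack_index(conf_dir, repo_url, index_bytes):
    """Unpack a repository's index tarball into its mirror directory."""
    target = os.path.realpath(repo_dir(conf_dir, repo_url))
    os.makedirs(target, exist_ok=True)
    with tarfile.open(fileobj=io.BytesIO(index_bytes), mode="r") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            dest = os.path.realpath(os.path.join(target, member.name))
            # Refuse members that would land outside the mirror.
            if not dest.startswith(target + os.sep):
                raise ValueError("unsafe path in index: " + member.name)
            os.makedirs(os.path.dirname(dest), exist_ok=True)
            with open(dest, "wb") as out:
                out.write(tar.extractfile(member).read())
    return target
```

Because the directory name is derived from the URL alone, a package
found under a given subdirectory can always be mapped back to the
repository it came from.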

`cabal-install' will then fetch the information it needs by going
through the server list file in order and recursing through each
repository's hierarchy to find all the .cabal files. This looks like a
practical solution which should work well until we get thousands of
packages. If we ever reach that point and efficiency becomes an issue,
we could simply load the tarball into memory and process it lazily,
mapping the in-memory .cabal files to record structures with the
information we need.
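
The on-disk lookup could be sketched roughly as follows (Python, for
illustration only). The server list format is not specified above, so
one URL per line is an assumption, as are the percent-encoded
directory names and the names read_server_list and find_cabal_files:

```python
import os
import urllib.parse


def read_server_list(path):
    """Repository URLs in priority order.

    Assumption: the server list holds one URL per non-empty line.
    """
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]


def find_cabal_files(conf_dir, server_list_path):
    """List (repo URL, relative .cabal path) pairs in server-list order."""
    results = []
    for url in read_server_list(server_list_path):
        # Assumption: mirrors live under percent-encoded URL names.
        mirror = os.path.join(conf_dir, "repositories",
                              urllib.parse.quote(url, safe=""))
        # os.walk yields nothing for a repository with no local mirror.
        for dirpath, dirnames, filenames in os.walk(mirror):
            dirnames.sort()  # deterministic traversal order
            for name in sorted(filenames):
                if name.endswith(".cabal"):
                    full = os.path.join(dirpath, name)
                    results.append((url, os.path.relpath(full, mirror)))
    return results
```

The sketch only preserves repository order; how to resolve a package
that appears in several repositories is left open here.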

What do you think? Does it look like a good plan?

More questions:

I am personally not sure about the OutputGen type which is defined in
Cabal/Network/Hackage/CabalInstall/Types.hs. It seems to be evil and
should be removed. Does anybody have a precise idea about it?

I have some initial code that makes cabal-get work on the tarball
index, but I am not sure how to integrate it with the OutputGen
actions, so it is not working yet. I need some explanation of them.

I hope the contents of this email are understandable; please ask
about anything I have explained badly or assumed without saying so.

Peace,
Paolo.
--


