Build system idea

Thu Sep 4 16:30:15 EDT 2008

On Thu, 2008-09-04 at 09:59 -0700, Iavor Diatchki wrote:
> Hi,
> 
> On Thu, Aug 28, 2008 at 6:59 AM, Simon Marlow <marlowsd at gmail.com> wrote:
> > Because if you *can* use Cabal, you get a lot of value-adds for free (distro
> > packages, cabal-install, Haddock, source distributions, Hackage). What's
> > more, it's really cheap to use Cabal: a .cabal file is typically less than a
> > screenful, so it's no big deal to switch to something else later if you need
> > to.
> 
> Well, I think this illustrates the current thinking of the Haskell
> community, where emphasis has been on making things either easy to do,
> or really hard/impossible to do (a kind of Mac approach to software
> development! :-).  It has the benefit that it makes things seem really
> easy occasionally, but is it really honest?  Concretely:
> 
> cabal-install: it does not work well with packages that have flags
> because it does not know what flags to use when building dependencies.
>  Really, packages with conditionals are different packages in one
> cabal file.

Packages are not supposed to expose different APIs with different flags
so I don't think that's right. Under that assumption cabal-install can
in principle resolve everything fine. I'm not claiming the current
resolution algorithm is very clever when it comes to picking flags
(though it should always pick ones that give an overall valid solution)
but there is certainly scope for a cleverer one. Also, the user can
always specify what features they want, which is what systems like
Gentoo do.
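
As a rough illustration of "pick flags that give an overall valid
solution", here is a toy sketch. It is not the algorithm cabal-install
uses; the names Pkg, assignments, pickFlags and the "small-base" flag
are all hypothetical, and it assumes flags only change dependencies,
never the API.

```haskell
module FlagSearch where

import Data.List (find)

type Flag = String
type Package = String

-- | Hypothetical package description: the dependency list is a
-- function of the chosen flag assignment.
newtype Pkg = Pkg { depsFor :: [(Flag, Bool)] -> [Package] }

-- | Every assignment of True/False to the given flags,
-- trying True first for each flag.
assignments :: [Flag] -> [[(Flag, Bool)]]
assignments = mapM (\f -> [(f, True), (f, False)])

-- | First flag assignment whose dependencies are all available,
-- or Nothing if no assignment works.
pickFlags :: [Package] -> [Flag] -> Pkg -> Maybe [(Flag, Bool)]
pickFlags available flags pkg =
  find (all (`elem` available) . depsFor pkg) (assignments flags)

-- | Made-up example: one flag that adds a dependency when enabled.
examplePkg :: Pkg
examplePkg = Pkg $ \fs ->
  if lookup "small-base" fs == Just True
    then ["base", "containers"]
    else ["base"]
```

With only "base" available, pickFlags ["base"] ["small-base"] examplePkg
falls back to turning the flag off. A brute-force search like this is
exponential in the number of flags, which is one reason a cleverer
resolver is worth having.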

Do you have any specific test cases where the current algorithm is less
than ideal? It'd be useful to report those for the next time someone
hacks on the resolver.

> Haddock:  something seems wrong, if I need to use a specific build
> system to document my code!

You certainly do not need to use Cabal to use Haddock. There are other
build systems that integrate support for Haddock (e.g. plain makefiles).

> source distributions:  cabal is really of very little help here as one
> has to enumerate everything that should be in the distribution.

I think that in the end the content really does need to be specified.
You can get quite a long way with inferring or discovering dependencies.
Indeed it can work for build, but for source dist you need all the
files, not just the ones used in this current build, so things like:

#ifdef FOO
import Foo
#else
import Bar
#endif 

mean that if we run cpp and then chase imports then we're stuffed: we'll
miss one or the other. Trying to discover the deps before cpp is a lost
cause because it's not just cpp to think about, there's all the other
pre-processors, standard and custom.
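
To make the problem concrete, here is a toy sketch of "cpp, then chase
imports". The names toyCpp and importsOf are hypothetical, and the
pre-processor is a stand-in that only understands un-nested #ifdef /
#else / #endif, not real cpp.

```haskell
module ImportChase where

import Data.List (isPrefixOf)
import Data.Maybe (mapMaybe)

-- | Toy pre-processor: keeps the lines of whichever branch is selected
-- by the list of defined macros. Handles only un-nested
-- #ifdef NAME / #else / #endif.
toyCpp :: [String] -> [String] -> [String]
toyCpp defined = go True
  where
    go _ [] = []
    go keep (l:ls)
      | "#ifdef " `isPrefixOf` l = go (drop 7 l `elem` defined) ls
      | "#else" `isPrefixOf` l   = go (not keep) ls
      | "#endif" `isPrefixOf` l  = go True ls
      | keep                     = l : go keep ls
      | otherwise                = go keep ls

-- | Module names from lines of the plain form "import M"
-- (qualified imports and import lists are ignored in this toy).
importsOf :: [String] -> [String]
importsOf = mapMaybe parse
  where
    parse l = case words l of
      ("import" : m : _) -> Just m
      _                  -> Nothing

-- | The snippet from the text above.
example :: [String]
example =
  [ "#ifdef FOO"
  , "import Foo"
  , "#else"
  , "import Bar"
  , "#endif"
  ]
```

Whichever way FOO is set, importsOf (toyCpp defs example) sees only one
of Foo and Bar, so a source tarball built from that list would be
missing the other module.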

You can get your source control system to do your sdist, e.g. darcs can
do that. It's great, but not always what you want: you may have files
that live in your devel repo but should not be included in the source
tarball. Also, if you want pre-processed files in the tarball, it needs
some help from the build system.

> Hackage:  Again, something is wrong if I should have to use a specific
> build system to distribute my code.

No, you only need to make a Cabal package. You can choose whatever build
system you like so long as it presents that standard external command
interface and metadata.

I guess the fact that very few packages do use any build system other
than the "Simple" build system does give the misleading impression that
it's the only choice.

> distro-packages: I have not used these, but the only ones that I have
> heard about are Don's Arch packages, which are not binary packages, so
> there the problem is a bit simpler (still nice that it works though).

Don has done very well recently and generated a lot of excellent
publicity. There are also distro packages maintained for Debian, Fedora,
Gentoo and FreeBSD, which have been around for years.

I think Arch packages are binary packages, as are the Debian and Fedora
ones. The FreeBSD, MacPorts and Gentoo packages are of course source
based.

> In summary, it seems to me that there are two or three components that
> are tangled in the term "cabal":
> 1) a machine readable format for describing the meta-data associated
> with a package/application (+ a library that can process this meta
> data).

1a) a standard interface for users and package managers to configure,
build and install a package which can be implemented by multiple build
systems including autoconf+make.

> 2) a build tool that has support for interacting with Haskell
> compilers and other tools that it knows about, to build a package.

Right, a particular implementation of that interface with a bunch of
extra features.

> It seems to me that most of the benefits of cabal come from (1), and
> for most "simple" cases, (2) is just a way to avoid writing a
> completely mundane Makefile, while for more complex cases (2)
> basically doesn't work.

I'm not sure the makefiles were completely mundane, I'd more describe
them as gnarly. :-) I would not underestimate the advantage for simple
projects of not having to re-implement a build system. Being able to use
a standard one and inherit new features and fixes for free is quite an
advantage. There's also the issue of portability. For many packages the
only thing preventing them from working on Windows was the use of make.

Certainly, the Simple build system is not yet up to the task of building
our most complex packages and will require some major surgery before we
get near that. It is something we're working on, though perhaps not
directly inside the current Cabal code base. Saizan's GSoC project was
step 1 in that direction. As I mentioned elsewhere we should also take a
step back and see what we need for Cabal-2, to think about what kind of
design might scale.

Duncan