The base library and GHC 6.10

Ian Lynagh igloo at earth.li
Thu Aug 28 07:12:32 EDT 2008


Hi all,

We're trying to decide what to do with the base library for GHC 6.10, in
terms of how much of it should be broken up into separate packages.
Since the recent proposal about this, we may be rethinking what we want
to do, and we would welcome your opinions.



First, the motivation for splitting base up:

It becomes possible to upgrade the parts separately, and it becomes
easier for different people to maintain different parts.

It makes it easier to see what the hierarchy is, and to restructure the
hierarchy, and work towards more of the code being shared between
different Haskell implementations. Plus it means that people can't
re-tangle the logically separate components, which is all too easy to do
when you just have one huge package.

It also means that packages are clearer about what they depend on. One
possibility, which would be really cool, is to separate all the IO
modules from the non-IO modules; between that and looking at the
extensions used (e.g. TH and FFI) it would then be clear whether or not
a library could do any IO. Of course, the Prelude is a hurdle for this
goal.

Also, GHC's current plan for the base library:
http://hackage.haskell.org/trac/ghc/wiki/DarcsConversion#Planforlibraries
essentially means forking base (as nhc98 would continue to use base in a
darcs repo, while GHC would use it from a git repo, and there are no
plans for any merging between these repos). Therefore any code that is
to be shared between the implementations needs to not be in base, so
from that point of view it would be good to pull out as much as
possible.

The main argument /against/ splitting base up is that at some point the
dependencies of packages need to be updated to reflect the changes.
However, GHC 6.10 will come with a base version 3, as well as the new
base version 4, so the transition should be much smoother than the base
2 -> base 3 transition.
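
As a rough sketch of what this could look like in a package's .cabal
file (the split-off package names are the ones proposed below and do
not exist yet, and the version bounds are assumptions for illustration
only):

    -- A package that has not been updated keeps building against
    -- the compatibility base 3:
    build-depends: base >= 3 && < 4

    -- A package that has been updated depends on base 4, plus
    -- whichever of the split-off packages it actually uses:
    build-depends: base >= 4 && < 5, getopt, st, concurrent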



Now, on with the proposed splitting. In the below, LoC stands for "Lines
of Code".


First the easy bit: The Data.Generics hierarchy is going to have a
separate maintainer, and I think that everyone is agreed that it should
be pulled out into a "syb" package. I'll treat this as not part of base
from here on.

The only thing still being debated here is whether the Data class itself
should remain in base or not. Some people believe that it should remain
in base, as it is desirable to have Data instances for as many types as
possible, and because there is a resistance among library writers
against adding dependencies. The counterargument is that the same is
true of many other classes (e.g. those in uniplate, syb-with-class and
binary), and it does not scale to put all of these classes into base.
Also, by requiring a dep to be added even for the classes that have
historically been included in base, adding dependencies for the sake of
providing instances may become more socially acceptable.



Now, on with the splitting. We have
    System.Console.GetOpt
    (129 LoC, 1 module)
This doesn't really fit in with anything else in base, so the proposal
is to split it off into its own getopt package. I don't think there is
much objection to this one.



Next we have the
    Control.Monad.ST
    Data.STRef
    (120 LoC, 6 modules)
hierarchies. The proposal is to put these into an st package. The
low-level implementation is still in base (69 LoC in GHC.ST and
GHC.STRef), so to some extent this is a false separation. On the other
hand, nhc98 doesn't support ST, so splitting this package off gets us
closer to all implementations exposing the same modules from base.



Then we have the
    Control.Concurrent
    (490 LoC, 6 modules)
hierarchy, along with
    System.Timeout (39 LoC)
    Data.Unique (32 LoC)
(those modules depend on Control.Concurrent.*). The proposal is to put
these into concurrent, timeout and unique packages respectively. Again,
this is a false separation, with 698 LoC left behind in GHC.Conc; at
some point we'd hope either to move this down to ghc-prim or to make a
new ghc-concurrent package for it, depending on how the dependencies
work out. Again, nhc98 doesn't support concurrent or its dependencies,
so this gets us closer to a consistent base interface.



Splitting off the above 5 packages would leave 106 modules and 16621 LoC
in base. About 5% of the LoC, and 12.5% of the modules, would be in the
new packages.
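
For reference, those percentages follow from the figures quoted above,
counting System.Timeout and Data.Unique as one module each and taking
base without syb as the starting point:

    LoC moved:      129 + 120 + 490 + 39 + 32 = 810
    Modules moved:  1 + 6 + 6 + 1 + 1         = 15

    LoC share:      810 / (16621 + 810) = 810 / 17431  (about 4.6%)
    Module share:   15 / (106 + 15)     = 15 / 121     (about 12.4%)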




Thanks
Ian


