base library and GHC 6.12

Ian Lynagh igloo at earth.li
Thu Jun 25 11:02:14 EDT 2009


Hi all,

There has been some discussion recently about the base library. In
particular, base is supposed to follow the Package Versioning Policy
(PVP), which means that certain sorts of changes require increasing the
version number. When the version number is increased, this makes work
for lots of people, as any packages correctly using an upper bound on
base's version number need to be updated.
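
To make the cost concrete, here is a minimal sketch of how an upper
bound shuts out a bumped base. The version numbers are only
illustrative, and withinBounds is a hypothetical helper, not anything
Cabal provides; the sketch just relies on the Ord instance of
Data.Version (and on makeVersion, which needs a reasonably recent base).

```haskell
import Data.Version (Version, makeVersion, showVersion)

-- Hypothetical helper: does v satisfy "lower <= v < upper"?
withinBounds :: Version -> Version -> Version -> Bool
withinBounds lower upper v = lower <= v && v < upper

main :: IO ()
main = do
  -- Illustrative dependency: base >= 4.1 && < 4.2
  let lower = makeVersion [4, 1]
      upper = makeVersion [4, 2]
      report v = putStrLn
        (showVersion v ++ ": " ++ show (withinBounds lower upper v))
  -- The first version is accepted; the bumped one is rejected, so the
  -- depending package has to be edited and re-released.
  mapM_ report [makeVersion [4, 1, 0, 0], makeVersion [4, 2, 0, 0]]
```

Every package carrying such a bound needs the same edit, even if it
never touched the modules that caused the bump.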

For modules like Data.List this makes sense, but base also contains a
number of modules in the GHC.* hierarchy which, while exposed, are
really internal, and much less stable than the "public" API.

There is also another issue: Currently, it is not possible for a package
to specify, in its dependencies, whether or not it uses these GHC.*
modules. We therefore can't easily tell whether a package is sensitive
to changes in those modules, or whether it can be used with hugs, nhc98,
etc.

We've come up with 3 possible ways forward. Comments, suggestions and
criticisms welcomed!



Option 1
--------

In order to solve the version number issue, we could simply state that
"base follows the PVP, but only for shared module hierarchies". However,
it would be impossible for packages which /do/ need GHC.* modules to
give accurately versioned dependencies, and it wouldn't solve the second
issue (declaring whether a package uses GHC.* modules) at all.


Option 2
--------

Another possible solution would be to rename the base package to
base-internals, and to make base only re-export the "public" modules.
As both packages would then expose modules with the same names, this
would require either renaming each module M to Internal.M, or every
implementation supporting something like GHC's PackageImports
extension.
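
For anyone who hasn't used it, here is a small sketch of what
PackageImports looks like in GHC. The runnable part imports from the
existing base package; the re-export module in the comment is only my
guess at how the thin base could be written, since base-internals does
not exist. sortedSample is just a name made up for the example.

```haskell
{-# LANGUAGE PackageImports #-}

-- A package-qualified import names the package that must provide the
-- module, which disambiguates two packages exposing the same module.
import "base" Data.List (sort)

-- Under option 2, the thin base might contain re-export modules
-- roughly like this (hypothetical):
--
--   module Data.List (module X) where
--   import "base-internals" Data.List as X

sortedSample :: [Int]
sortedSample = sort [3, 1, 2]

main :: IO ()
main = print sortedSample
```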


Option 3
--------

The other alternative is to try to split base into two parts: The shared
"public" modules, and the internal GHC.* modules. Then GHC would have
ghc-base, hugs would have hugs-base, etc, and there would be a common
base package built on top of the appropriate impl-base.

To do this with minimal loss of code sharing is a large task, and hard
to do just in a branch, as merging changes made in the base HEAD is a
pain when the file the patch applies to has moved to another repository.
Thus in the short term we would expect to give up a reasonable amount of
code sharing. However, we hope that once we have separate impl-base
packages it will be easier to untangle their contents and then regain as
much sharing as possible: impl-base will be significantly smaller than
base currently is, and we can rearrange imports and code inside
impl-base without having to worry about breaking other implementations.

I've had a look at what can be done, and the first cut looks like this:

We start off with 143 modules in base, 89 of which are public and 54 of
which are GHC.*.

Afterwards, base has 53 modules, 2 of which are GHC.*:
* GHC.Exts, which could be moved into ghc-base, but would take
  Data.String with it, or it could go into a ghc-exts package. As this
  is "more public" than the other GHC.* modules, a separate package
  seems like a reasonable thing to do anyway.
* GHC.Desugar, which could be put into its own package if nothing else.

Here's the module graph, which looks quite sane:
    http://community.haskell.org/~igloo/base-small.png

That leaves 90 modules in ghc-base, 52 of which are GHC.*, and 38 of
which are in the portable namespace.

So almost all GHC.* modules have moved, and more than half of the
portable modules are in base.
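
As a sanity check on the numbers above, here is a small sketch that
recomputes them. The counts are the ones quoted in this message; the
identifier names are made up for the example.

```haskell
-- Module counts before the split.
totalBefore, publicBefore, ghcBefore :: Int
totalBefore = 143
publicBefore = 89
ghcBefore = 54

-- Module counts after the split.
baseAfter, baseGhc, ghcBaseAfter, ghcBaseGhc, ghcBasePortable :: Int
baseAfter = 53
baseGhc = 2
ghcBaseAfter = 90
ghcBaseGhc = 52
ghcBasePortable = 38

-- Portable modules remaining in base after the split.
basePortable :: Int
basePortable = baseAfter - baseGhc

-- Every total should equal the sum of its parts.
countsConsistent :: Bool
countsConsistent = and
  [ publicBefore + ghcBefore == totalBefore
  , baseAfter + ghcBaseAfter == totalBefore
  , baseGhc + ghcBaseGhc == ghcBefore
  , basePortable + ghcBasePortable == publicBefore
  , ghcBaseGhc + ghcBasePortable == ghcBaseAfter
  ]

main :: IO ()
main = do
  print countsConsistent
  putStrLn (show basePortable ++ " of " ++ show publicBefore
            ++ " portable modules stay in base")
```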

These 38 are the interesting ones:
    Control.Exception.Base
    Control.Monad
    Data.Bits
    Data.Char
    Data.Dynamic
    Data.Either
    Data.HashTable
    Data.Int
    Data.List
    Data.Maybe
    Data.Tuple
    Data.Typeable
    Data.Word
    Foreign
    Foreign.C
    Foreign.C.Error
    Foreign.C.String
    Foreign.C.Types
    Foreign.ForeignPtr
    Foreign.Marshal
    Foreign.Marshal.Alloc
    Foreign.Marshal.Array
    Foreign.Marshal.Error
    Foreign.Marshal.Pool
    Foreign.Marshal.Utils
    Foreign.Ptr
    Foreign.StablePtr
    Foreign.Storable
    Numeric
    System.IO.Error
    System.IO.Unsafe
    System.Posix.Internals
    System.Posix.Types
    Text.ParserCombinators.ReadP
    Text.ParserCombinators.ReadPrec
    Text.Read.Lex
    Text.Show
    Unsafe.Coerce

Some of them are easy to move back into base, e.g. Foreign contains no
code, and Numeric is only needed so that GHC.Ptr can use showHex for its
Show instance. Others are more integral to the implementation of the
rest of GHC.*.

I've only tried this for amd64/Linux, so it's possible that dependencies
on other platforms could cause additional problems.

Here's the module graph, which is somewhat messier than base's:
    http://community.haskell.org/~igloo/ghc-base-small.png


As with option 2, for each implementation, either impl-base needs to
rename the public modules M to Impl.M, or it needs to implement
something like GHC's PackageImports extension.



Thanks
Ian


