Revamping the module hierarchy
wren ng thornton
wren at community.haskell.org
Fri Jun 19 06:06:31 EDT 2009
Iavor Diatchki wrote:
> I think that a better way to organize our programs is to prefix the
> modules in a package with the package name. This will avoid the name
> collision issue (or at least, greatly simplify it, because packages
> that are uploaded to hackage need to have different names). It would
> also make the dependencies of a module quite obvious. It would also
> make our import lists much simpler. For example, we would write
> "import HaXml" instead of "import Text.XML.HaXml", or "import
> Parsec.Char" instead of "import Text.ParserCombinators.Parsec.Char".
> If classifying modules according to their purpose is necessary (and I
> am not sure that it is, if we can do it at the package level), then we
> could think of a more suitable mechanism to achieve that goal than the
> hierarchical names.

I disagree. One of the nice things about the current arrangement is that
the package namespace is orthogonal to the module namespace. These two
concepts really are orthogonal, so it's good to keep them that way. When
they get conflated into one, you end up with Java's import mechanism
which is a complete wreck. When you keep them orthogonal you can have
some really nice package managers like Monticello for Squeak.

I agree with Maurício that what we really need is the tools to rearrange
the tree at will. The Haskell language has no business dealing with the
provenance of modules, and forcing modules to be named after their
packages would make it do so. Currently, ghc-pkg (or whatever) handles
provenance by making sure that packages are visible so that their
modules can be loaded. As it stands, this provenance mechanism
automatically roots all packages at the same place, but there's no
reason it needs to. We just have to come up with the right DSL for
scripting ghc-pkg (or equivalently, the right CLI) to be able to play
around with the module namespace in a more intelligent way.

For instance, let's assume we have:

    > ghc-pkg describe libfoo-0.0.0
    exposed-modules: Data.Foo Control.Bar Control.Bar.Baz

Now, if we say:

    ghc-pkg expose libfoo-0.0.0

Then any Haskell programs can now load the modules mentioned above, by
the names mentioned above. If instead we said something like:

    ghc-pkg expose libfoo-0.0.0 at Zot

Then Haskell programs would be able to load the modules by the names
Zot.Data.Foo, Zot.Control.Bar, and Zot.Control.Bar.Baz instead. And if
we wanted to rebase subtrees then we could say something like:

    ghc-pkg expose libfoo-0.0.0:Control.Bar as Quux

Which would make the modules Quux and Quux.Baz available for loading,
and would effectively hide libfoo-0.0.0:Data.Foo from being loadable.
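
To pin down the semantics I have in mind, here is a minimal Haskell sketch of what "at" and "as" would do to a package's list of exposed module names. None of this exists in ghc-pkg today; the names exposeAt, exposeAs, parseName, showName, and libfoo are all made up for illustration, and a real implementation would presumably live in the package database rather than in a list of strings.

```haskell
import Data.List (intercalate, stripPrefix)
import Data.Maybe (mapMaybe)

-- A module name as its dot-separated components,
-- e.g. ["Control", "Bar", "Baz"] for Control.Bar.Baz.
type ModuleName = [String]

-- Split "Control.Bar.Baz" into its components.
parseName :: String -> ModuleName
parseName s = case break (== '.') s of
  (c, [])       -> [c]
  (c, _ : rest) -> c : parseName rest

-- Render components back into dotted form.
showName :: ModuleName -> String
showName = intercalate "."

-- Hypothetical model of "expose PKG at ROOT":
-- every exposed module is re-rooted under ROOT.
exposeAt :: ModuleName -> [ModuleName] -> [ModuleName]
exposeAt root = map (root ++)

-- Hypothetical model of "expose PKG:SUB as NEW":
-- only modules at or below SUB stay visible, with SUB replaced by NEW.
-- Matching is per component, so Control.Bar does not match Control.Barn.
exposeAs :: ModuleName -> ModuleName -> [ModuleName] -> [ModuleName]
exposeAs sub new = mapMaybe (fmap (new ++) . stripPrefix sub)

-- The exposed modules of the example package from the text.
libfoo :: [ModuleName]
libfoo = map parseName ["Data.Foo", "Control.Bar", "Control.Bar.Baz"]
```

With these definitions, map showName (exposeAt ["Zot"] libfoo) gives the three Zot.* names above, and map showName (exposeAs (parseName "Control.Bar") ["Quux"] libfoo) gives Quux and Quux.Baz, with Data.Foo dropped.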
To implement this we need to update not only ghc-pkg, but also the Cabal
format. Rather than just specifying which dependent packages must be
exposed, we also need to specify *where* the package expects them to be
exposed in the module namespace. Assuming this is implemented sanely,
then all of the renaming for changing the root and for rebasing subtrees
can be boiled out and undone during the linking phase (that is, when GHC
is "linking" things to follow imports etc; not when ld is actually
linking things). An import declaration is a reference to an actual
compiled module; the name is just a proxy for knowing where to find it
and has no meaning in itself.
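
As a rough illustration of "the name is just a proxy", the sketch below models a package's local view as a lookup table from the names it imports by to the compiled modules they denote. The types Target and View, and the function resolveImport, are hypothetical; this is not how GHC represents things internally, only a picture of the idea.

```haskell
-- A compiled module, identified by the package that provides it and the
-- name that package originally gave it. (Hypothetical, for illustration.)
data Target = Target
  { targetPackage :: String
  , targetModule  :: String
  } deriving (Eq, Show)

-- A package's local view of the module namespace: the names it imports
-- by, paired with the compiled modules those names refer to.
type View = [(String, Target)]

-- Resolving an import is just a lookup in the importing package's view.
resolveImport :: View -> String -> Maybe Target
resolveImport view name = lookup name view

-- The compiled module Data.Foo from the example package in the text.
fooModule :: Target
fooModule = Target "libfoo-0.0.0" "Data.Foo"

-- One client sees libfoo at the default root...
defaultView :: View
defaultView = [("Data.Foo", fooModule)]

-- ...while another has asked for it to be exposed at Zot.
rerootedView :: View
rerootedView = [("Zot.Data.Foo", fooModule)]
```

Here resolveImport defaultView "Data.Foo" and resolveImport rerootedView "Zot.Data.Foo" yield the same Target, while resolveImport rerootedView "Data.Foo" yields Nothing. The two clients disagree about the name but not about which module they mean.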
Since every package gets their own local view of the module namespace,
every package can choose their own names for things. Moreover, since
every package must specify their local view, if one wants to have some
crazy jumbled view then the burden is on them to specify how to do it.
Since every package exposes a view of its exposed module namespace, this
serves as the default view. Since it takes work for people to rearrange
things, there will still be a force to give things good names in the
first place. Only we would no longer be stuck with bad decisions.