Module system, was Re: GHC source code improvement ideas

Fri Jan 4 19:26:41 EST 2008

Ian Lynagh wrote:
> On Fri, Jan 04, 2008 at 08:34:22AM +0000, Simon Peyton-Jones wrote:
>> |   4. A more radical change would be introducing hierarchical modules.
>>
> 
> It's a pity that GHC.* is already used in base. I'm not sure what the
> best thing to do is in the short term.
> 

How about Language.Haskell.Compiler.GHC.*

In the long term, Haskell needs a better module system IMHO, since the 
problem at the moment is that having to write the full hierarchical name 
in each module means that you need to know in advance, before you've 
even started writing a program, where each module will fit into the 
global namespace, which makes it extraordinarily difficult to do 
exploratory programming or bottom-up development, and the need to write 
so many import directives in each module makes it extremely painful not 
to mention overly hard-wired.

A good direction IMHO is the way SML structures are grouped using 
CM.make (SML/NJ) or MLBasis (MLton and others), so that there are no 
import directives at all: a separate group definition determines what is 
in scope, and compilation proceeds by chasing these group definition 
files rather than chasing modules.

Translating this into Haskell, the module syntax could be simplified and 
other modules could be nested and aliased within a module, so that it is 
easy to create different hierarchical views of the modules which are in 
scope to suit the particular module being written or clients of that module:

	-- Leaf.hs
	module Leaf where
		module String where
			open Data.ByteString.Char8
			type T = ByteString

		module Map = Data.Map

		module Util where
			foo :: Int -> Int
			foo i = i

	--Other.hs
	module Other where
		bar :: Int -> Leaf.String.T
		bar = Leaf.String.pack . show . Leaf.Util.foo

Note that here there is no need for any import directives, since the 
modules which are in scope when Leaf.hs is compiled would be determined 
by whatever group Leaf.hs is part of (with respect to that particular 
program), which would be defined in a separate file:

	--MyBasis.hsg
	local $Haskell/StdBase.hsg

	Leaf.hs
	Other.hs

Within the .hsg files, groups and modules are referenced by filename, 
and is just a simple list of components that are required and/or 
exported by the group. In the .hsg file above, MyBasis will export the 
contents of Leaf.hs and Other.hs, which are compiled in an environment 
augmented by StdBase (which is not itself exported by MyBasis).

(See CM.make and the MLBasis system for more details - in particular, 
for any given program a module may appear in only one group, but the 
same module may appear in different groups in different programs thus 
facilitating easy exploratory programming and re-organization without 
breaking previous programs.)

This can be taken further to get an even more powerful system by using 
parameterized traits and explicit instantiation for modules:

	trait T a b where
		foo :: a -> b

		bar :: Int -> Int
		bar i = i + 1

	module M where
		include T Int String
		foo = show . bar

Here, the body of a module is always a trait, and the above is 
equivalent to:

	trait T a b = tra
		foo :: a -> b

		bar :: Int -> Int
		bar i = i + 1

	module M = new tra
		include T Int String
		foo = show . bar

which makes it more explicit that conversion of the contents to actual 
code (ie linking, allocation/identity/initialization of CAFs and foreign 
objects, generativity of data types etc) happens only in the module decl.

The great thing about making instantiation explicit is that traits are 
pure functional values and so can easily be combined with no side 
effects, whereas modules may in general contain mutable state eg to 
interface with some external C library and thus are 
objects-with-identity. Thus the separation solves the whole problem of 
applicative vs generative functors in ML, as well as solving the problem 
of mutually recursive structures (functors become a redundant concept 
because you can always start with including a trait then overriding some 
decls with more instantiated (here used in a different sense) decls 
and/or traits).

Last but not least, a trait could also be used similarly to a signature, 
except that definitions in the trait can be considered to be default 
implementations. Therefore in this scenario the body of a class or 
instance decl is just a trait and so could be composed using other 
traits etc. (Each instance decl instantiates the trait given in the body 
merged with the trait given in the class leading to a possibly 
parameterized module.)

Anyway these are some of the directions I am currently working on for my 
new language which is a strict version of Haskell/ML but where explicit 
type annotations drive name resolution rather than explicit namespace 
annotations driving type inference.

Related work for the above version of traits/mixins includes the Scala 
language and the approach described in "Evolving Software with 
Extensible Modules" by Matthias Zenger.

Regards, Brian.

--

www.metamilk.com