[Haskell-cafe] What's in a name?
wren ng thornton
wren at freegeek.org
Fri Aug 15 19:24:39 EDT 2008
Sean Leather wrote:
> That doesn't work if you want to use two packages that have modules
> sharing the same hierarchical name, and this is a definite possibility
> given my statements above. Of course, having the ability to import
> modules from specific packages [1] would fix this, but only as long as
> the package names are also unique.
>
> Personally, I like the Java package naming scheme recommendation. It
> scales better, because each package name uses the organization or URI to
> uniquely identify a subset.
Personally, I have major qualms with the Java package naming scheme. In
particular, using domain names sets the barrier to entry much too high
for casual developers (e.g. most of the Haskell user base). Yes, DNs are
cheap and plentiful, but this basically requires a lifetime lease of the
DN in question and the migration path is covered in brambles. The
alternative is simply to lie and make up a DN, in which case this
degenerates into the exact same resource quandary as doing nothing (but
with high overhead in guilt or registration paperwork).
The way CPAN is set up is much more egalitarian, though mired in a bit
much administrivia for casual developers.
The orthogonality of package names to module names is something I
consider very much a feature, and not a bug. The only other packaging
system I've seen offer this is Monticello for Squeak/Smalltalk, and
I've missed it ever since. Making packages orthogonal allows
developers to create drop-in replacement packages that offer the same
module services as another package, without needing to alter any code
that uses the old package (save relinking/recompiling). This is the same
advantage as allowing different modules to offer the same functions
(e.g. having Data.ByteString as a drop-in for the [ ]-portions of the
Prelude), but lifted up to the next tier.
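To make the module-level version of this concrete, here is a minimal sketch, assuming the bytestring package that ships with GHC. The function names shoutL and shoutB are made up for illustration. Both bodies are spelled identically apart from the module qualifier, which is the sense in which one module can stand in for another:

```haskell
module Main (main) where

import qualified Data.ByteString.Char8 as B
import Data.Char (toUpper)
import qualified Data.List as L

-- List version: reverse the input, keep four characters, upper-case them.
shoutL :: String -> String
shoutL = L.map toUpper . L.take 4 . L.reverse

-- ByteString version: the same pipeline, with only the qualifier changed.
shoutB :: B.ByteString -> B.ByteString
shoutB = B.map toUpper . B.take 4 . B.reverse

main :: IO ()
main = do
  putStrLn (shoutL "haskell")
  B.putStrLn (shoutB (B.pack "haskell"))
```

A drop-in replacement package would be the same trick one tier up: the import lines stay as they are and only the package providing the module changes.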
The question then is two-fold. First is the question of how to minimize
the problems of ambiguity and how to resolve conflicts when they arise.
Second is the question of whether this is really the job of Haskell,
the language itself, or whether it is more appropriately dealt with by
the build tools, e.g. Cabal. I'll deal a bit more with the latter question.
(( For readers who don't want to slog through the rest of this post, the
conclusion is that I feel an agile packaging system is an imperative, as
discussed above. The trick is finding a way to be agile without creating
a maintenance and conflict nightmare. But given the imperative: baby,
bathwater, etc. ))
I do like your (Sean Leather's) patch for being able to specify package
names in source code, though I'd think something like Core's
"package:module.module.module" syntax would be better if it gets adopted
into Haskell'. I do however think that specifying the package should be
optional, with conflicts to be resolved by commandline flags or via
Cabal. Without this we lose the ability to have drop-in replacement
packages, which in turn greatly complicates migration paths. The
community is still young, but forks do happen and we would do best to
allow for forwards compatibility whenever possible.
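As an illustration of optional package qualification, the sketch below uses GHC's PackageImports extension, which puts a string naming the package in front of the module name. It is not necessarily the syntax of the patch under discussion, and the helper wordCounts is invented for the example. One import pins its package; the other leaves the choice to the build tool or command-line flags:

```haskell
{-# LANGUAGE PackageImports #-}
module Main (main) where

-- Package named explicitly: only Data.Map from "containers" will do.
import qualified "containers" Data.Map as M

-- Package left unspecified: whichever exposed package provides Data.List
-- is selected outside the source file (Cabal, or flags such as -package).
import Data.List (sortOn)

-- | Count how often each word occurs, most frequent first.
wordCounts :: [String] -> [(String, Int)]
wordCounts ws =
  sortOn (negate . snd) (M.toList (M.fromListWith (+) [(w, 1) | w <- ws]))

main :: IO ()
main = mapM_ print (wordCounts (words "a b a c b a"))
```

Because the second import names no package, a fork or replacement that exports the same module can be swapped in without touching this file.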
This approach also gives the same sort of split control as the various
{-# FOO #-} pragmas give. As an ad-hoc GHC solution, adding a new PACKAGE
pragma would be better than just using a string there. In theory we can
already do this with OPTIONS_GHC, though that pragma seems not to
respect the -package option. Of course, the new pragma should be
position restricted to make it obvious which imports it applies to,
rather than being assumed to apply to the whole file (i.e. by putting it
where you put the strings).
One issue with this and Java's scheme of just concatenating package
names onto module names is that they offer no provisions for specifying
version restrictions. For a PACKAGE pragma we could design it to deal with
this too, since the modules themselves don't have versions. Of course
this starts getting into hairy issues which Cabal was designed to
resolve, so porting it back to the compiler seems misguided.
Perhaps a simpler option, for a Haskell' world, would be to give modules
versions and give the import syntax some way of specifying the version
to use. Sticking with something like the current packaging system,
packages would just specify the module versions they provide, and those
versions need not be related to the version of the package itself. This
has the benefit of being able to release and maintain legacy packages,
once the world has forked or moved on to a new major version.
As an addendum to this, it could be helpful if "package" names (i.e.
alphanumeric sequences) were a part of the module version specification.
This way a package hfoo-legacy could continue to provide the hfoo-1.24
versions of modules, and it would be the package that forked off rather
than forcing the new hfoo package to rename itself to break ties from
the legacy code.
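The versioned-module idea can be modelled as plain data. This is a sketch under stated assumptions, not an existing Cabal or GHC mechanism: the types Offer and Request, the function resolve, and the module name Data.Foo are all hypothetical. A package advertises module versions independent of its own version, and an import asks for a module within a version range:

```haskell
module Main (main) where

import Data.List (sortOn)
import Data.Ord (Down (..))
import Data.Version (Version, makeVersion)

-- | A module offered by some package (hypothetical model).
data Offer = Offer
  { offerPackage :: String   -- e.g. "hfoo" or "hfoo-legacy"
  , offerModule  :: String   -- e.g. "Data.Foo"
  , offerVersion :: Version  -- version of the module, not of the package
  } deriving (Show, Eq)

-- | What an import asks for: a module and a half-open version range.
data Request = Request
  { reqModule  :: String
  , reqAtLeast :: Version    -- inclusive lower bound
  , reqBelow   :: Version    -- exclusive upper bound
  }

satisfies :: Request -> Offer -> Bool
satisfies r o =
  offerModule o == reqModule r
    && offerVersion o >= reqAtLeast r
    && offerVersion o < reqBelow r

-- | Pick the newest offer that satisfies the request, if any.
resolve :: [Offer] -> Request -> Maybe Offer
resolve offers r =
  case sortOn (Down . offerVersion) (filter (satisfies r) offers) of
    []      -> Nothing
    (o : _) -> Just o

-- The legacy package keeps serving the 1.24 module after hfoo moves on.
offers :: [Offer]
offers =
  [ Offer "hfoo-legacy" "Data.Foo" (makeVersion [1, 24])
  , Offer "hfoo"        "Data.Foo" (makeVersion [2, 0])
  ]

main :: IO ()
main = do
  print (offerPackage <$> resolve offers (Request "Data.Foo" (makeVersion [1]) (makeVersion [2])))
  print (offerPackage <$> resolve offers (Request "Data.Foo" (makeVersion [2]) (makeVersion [3])))
```

Old code asking for a 1.x Data.Foo resolves to hfoo-legacy, while new code asking for 2.x resolves to hfoo, so neither package has to be renamed.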
Another ability that the package/module system lacks right now is a good
way for annotating deprecations. Java has this, but again they do it
wrong. Whenever something is specified as deprecated it needs to provide
a migration path to non-deprecated code. Simply saying "you fail" is an
insufficient error message.
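At the level of individual names (not packages), GHC's DEPRECATED pragma carries a free-form message, so the migration path can at least be written into it by convention. A minimal sketch, with the names shout and yell made up for the example:

```haskell
module Main (main) where

import Data.Char (toUpper)

-- | The current, supported name.
shout :: String -> String
shout = map toUpper

-- The message names the replacement, so the warning doubles as a
-- migration instruction instead of just saying "you fail".
{-# DEPRECATED yell "Use 'shout' instead; it has the same type and behaviour." #-}
yell :: String -> String
yell = shout

main :: IO ()
main = putStrLn (shout "migrated")
```

Nothing enforces that the message actually names a replacement, which is the gap the paragraph above is pointing at.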
This proposal doesn't solve the resource allocation issue. That issue
will always be around so long as we assume nodes in the dependency graph
have unique names. And that assumption is a very useful expedient so
we're unlikely to abandon it any time soon (though maybe we should). But
I think giving modules explicit name-version annotations is a better
path forward than adding more bureaucracy to the module hierarchy. I
think the suggested best practices for naming modules should be refined
since they're starting to get out of date with all the code on Hackage.
In particular there's a lot of conflict about (1) where to put new
interesting Num data types (Data.Number.*, Data.*, Numeric.*, ...?); (2)
where to put testing and diagnostic tools (Debug.*, Test.*); and (3)
where to put modules for the core operation of application projects. But
beyond providing better guidance, I don't think we should have a central
body issuing leases for the module namespace. Especially because we
already have a packaging system which is orthogonal to the module system.
One of the reasons I love Haskell so much is because it is so extremely
agile. I've been an active open-source developer for many years, and of
all the languages I've used Haskell has by far the easiest system for
communal public release of code. Perl's community is also very nice
though it's gotten to be large enough that they do really need the
bureaucracy they have. All the same it means less of my Perl code has
made it into the wild than I would have liked. As for C and Java, the
only stuff of mine that's managed to eke out into the public are whole
projects, never any of the many small building blocks it takes to make
something run and to make people able to bang out a program in a few
hours because all the dirty work is already done and available in a
large public repository.
--
Live well,
~wren