Module Holism (was RE: exposed packages and cabal depends)

Mon Apr 11 20:27:45 EDT 2005

On Mon, 11 Apr 2005, Simon Marlow wrote:
>> The problem is, we don't want to import two modules, only to
>> discover that somewhere in their dependencies they each use the
>> same module name to refer to conflicting module implementations.
>
> This is the problem that the overlap restriction leads to, yes.

No, you have this problem even with atomic modules. It is a result 
simply of not allowing two modules to share the same name in the same 
program and has nothing to do with package overlap.

For example, suppose module Foo imports module Goo' and module 
Bar imports module Goo'', but Goo' and Goo'' both claim to be named 
"Module Goo".  Foo and Bar each work on their own, but you can't use 
both in the same program.  That's a big problem!
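
To make the Foo/Bar example concrete, here is a minimal sketch of a check 
for it.  The Impl record, the ids "goo1" and "goo2" (standing in for Goo' 
and Goo''), and the function names are all hypothetical, invented for 
illustration; this is not how any existing compiler or packaging tool 
detects the problem.

```haskell
import qualified Data.Map as Map
import qualified Data.Set as Set

-- A module implementation: a unique id (hypothetical, e.g. a file path),
-- the module name it claims, and the ids of the implementations it imports.
data Impl = Impl
  { implId    :: String
  , claimName :: String
  , imports   :: [String]
  }

-- All implementation ids reachable from the given roots through imports.
reachable :: Map.Map String Impl -> [String] -> Set.Set String
reachable env = go Set.empty
  where
    go seen []       = seen
    go seen (i:rest)
      | i `Set.member` seen = go seen rest
      | otherwise =
          let next = maybe [] imports (Map.lookup i env)
          in  go (Set.insert i seen) (next ++ rest)

-- Module names claimed by more than one reachable implementation,
-- together with the ids of the implementations that claim them.
clashes :: Map.Map String Impl -> [String] -> Map.Map String [String]
clashes env roots =
  Map.filter ((> 1) . length) $
    Map.fromListWith (++)
      [ (claimName impl, [implId impl])
      | i <- Set.toList (reachable env roots)
      , Just impl <- [Map.lookup i env]
      ]

-- The example from the text: Foo imports Goo', Bar imports Goo''.
example :: Map.Map String Impl
example = Map.fromList [ (implId m, m) | m <- mods ]
  where
    mods =
      [ Impl "foo"  "Foo" ["goo1"]
      , Impl "bar"  "Bar" ["goo2"]
      , Impl "goo1" "Goo" []
      , Impl "goo2" "Goo" []
      ]
```

With these definitions, clashes example ["foo"] and clashes example ["bar"] 
are both empty, while clashes example ["foo", "bar"] reports the name "Goo". 
Nothing in the check refers to packages, which is the point of the example.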

> Remember, when you're importing library modules, your dependencies are
> on *packages*, not modules.

Notice that in my example, it doesn't matter whether these modules are 
all in the same package or whether they are all in different packages 
or anything in between.

> It is
> mitigated by the fact that the granularity of dependencies is made
> coarser by the packaging system, so we believe it will rarely be a
> problem in practice.

In practice, it is all too easy for two different packages or modules 
to assume two different versions of e.g. Network.HTTP.

>> Therefore, we really want to say that no two modules we might want to
>> import into our programs (either directly or indirectly) should share
>> the same name.  And, in particular, we don't want a packaging or
>> versioning system that encourages it!
>
> No, you've drawn a bogus conclusion again.  We most definitely want the
> ability to choose between multiple instances of a particular module in
> programs.

My point is that the choice of instance should be made at 
compile/build/run time and not at packaging time.

Allowing packages/modules to choose implementations increases the 
risk of the sort of conflict described above.

> For example, if I have two versions of a package installed,
> say P-1 and P-2, I want to be able to compile my old code that depends
> on P-1 while still being able to write new code against P-2.  And I want
> to be able to use other packages that still depend, for the time being,
> on P-1.  When P-3 comes out, I don't want to be forced to uninstall P-1
> and P-2 before I can use it.

Now what happens when you want to use one package that depends on P-1 
and another that depends on P-2 at the same time?

My claim is that you can solve this only if a package that depends on 
P-1 can also work with P-2 and one that depends on P-2 can also work 
with P-3.

The only reason a build should fail is that you don't have a 
sufficiently recent version of some module.
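
A rough sketch of what that rule would mean for choosing a version, 
assuming (as argued above) that every version of a package also fulfills 
the contracts of the versions before it.  Versions are modelled as plain 
integers so that P-1, P-2 and P-3 become 1, 2 and 3; the function name and 
error messages are hypothetical.

```haskell
-- Versions as simple integers: P-1, P-2, P-3 become 1, 2, 3.
type Version = Int

-- Assumption: each version fulfills the contracts of all earlier versions,
-- so a dependant only needs to state the minimum version it requires.
-- Pick one installed version for the whole build, or report the shortfall.
chooseVersion :: [Version]  -- versions installed
              -> [Version]  -- minimum version required by each dependant
              -> Either String Version
chooseVersion installed minimums
  | null installed   = Left "no version installed"
  | newest >= needed = Right newest
  | otherwise        =
      Left ("need at least version " ++ show needed
            ++ ", newest installed is " ++ show newest)
  where
    newest = maximum installed
    needed = maximum (0 : minimums)
```

Under that assumption, a build whose dependants require P-1 and P-2 gets 
the newest installed version, and the only failing case is the one where 
nothing installed is recent enough.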

[clarifying explanation of build-depends problems from prior mail]

When you say that package 'A' "build-depends" on packages 'B' and 'C', 
you are implicitly selecting a specific set of implementations for the 
module names used in package 'A'.  However, if both 'B' and 'C' define 
and use module M, then you are going to run into trouble.

But now let's suppose that 'B' uses module M and 'C' doesn't, but that 
package 'C' build-depends on package 'D' and package 'D' build-depends 
on package 'E'.  Now suppose that a new version of 'E' is released 
that defines and uses module M.  Now you have real trouble!

Either way, you need some way to specify which module is the 
authoritative implementation of a particular module name for your 
entire build and not per package!

In other words, the mapping of module names to implementations is not 
specific to any particular package, it is explicitly global to the 
build of the resulting program and implicitly global to the community 
of module/package authors.
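
The 'A' through 'E' scenario above can be sketched as follows.  The 
Package record, the db function and the "chosen" map (one build-wide 
choice of implementation per module name) are hypothetical names used 
only for illustration; they do not correspond to any existing Cabal or 
GHC feature.

```haskell
import qualified Data.Map as Map
import qualified Data.Set as Set

type PackageName = String
type ModuleName  = String

data Package = Package
  { buildDepends :: [PackageName]
  , provides     :: [ModuleName]
  }

-- Every package reachable from the root through build-depends.
closure :: Map.Map PackageName Package -> PackageName -> Set.Set PackageName
closure pkgs root = go Set.empty [root]
  where
    go seen [] = seen
    go seen (p:ps)
      | p `Set.member` seen = go seen ps
      | otherwise =
          go (Set.insert p seen)
             (maybe [] buildDepends (Map.lookup p pkgs) ++ ps)

-- Module names provided by more than one package in the closure and not
-- settled by the build-wide choice of implementation.
ambiguous :: Map.Map ModuleName PackageName  -- build-wide choice
          -> Map.Map PackageName Package
          -> PackageName
          -> [ModuleName]
ambiguous chosen pkgs root =
  [ m
  | (m, ps) <- Map.toList providers
  , length ps > 1
  , maybe True (`notElem` ps) (Map.lookup m chosen)
  ]
  where
    providers = Map.fromListWith (++)
      [ (m, [p])
      | p <- Set.toList (closure pkgs root)
      , Just pkg <- [Map.lookup p pkgs]
      , m <- provides pkg
      ]

-- The scenario from the text, parameterised by the modules 'E' provides.
db :: [ModuleName] -> Map.Map PackageName Package
db modulesOfE = Map.fromList
  [ ("A", Package ["B", "C"] [])
  , ("B", Package []         ["M"])
  , ("C", Package ["D"]      [])
  , ("D", Package ["E"]      [])
  , ("E", Package []         modulesOfE)
  ]
```

With the old 'E', ambiguous Map.empty (db []) "A" is empty.  Once 'E' 
starts providing M, ambiguous Map.empty (db ["M"]) "A" reports "M" even 
though 'A' never mentions 'E'.  A single build-wide entry such as 
Map.fromList [("M", "B")] settles it for every package at once.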

For the same reason, you don't want "-hide-all-packages."  You want 
"-hide-local-variance" from the consensus mapping of module names to 
implementations.

>> What we want is a common system for mapping module names to
>> module implementations.
>>
>> * a protocol for resolving module-names to implementations at
>>   various haskell module name servers (hackage?)
>
> Go ahead, invent a protocol!

I offered two strawman versions of this protocol in my last mail in 
the "stop untracked dependencies" thread:

   Strawman protocol 1: Define a new DNS record, URL, and define some
   translation of module names into domain names.

   Strawman protocol 2: Use HTTP/HTTPS and define a query syntax such
   as "GET ResolverURL/moduleName HTTP/1.0" and use 30x headers for
   redirection to the appropriate server.

I like the latter because we can then reuse WebDAV versioning 
semantics, but think that there is little point in fleshing it out 
further unless we agree on the need.
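
Purely as an illustration of the two strawmen, and not a fleshed-out 
protocol, the translations involved might look like this.  The label 
order in the domain name, the zone suffix passed in by the caller, and 
all function names are assumptions; the request line simply follows the 
"GET ResolverURL/moduleName HTTP/1.0" syntax quoted above.  No network 
access is performed.

```haskell
import Data.Char (toLower)
import Data.List (intercalate)

-- Split a dotted module name: "Network.HTTP" becomes ["Network", "HTTP"].
splitDots :: String -> [String]
splitDots s = case break (== '.') s of
  (part, [])       -> [part]
  (part, _ : rest) -> part : splitDots rest

-- Strawman 1: one possible translation of a module name into a domain
-- name, most specific label first, under a caller-supplied zone
-- (the zone is a placeholder, not a real service).
moduleToDomain :: String -> String -> String
moduleToDomain zone name =
  intercalate "." (reverse (map (map toLower) (splitDots name)) ++ [zone])

-- Strawman 2: the request line, in the query syntax quoted in the text.
requestLine :: String -> String -> String
requestLine resolverUrl name =
  "GET " ++ resolverUrl ++ "/" ++ name ++ " HTTP/1.0"

-- A 30x status code means "ask the server named in the redirect instead".
isRedirect :: Int -> Bool
isRedirect code = code >= 300 && code < 400
```

For instance, moduleToDomain "modules.example" "Network.HTTP" gives 
"http.network.modules.example", and requestLine "ResolverURL" 
"Network.HTTP" gives "GET ResolverURL/Network.HTTP HTTP/1.0".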

>> And we need to constrain the versioning system to require that
>> every new version of a module must fulfill the contracts of the
>> prior versions so no existing dependency is broken by any changes.
>
> That's way too restrictive.  We'd never be able to remove anything.

It's not too restrictive.  It is correct.  You said that module names 
are intended to describe functionality.  If you want to change the 
contract, you need either to define a new module name or accept that 
programs that depend on a sufficiently old version of the current 
module will simply break.  What does 'deprecated' mean?
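
One crude way to picture "fulfilling the contract of a prior version" is 
sketched below.  It approximates a contract as nothing more than exported 
names with their type signatures written as strings, which is an 
assumption made for brevity: it says nothing about behaviour, and the 
Contract type and fulfils function are hypothetical.

```haskell
import qualified Data.Map as Map

-- A module's contract, crudely approximated as exported names mapped to
-- their type signatures (kept as strings for simplicity).
type Contract = Map.Map String String

-- A new version fulfils an old contract if every old export is still
-- present with the same signature.  Additional exports are allowed.
fulfils :: Contract -> Contract -> Bool
new `fulfils` old = old `Map.isSubmapOf` new

-- Hypothetical contracts for three versions of one module.
v1, v2, v3 :: Contract
v1 = Map.fromList [("get", "String -> IO String")]
v2 = Map.insert "post" "String -> String -> IO String" v1
v3 = Map.delete "get" v2
```

Here v2 `fulfils` v1 holds because v2 only adds an export, while 
v3 `fulfils` v1 does not because "get" was removed.  Under the rule 
argued for above, v3 would need a new module name.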

> When we start using shared libraries, even binaries will break if you
> upgrade shared libraries in place.  That's why we must have versioning,
> and the ability to have multiple versions of a package installed.

Again, I am not opposed to having multiple versions of a package 
available for programs to use.  My point is that the choice of version 
must occur at build/run time and not at package time.

-Alex-

______________________________________________________________
S. Alexander Jacobson tel:917-770-6565 http://alexjacobson.com