[Haskell-cafe] Correspondence between libraries and modules
wren ng thornton
wren at freegeek.org
Mon Apr 23 06:03:51 CEST 2012
On 4/22/12 6:30 PM, Alvaro Gutierrez wrote:
> On Sun, Apr 22, 2012 at 4:45 PM, Brandon Allbery <allbery.b at gmail.com> wrote:
>> One reason: modules serve multiple purposes; one of these is namespacing,
>> and in the case of interfaces to foreign libraries that may force a
>> division that would otherwise not exist.
>
> Interesting. Could you elaborate on what the other purposes are, and
> perhaps point to an instance of the foreign library case?
The main purpose of namespacing (IMO) is to separate concerns and make
it easier to figure out how a project fits together. The primary goal of
modules is to resolve namespacing issues.
Consider one of my own libraries (chosen randomly via Safari's url
autocompletion):
http://hackage.haskell.org/package/bytestring-lexing
When I inherited this package there were the Data.ByteString.Lex.Double
and Data.ByteString.Lex.Lazy.Double modules, which were separated
because they provide the same API but for strict vs lazy ByteStrings.
Both of those modules are concerned with lexing floating point numbers.
I inherited the package because I wanted to publicize some code I had
for lexing integers in various formats. Since that's quite a different
task than lexing floating point numbers, I put it in its own module:
Data.ByteString.Lex.Integral.
When dealing with FFI code, because of the impedance mismatch between
Haskell and imperative languages like C, it's clear that there's going
to be some massaging of the API beyond simply declaring FFI calls. As
such, clearly we'd like to have separate modules for doing the low-level
binding vs presenting a high-level API. Moreover, depending on what
you're interfacing with, you may be forced to have multiple low-level
modules. For example, if you use Google protocol buffers via the hprotoc
package, then it will generate a separate module for each buffer type.
That's fine, but usually it's not something you want to foist on your users.
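
A minimal sketch of that low-level/high-level split, kept in one file so
it runs on its own. In a real package the foreign import would sit in a
low-level module and the wrapper in a separate high-level one. The names
c_strlen and strLength are made up for illustration, and C's strlen is
only a stand-in for a real foreign library.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main where

import Foreign.C.String (CString, withCString)
import Foreign.C.Types (CSize(..))

-- Low-level layer: a direct binding that exposes C types and IO.
-- (Hypothetical name; in a package this would live in its own module.)
foreign import ccall unsafe "string.h strlen"
  c_strlen :: CString -> IO CSize

-- High-level layer: marshals a Haskell String to a C string and
-- converts the result, so clients never see CString or CSize.
strLength :: String -> IO Int
strLength s = withCString s (fmap fromIntegral . c_strlen)

main :: IO ()
main = strLength "hello" >>= print
```

Clients would import only the module exporting strLength; the raw
binding stays an implementation detail.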
On the other hand, the main purpose of packages or libraries is as a unit
of distribution, code reuse, and separate compilation. Even with the
Haskell culture of making small libraries, most worthwhile units of
distribution/reuse/compilation tend to be larger than a single
namespace/concern. Thus, it makes sense to have more than one module per
package, because otherwise we'd need some higher level mechanism in
order to manage the collections of package-modules which should be
considered a single unit (i.e., clients will almost always want the
whole bunch of them).
However, centralization is prone to bottlenecks and systemic failure. As
such, while it would be nice to ensure that a given module is provided
by only one package, there is no mechanism in place to enforce this
(except at compile time for the code that links the conflicting modules
together). With few exceptions, it's considered bad form to knowingly
use the same module name as is being used by another package. In part,
it's bad form because egos are involved; but it's also bad form because
there's poor technical support for resolving namespace collisions for
module names. In GHC you can use -XPackageImports, which is workable but
conflates issues of code with issues of provenance, which the Haskell
Report intentionally keeps separate. However, until better technical
support is implemented (not just for GHC, but also jhc, UHC, ...) it's
best to follow social practice.
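
For illustration, a minimal sketch of what -XPackageImports looks like.
It assumes only the base package, so nothing actually collides here; the
commented-out lines show the shape a real disambiguation would take,
assuming both of those packages were installed.

```haskell
{-# LANGUAGE PackageImports #-}
module Main where

-- The string literal names the package that must supply the module.
import "base" Data.List (sort)
import "base" Data.Char (toUpper)

-- With two installed packages exporting the same module name, the
-- import would pick one explicitly (not compiled in this sketch):
--
--   import "mtl"       Control.Monad.State
--   import "monads-tf" Control.Monad.State

main :: IO ()
main = do
  print (sort [3, 1, 2 :: Int])
  putStrLn (map toUpper "provenance in the source")
```

Note how the package name ends up inside the source file itself, which
is the mixing of code and provenance mentioned above.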
> I'm confused as to how type families vs. fundeps play a role here -- as far
> as I can tell both are compiler extensions that do not provide modules.
Both TFs (or rather associated types) and fundeps aim to solve the same
problem. Namely: when using multi-parameter type classes, it is often
desirable to declare that one parameter is wholly defined by other
parameters, either for semantic reasons or (more often) to help type
inference. Since they both aim to solve the same problem, this raises a
new problem: for some given type class, do I implement it with TF/ATs or
with fundeps?
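
To make the choice concrete, here is a minimal sketch of one small class
written both ways. The class and method names (CollectionFD,
CollectionTF, Elem, and so on) are invented for illustration and do not
come from any particular package.

```haskell
{-# LANGUAGE MultiParamTypeClasses  #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE FlexibleInstances      #-}
{-# LANGUAGE TypeFamilies           #-}
module Main where

-- Fundep style: two class parameters, with the dependency c -> e
-- declaring that the collection type determines the element type.
class CollectionFD c e | c -> e where
  insertFD :: e -> c -> c
  toListFD :: c -> [e]

instance CollectionFD [a] a where
  insertFD = (:)
  toListFD = id

-- Associated-type style: one class parameter, with the element type
-- computed from it by the type family Elem.
class CollectionTF c where
  type Elem c
  insertTF :: Elem c -> c -> c
  toListTF :: c -> [Elem c]

instance CollectionTF [a] where
  type Elem [a] = a
  insertTF = (:)
  toListTF = id

main :: IO ()
main = do
  -- In both cases the element type is inferred from the collection type.
  print (toListFD (insertFD 'a' "bc"))
  print (toListTF (insertTF 'a' "bc"))
```

Either way the element type is fixed by the collection type, so
inference works without annotations at the use sites.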
Some people figured to solve the new issue by implementing it both ways
in separate packages, but reusing the same module names. (Witness for
example mtl-2 aka monads-fd, vs monads-tf.) In practice, that didn't
work out so well. Part of the reason for failure is that although
fundeps and TF/ATs are formally equivalent in theory, in practice the
implementation of TF/ATs has (had?) been missing some necessary
machinery, and consequently the TF/AT versions were not as powerful
as the original fundep versions. Though the butterfly dependency issues
certainly didn't help.
> I'm interested to see examples where two or more well-known yet unrelated
> modules clash under the same name; I can't imagine them coexisting in
> public very long -- wouldn't the confusion among users (e.g. when looking
> for documentation) be enough to either reconcile the modules or change one
> of the names?
That's not much of a problem in practice. There are lots of different
books with a Chapter 1, but rarely is there any confusion about which
one is meant. The same is true of module names in packages.
--
Live well,
~wren