[Haskell-cafe] Re: Rewriting a famous library and using the same name: pros and cons

Wed Jun 9 22:38:22 EDT 2010

There's a big range of issues here, and to be honest I'm not sure if our 
ability to distinguish between them is helped by the title of this 
thread, which somewhat begs the question. That is to say, it isn't clear 
to me that calling the proposed changes to fgl "rewriting a library" 
is necessarily accurate -- it seems more the case that these are 
incremental improvements of a library that require breaking API changes.

So on the concrete issue at hand, I'd be for the new fgl version being 
developed under some new provisional name, and taking pains to provide a 
compatibility layer where possible. Then, after we see what the changes 
really are, coming to some informed decision on whether to rebrand it as 
the real fgl version 6. If so, the old stable fgl can be put up on 
hackage as fgl98, which lets packages which want to stick with it do so 
while avoiding any possibility of the dread diamond dependency.

More broadly, we have to accept that breaking API changes are an 
irritating but necessary fact of life. As much as the parsec and 
quickcheck issues have caused some modest pain, there's been equal 
hassle from things like the strictness behavior of binary, or even the 
type change in tagsoup. Splitting out Category from Arrow caused me 
probably the most hassle. In retrospect it was the right thing to do. 
But how it was done was particularly abrupt and painful. Exceptions got 
it right in pretty much every respect, but still migration necessarily 
took some work. We want our packages to grow, including our core 
packages. Otherwise we get fragmentation and duplicated effort. When we 
want to grow, but don't know exactly how, then we get experimentation. 
But experimentation without some organization can lead to the wrong sort 
of fragmentation -- like the mtl mess, whose resolution now thankfully 
seems to be in hand.

Some lessons I think we can learn from the past about changes to 
widely-used stable APIs:
* Clear and documented upgrade paths.
* Preferably a compat layer (Exceptions and Parsec both did a killer job 
with this).
* No, or demonstrably minimal, performance regressions.
* Strong release notes and other documentation, either duplicating or 
supplementing what existed prior.
* For particularly long-lived stable APIs, forking off a 
maintenance-mode-only version may make good sense, especially when the 
subset of language extensions used differs significantly.
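
To make the compat layer point concrete, here is a minimal sketch of the 
general shape. Every name in it (Edge, Graph, addEdge, insertTriple) is 
hypothetical and invented for illustration; none of it is taken from fgl, 
Parsec, or the exceptions work. The idea is only that the old entry point 
stays available as a thin wrapper over the new one, and GHC's DEPRECATED 
pragma tells importers where to go next.

```haskell
-- Sketch only: all names here are hypothetical, not any real library's API.

-- "New" API: a labelled edge is a record rather than a bare triple.
data Edge b = Edge { source :: Int, target :: Int, label :: b }
  deriving (Eq, Show)

newtype Graph b = Graph [Edge b]
  deriving (Eq, Show)

emptyGraph :: Graph b
emptyGraph = Graph []

addEdge :: Edge b -> Graph b -> Graph b
addEdge e (Graph es) = Graph (e : es)

-- Compatibility shim: the "old" triple-based entry point, kept as a thin
-- wrapper so existing callers still compile. Modules that import it get a
-- compile-time warning that names the replacement.
{-# DEPRECATED insertTriple "Use addEdge with an Edge record instead" #-}
insertTriple :: (Int, Int, b) -> Graph b -> Graph b
insertTriple (s, t, l) = addEdge (Edge s t l)
```

In a real package the shim would more likely live in its own module, so 
that the new modules stay clean and the old import lines keep working.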

Some lessons to us API consumers who write somewhat-less-core packages:
* Specify upper version bounds on your dependencies.
* If at all possible, don't move to the fancy new thing until the fancy 
new thing is fully baked, and on track to widespread adoption. (Early 
adopters of new mtl implementations, I'm looking at you :-))
* If at all possible, try to stay compatible with at least the prior GHC 
version as well as the current.
* Don't pull in big packages for small reasons unless really necessary 
-- minor duplication of trivial code is often the lesser evil.
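
As a rough illustration of the upper bounds point, the sketch below 
computes the bound one would write for a dependency, assuming the package 
versioning policy convention that the first two components of a version 
form the major version. The helper names (nextMajor, boundsFor) and the 
package name in the usage note are made up for the example.

```haskell
import Data.Version (Version, makeVersion, showVersion, versionBranch)

-- Next major version, assuming the first two components (A.B) are the
-- major version, so A.B.C is followed by A.(B+1).
nextMajor :: Version -> Version
nextMajor v = case versionBranch v of
  (a : b : _) -> makeVersion [a, b + 1]
  [a]         -> makeVersion [a, 1]   -- treat "A" as "A.0"
  []          -> makeVersion [0, 1]   -- degenerate empty version

-- Render a dependency constraint with both a lower and an upper bound,
-- in the syntax used by a build-depends field.
boundsFor :: String -> Version -> String
boundsFor pkg v =
  pkg ++ " >= " ++ showVersion v ++ " && < " ++ showVersion (nextMajor v)
```

For a hypothetical package somepkg that you tested against 5.4.2, 
boundsFor "somepkg" (makeVersion [5,4,2]) gives 
"somepkg >= 5.4.2 && < 5.5", so a breaking 5.5 release cannot be picked up 
silently.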

Some lessons for folks exploring new variants:
* Don't step on already-used module names.
* Make clear whether you intend a package as a demonstration/proof of 
concept or are fully committed to significant development and support.

Some technical issues that will help as time goes on (many already 
underway):
* Deprecation of packages on hackage/redirects. (Makes it easier to 
establish upgrade / migration / transition paths).
* Tree organization of packages on hackage. (Reduces the noise generated 
by lots of small packages, and so encourages splitting things out).
* Wikilike documentation features on hackage (lets users contribute and 
share upgrade paths, etc. more directly and simply -- hopefully will 
help with community documentation of packages in general).
* The "local usage" annotation for cabal files to help avoid the dread 
diamond dependency.
* The package version policy checker.
* A DSL to describe transforms of Haskell programs for at least simple 
API migrations. Yes, this is a bit more "out there" but it's a great 
space to explore. The upside is not only better tools to help authors 
migrate their code, but a strong representation of what exactly the API 
changes are. So even if the spec language describes things that can't be 
applied automatically, it can still formalize what authors need to do. A 
standard format for an API change log as a hackage plugin would be a 
good start to this.
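
To give a feel for what the last item might look like, here is a very 
small sketch of an API change log as a data type. It is not a proposal for 
the actual format; the Change type and the helper names are invented here, 
and it only handles renames mechanically. The point is that the same 
description drives both the automatic rewrite and the notes for changes 
that have to be made by hand.

```haskell
-- Sketch only: a hypothetical vocabulary for a machine-readable change log.
data Change
  = Renamed String String             -- old name, new name
  | Removed String String             -- name, free-text hint for a replacement
  | TypeChanged String String String  -- name, old type, new type
  deriving (Eq, Show)

-- In this sketch only renames can be applied without human judgment.
isMechanical :: Change -> Bool
isMechanical (Renamed _ _) = True
isMechanical _             = False

-- Rewrite a single identifier according to the log (first matching rename
-- wins; identifiers with no rename are returned unchanged).
migrateName :: [Change] -> String -> String
migrateName changes name =
  case [new | Renamed old new <- changes, old == name] of
    (new : _) -> new
    []        -> name

-- Describe one change in words.
describe :: Change -> String
describe (Renamed old new)          = old ++ " was renamed to " ++ new
describe (Removed name hint)        = name ++ " was removed; " ++ hint
describe (TypeChanged name old new) =
  name ++ " changed type from " ++ old ++ " to " ++ new

-- Notes for the changes a tool cannot apply on its own.
manualNotes :: [Change] -> [String]
manualNotes changes = [describe c | c <- changes, not (isMechanical c)]
```

A real tool would work on parsed source rather than bare identifier 
strings, but even this much formalizes what an author needs to do when 
upgrading.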

The above lists are pretty incomplete, but hopefully they're useful. 
Thanks to Don for kicking this discussion off.

Cheers,
Sterl.