Abstract Collections

Fri Mar 26 10:31:57 EST 2004

On Tue, 23 Mar 2004, Daan Leijen wrote:

> It would help me if you could use Haddock to show the interface in html
> format (documentation is not necessary for now, just the type signatures
> would be great).

I can only do that later, so if noone wants to jump in, you have to wait
some weaks.  (Alternatively, use 'grep ::' on the code.)

> Could you maybe also tell something about the relation with Edison? This
> framework seemed to have the same goals as Dessy?  Why is it not suitable
> for your purposes?

Since this comparison is rather spread out in my design documents, here is
a summary:

First of all, my Collections are a more model-centered while Edison is
more algorithm-centered.  Okasaki's (good!) choice to have all concrete
structures implement all operations of a class, does already make this
difference small, but still Edison centers around algorithms (and data
structures for use in algorithms), while I center on specifications and
data structures for use in day-to-day non-algorithmic programming.  A
discussion of this is found in the material for (imperative) Dessy:
http://www.stud.tu-ilmenau.de/~robertw/dessy

Next, I'm using a brand-new algebraic approach with inheritance as
refinement to specify the collections and to build up the hierarchy.
Thereby many things become simpler (and more different from the
List-centered traditional functions).  This is explained at length in
http://www.stud.tu-ilmenau.de/~robertw/dessy/fun/principles.htm

As a consequence of that approach, functions that work on all of the
Collections get a clear (and useful) semantics.  Programmer's won't need
to learn Sequence.fold, Map.fold, Set.fold, and so on -- they just learn
to fold (along the right lines).  This is what makes the approach
challenge the traditional "list" data type and with it large parts of
current programming (formal derivation) methodology.

Lastly, the differences in more concrete design decisions, most notably:
 - I have no unordered structures.
 - I always view (==) as equality.
http://www.stud.tu-ilmenau.de/~robertw/dessy/fun/principles.htm#not_here

> Please be aware that the goal of DData is just to provide a range of
> generally useful *concrete* data structures.  Currently, JP Bernardy is
> doing an excellent job of publicly discussing the design and naming
> schemes to make DData part of the hierarchical libraries.

Yes, of course, DData is a good (even indespensable) thing.  However, the
long-term goal of the abstract collections is to make all such autonomous
libraries obsolete, so there's a natural conflict.  We have to manage a
smooth transition.

I think some of JP's changes to DData are already very good steps in that
transition, notably the view of (==) as equality and the consequent
removal of bias.  This is another confirmation of my prejudice that people
in the FP community are much more ready to accept change for The Right
Thing than "imperative folks".  That's one of the reasons that Functional
Dessy got out so fast, while the imperative version is older and has still
not seen the light of the day.  (Another reason is of course, that FP is
simpler, we have only algebraic Collections, no mutable ones.  Imperative
Dessy will be the first library that has both...)

> Are there any important design issues/restrictions that have to be
> fulfilled to make DData implement some of the collections?

Simple answer (from my advertising dept.): "To add a new data structure
one just needs to implement two (in numbers: <b>2</b>) functions and the
structure will inherit (about) 30 functions and hundreds of other
polymorhic functions can be used with it."

But of course you don't just want to implement 'split' and 'unsplit' since
then most of DData's (optimised) code will not be used.  So there are some
real compromises to make (hard thinking required).  And since Dessy
already provides all of DData's algorithms (re-implemented, sharing much
code with other algorithms) with a Collections-interface, I can't see a
benefit of such a port at time being.

I think at the moment the focus should be on collecting more experience
with the Collections, setting up Design by Contract and a (performance and
correctness) test framework.  After that we can think about
implementations.  However, we can already see that the "markets" of DData
and Dessy are clearly distinct: one optimised for performance (at the
expense of code duplication) and the other as a tool-box for algorithms
(with code that longs to be ready by others, e.g. students).  The
synergetic effect will not only be provided by a common interface, but
also by exchange of algorithmic ideas and "practical issues of
optimisation".  Brave new world.  :-)

Robert