Proposal for a Standard of Abstract Collections (with Reference Implementation)

Robert Will robertw at
Fri Mar 19 08:56:36 EST 2004

hi all,

Curiously I have finished the promised draft standard for the Abstract
Collections just by a time, where many are busily working on DData.
It's there:

I'm making this discussion hot now, since my Dessy library implements
the proposed collection standard and features all of the concrete
structures that are also in DData, and some more, especially Deques
and Democratic Sequences (explanation: ).

But I also want to have your attention for another aspect of the
problem: the introduction of abstract data structures makes the
programming language much more useful and suggests some
generalisations of the manifest syntax.  Thus my proposal does not
only contain

 - a formal specification for a collection hierarchy (based on
   Abstract Algebraic Data Structures (TM)), but also

 - generalisations of all the standard functions from Prelude and
   Data.List, and

 - a formal specification of Haskell extensions, that make abstract
   structures as easy to use as lists -- but with much more

Incidentally the current discussion around DData already shows signs
of what I predict in the standard's rationale: when concrete data
structures are extended to become more general, the number of
functions increases so fast, that a general standard isn't more
complicated any more, but simpler actually.  For example, you
discussed whether Sets and Maps should be converted from and to
Sequences or the built-in "lists".  Well, the best answer is to
convert from and two an type classes of which "lists" (if one uses
them at all) are an instance.  Exactly this is done by the function
"convert" proposed in the standard:

> convert :: (Collection coll a, Collection coll' a) =>
> 	     coll a -> coll' a
> convert = fold empty single (<+>)

(The function can be specialised in the same way as 'fromIntegral' on
the numeric types, so we don't loose efficiency.)

Abstraction is more: the proposed standard helps to use a lot of
different data structure implementations together, using one interface
that has been designed with the user first in mind, not the
implementation.  But this abstraction also allows an entirely new
programming style, that is especially useful when TEACHING
programming.  Data Structures are deeply intertwined with programming
as such, that the proposed standard gives Haskell something that NO
other language does have today.  (Perhaps Eiffel comes closest, but
EiffelBase is not standardised.)  (Remind me to write "A Gentle
Introduction into Haskell with Abstract Collections".)

It has been mentioned that an advantage is of functional programming
is the ability to easily _compose_ programs.  But for this composition
data structures are neccessary, without a standard for data structures
functional languages loose one of their biggest advantages.  In
specifications Sets are the most important data structure, but in
Haskell today, the so-called "lists" are the linguaga franca and that
reminds me too much of the stacks in Forth (remember "lists" are
actually functional stacks) or dynamic arrays in imperative languages:
it's universal, but unflexible.  (I'll have to write a paper on
"Liberating programming from the McCarthy-style".)

Since "a standard of Abstract Collections" has already been mentioned
by some, I expect that there are high expectations regarding my
proposal, and perhaps its many disadvantages (all the big ones
mentioned in the documentation) will discourage the reader.  On the
other hand, there are so many advantages that one can hardly neglect
that something big is going on.  So the least thing that someone
interested in the future of (functional) programming should is to
download and try out the Collections, or to read a little bit in the
design documents.

Things to do:

 - Build examples for the use of the new Collections.
   (From now on anyone can port his code to get rid of "lists".)

 - Develop the Haskell Documentation Standard and apply it to
   Collections.  (Specifications are exported and used for Tests.)

 - Using those specifications make a complete test framework.
   (Complete meaning every function is tested with the complete
   specification, so no Bugs can persist.)

Given all that, it is trivial to add more implementations of the Abstract
Collections.  (Just added two concrete structures Yesterday, although
Dessy has already much more functionality than DData, but only half its
code size.  (Or one third if you don't count the type classes and
specifications.)  I also think that Dessy has more readable code, but I
don't know how important that is for anyone.)

Have a good time,

More information about the Libraries mailing list