[Haskell-cafe] The container problem

Fri Sep 26 17:12:15 EDT 2008

On Fri, 2008-09-26 at 19:15 +0100, Andrew Coppin wrote:
> Take a look around you. Haskell provides several sorts of container. We 
> have:
> 
>   Data.List
>   Data.Set
>   Data.Map
>   Data.Hashtable
>   Data.ByteString
>   Data.Sequence
>   Data.Array
>   Data.Tree
>   Data.IntSet
>   Data.IntMap
>   ...
> 
> In other words, we have *a lot* of different data containers. And yet, 
> each one provides its own unique API.
> 
> To anybody used to OO languages, that sounds pretty crazy. In something 
> like Java or Smalltalk or Eiffel, you have an abstract class that 
> represents "container", and maybe a seperate one for "ordered 
> container", and then concrete subclasses for each kind of container. 
> Each one may add a few unique methods of its own, but basically all the 
> containers have a uniform API, and you can write functions that work for 
> any arbitrary [ordered] container.
> 
> In Haskell, you can't do this. Why is that?

Obviously you certainly can.  That there isn't a "standard" form is in
my opinion not really historical or limitations of H98, though there are
certainly some aspect of those.  You can do a not horrible job in just
H98.  You can do a much better job using common extensions (and yes, in
particular MPTCs and fundeps.)  As Albert Lai alluded to, you can use
Edison if you want this, right now, today.  I think the problem is again
that the Perfect Interface hasn't been found and for several reasons
which I'll enumerate, there is not a pressing desire for a reasonable
compromise.

One aspect of it is a bit of a You Aren't Going To Need It.
Particularly for applications, there is usually very little gain in
practice and for Haskell many of the container libraries have identical
interface subsets so that you do end up being able to change
implementation by changing a single import.  This is further reinforced
by there being a single obvious choice for common data structures.
Admittedly, it would still be nice to be more explicit about this and to
program to interfaces, especially for library code.

Another reason is that often you can use an intermediate data structure,
especially lists, for operations.  Lists are the
iterators/IEnumerable/generators of Haskell.  So if I just need to
traverse a data structure, I can just write an interface that accepts a
list.

Of course, since Haskell isn't Java, I'm not subject to the choice of
interfaces the data structure implementer decided to implement or not.
When this happens in Java the "standard" solution is to use an adapter
class.  In Haskell, I can just write the instance.  In particular, I can
decide on whatever interface I need, write my code to that interface and
just instantiate it for the relevant data types and users can
instantiate it for their data types.  If you want an OrderableCollection
class, you can simply write one today.  You don't need to ask anyone's
permission or coordinate with anyone.

There are some general reasons too.  Typically asymptotic complexity
guarantees are considered part of the interface for a data structure.
If you do this you either end up with loose constraints that aren't very
useful or tight ones that provide nice guarantees but exclude all but a
few data structures.  This leads back to the first reason, YAGNI.  You
often have particular properties that you want and thus end up with
particular data structures.  Admittedly, you can still require tight
complexity constraints and if someone wants to violate them, the
performance problems are their fault but maybe convenience outweighs
performance in that case.  Usually, though, there are convenient
conversions between the types.

Finally, there -are- several more or less standard classes that capture
different general operations on data structures (though there certainly
could be more.) They, of course, have different names and different
interfaces and different factorings from imperative equivalents.  We
have Functor, Applicative, Monad, MonadPlus, Monoid, Foldable,
Traversable, IArray, MArray and others.  Notice how the ridiculous
proliferation of array types in Haskell has pressed the issue and led to
the creation of IArray and MArray.

Ultimately, it would still be beneficial to have some more standard
interfaces for this sort of thing.  There just hasn't been enough of a
fire under the community's rear.  This again suggests that YAGNI.