[Haskell-cafe] The container problem

Fri Sep 26 14:15:05 EDT 2008

Take a look around you. Haskell provides several sorts of container. We 
have:

  Data.List
  Data.Set
  Data.Map
  Data.Hashtable
  Data.ByteString
  Data.Sequence
  Data.Array
  Data.Tree
  Data.IntSet
  Data.IntMap
  ...

In other words, we have *a lot* of different data containers. And yet, 
each one provides its own unique API.

To anybody used to OO languages, that sounds pretty crazy. In something 
like Java or Smalltalk or Eiffel, you have an abstract class that 
represents "container", and maybe a seperate one for "ordered 
container", and then concrete subclasses for each kind of container. 
Each one may add a few unique methods of its own, but basically all the 
containers have a uniform API, and you can write functions that work for 
any arbitrary [ordered] container.

In Haskell, you can't do this. Why is that?

To me, it seems that there are two sticking points:

1. Historical reasons.

2. The Haskell '98 type system.

(1) is obviously solvable. (2) is harder.

Some containers can contain *any* type of data. Haskell permits 
parametric polymorphism, so this is no problem:

  Data.List.map :: (a -> b) -> [a] -> [b]

Other containers only support *one* type of data:

  Data.ByteString.Char8.map :: (Char -> Char) -> ByteString -> ByteString

The type has a different kind, and the function parameter's type is more 
constrained. Yet still this poses no problem.

However... now try writing a class that both of these functions could be 
methods of. Good luck with that, by the way...

This is AFAIK also the reason why, e.g., Set is *not* an instance of 
Monad; you can't write a valid instance. The type checker won't have it.

To ears accustomed to the OO way, all this makes it sound like Haskell's 
type system sucks. (Which is rich, because in half the OO languages, you 
can't write a type-safe container that works for arbitrary element types 
in the first place! Haskell is a Big Win here.)

If I understand this correctly, to solve this problem you need either 
Functional Dependencies or Associated Types. Is that correct?

I also gather that "FDs have problems" - although I have no idea what 
those problems are. Everybody's hoping that ATs will fix this, but ATs 
are still kinda new. (Are they even fully implemented in GHC yet?)

Can anybody correct/expand on this state of affires? I just want to make 
sure I understand our position correctly...