[Haskell-cafe] When to use fancy types [Re: NumberTheory library]

Thu May 12 22:39:34 EDT 2005

G'day all.

Quoting Jan-Willem Maessen <jmaessen at alum.mit.edu>:

> Why not use a function?
>
> What's wrong with a function?
>
> There no need to go leaping for a multiparameter type class with a
> functional dependency!  Just use a function.
>
> [With apologies to John Cleese]

A reasonable question, and one that deserves a better answer than
"I prefer it".

Some background:

I've been reading a lot of category theory stuff lately.

I just finished reading "Generative Programming".  (Well, reading
half of it, skimming the rest.  Anyone who has read the book knows
exactly what I mean.)

I was probably unduly influenced by both while writing the library.

> * We are introducing types whose only purpose is to stand for a single
> function.

You mean like the Eq typeclass?

(Technically, Eq has two functions in it, but one can be trivially
obtained from the other; indeed (/=) could have just as easily been
defined as an INLINEd function outside the typeclass, with practically no
difference.)

> * Declaring a property used to take one line---a function declaration.
> Now it takes a minimum of three:
>    - a bogus type declaration for Primes
>    - an instance declaration
>    - the actual declaration for "is".

On the other hand, under my scheme, passing a property to a function
requires one modification (to the type signature) which isn't even
mandatory (because type signatures usually aren't mandatory).  Under
the traditional scheme, you have to thread the property through code,
passing it as a parameter.  More on this later.

> * The use of multiparameter type classes leads to relatively more
> obscure error messages and clunky type signatures with bogus type
> parameters.

A reasonable objection.

> * It's not Haskell 98,

Another reasonable objection.

> and an acceptable (or even preferable) solution exists which is.

If your entire purpose is to expose functionality, then you're right.
Simply exposing a single function is acceptable.

However, you don't learn anything that way.  My purpose is a bit grander.
I'm taking the opportunity to try to understand what the Haskell libraries
of the future might look like, starting with a simple example.

See below for details.

> * There's no equivalent of lambda here---we can't write anonymous
> properties.  I list this last, as some may consider it a disadvantage.

Eq has the same "disadvantage".

> When using fancy type-system footwork, I suggest the following
> questions (which I ask myself, too):
>
> * Can I replace one or more parameters to my methods with a
> suitably-typed call to "undefined"?

To which the answer, in this case, is: For this example, yes.  But not
always...

> * What would happen if I used a record instead of a class declaration?
> The result might look pretty similar on the page, but it might be H98.

When producing libraries that are designed for generic programming, then
more often than not, the "record" solution is at best H98 + higher ranked
types, which isn't necessarily an improvement over H98 + MPTCs + fundeps.

> If the class has 0 or 1 method, you don't even need a record---you can
> pass things directly.

Again, this applies to Eq.

There is a reason why we have typeclasses in Haskell.  Well, several
reasons actually.  There's the historical reason (it made ad hoc
overloading less ad hoc, as one paper put it), and the collection of
mini-reasons developed in retrospect.

Taking Eq as the example:

1. "Equality testing" is a pattern that keeps turning up in real code.
The category theory philosophy (as well as the "design pattern"
philosophy) is that if a pattern keeps turning up, it's interesting in
its own right.  In particular, it makes sense to give it a name so you
can talk about it independently of any specific objects that exhibit
the pattern.

"Eq" is not merely a function of type a -> a -> Bool.  It's a concept
with semantics.  It must be an equivalence relation, and it also must
mean semantic equality.  Functions that respect semantic equality have
the property that x == y implies f x == f y.

(No, none of these restrictions are in the language report.  But I'd be
annoyed if someone defined a version of Eq that didn't have these
properties.)

2. The generic programming philosophy is that generic algorithms should
only require precisely those concepts that they need in order to do need
do their job.  So, for example, isPrefixOf :: (Eq a) -> [a] -> [a] -> Bool
doesn't need to know that a is Hashable.

This goes double for generic algorithms which need to know about more than
one concept (e.g. Eq and Show).

Now let's consider these two classes, one from the number theory library,
and one inspired by Bo Herlin's code:

    -- |Class for sequences where the position of an element can be
    -- identified.
    class RankableSequence seq a | seq -> a where
        -- |Find the rank of an element.
        rank :: seq -> a -> Maybe Integer

    -- |Class for descending numeric sequences.
    class DescendingSequence seq a | seq -> a where
        -- |Enumerate the sequence from the nth element down.
        enumDown :: seq -> Integer -> [a]

RankableSequence and DescendingSequence are related but essentially
orthogonal properties.

There are some types of sequence where it's very hard to work out where
an element falls in the sequence.  For example, finding out that 65537 is
a prime is straightforward.  Finding out that it's the 6543rd prime is
difficult by comparison, and it only gets harder the larger the primes get.
So the "rank" function only makes sense for some kinds of sequence.

Similarly, there are some types of sequence where it's harder to produce
the elements in descending order than it is in ascending order.  (Primes
are another example.)

I can envisage semi-generic functions which want these properties together.
It makes sense to locate an element in the sequence, and then descend from
it.  If I represented the properties as single functions, I'd have to pass
them both explicitly.  It gets worse if I have more than two interesting
properties, or properties that depend on each other (e.g. Ord depends on Eq,
so as soon as I discover that I need Ord, I don't have to pass Eq explicitly
any more).

Using the typeclass scheme, all you have to do is give the "sequence" a
name, and pass only that.  The type system takes care of everything else.

I think part of the problem here is that we have different goals.  You're
thinking that you might need to write some code which needs to test whether
numbers are prime or not some day, and you'd like to use a library such as
mine, but it looks like overkill.  And if that's all you want to do, you're
right.  I'm looking at a broader picture (though not the broadest possible),
at generic algorithms which can work on any type of sequence and for any
type of property.

The objections sound to me like someone saying: "What would you need a
state monad transformer for?  All you need to do is thread an extra piece
of data through your code!"

In some sense, they would be right.  That really is all you _need_ to do.
And it's often the right thing to do, if all you need to do is get a
certain job done.  But would you be without monad transformers now that
you know their advantages?

What are the advantages of doing it this way?  I don't know.  Maybe they
are significant, maybe they are nonexistent.  Either way, it's going to be
interesting to find out.

Cheers,
Andrew Bromage