Types vs. Classes

Fri May 19 22:49:58 EDT 2006

On May 19, 2006, at 8:31 PM, Ashley Yakeley wrote:

> Occasionally in library proposals one comes across classes of this  
> form:
>
>   class Thingy a where
>       foo :: a -> this
>       bar :: a -> that
>       spong :: a -> theotherthing

I was very entertained by the choice of "spong" as a random variable  
name, as I know a prominent Spong...

> Such classes can be replaced by data-types, which turn out to also  
> be more general:
>
>   data Thingy = MkThingy
>   {
>       foo :: this,
>       bar :: that,
>       spong :: theotherthing
>   }
>
> My question: is there a reason not to use types? I'm particularly  
> interested in two cases that I've come across, references and  
> streams. Here's the class version:
>
>   class (Monad m) => Ref m r | r -> m, m -> r where
>       newRef   :: a -> m (r a)
>       readRef  :: r a -> m a
>       writeRef :: r a -> a -> m ()
> ....
> And here's the type version:
>
>   data Ref m a = MkRef
>   {
>       readRef  :: m a,
>       writeRef :: a -> m ()
>   }
>
>   class (Monad m) => HasRefs m where
>       newRef   :: a -> m (Ref m a)
> ...
> Is there any reason not to prefer using types?

This question set of some fascinating thoughts in my mind.  So...

In favor of using data types:

*  The object in question *is* its own dictionary.  Instead of  
passing two arguments to every function---a dictionary and a Ref,  
say---you pass only one.
*  No need to have a rigid choice of a single instance for a given  
type.  In principle you could write a wrapper of type (Ref m a -> Ref  
m a) which gave you "logging refs", for example.

Countervailing factors are almost all efficiency arguments:

*  You need to write both a type and a class in some cases (eg Ref)--- 
the only non-efficiency argument.
*  Dictionaries are usually strict and unlifted (no need for laziness  
or bottoms).  They thus require rather less machinery at run time.
*  Usually the record will be a series of closures over a single  
object.  These closures need to be allocated at run time and will  
take up space.  By contrast, the functions in a class don't require  
runtime allocation, unless they are involved in complicated games  
with polymorphic recursion.  Even then, dictionaries can be shared.
*  Compilers are pretty good at second-guessing the contents of  
dictionary arguments.  If it's the dictionary for Ref IO IORef, then  
there's only one possible choice for newRef, readRef, and writeRef.   
So if we know the type of the ref, we know which function we have to  
call.  With a relatively small amount of work, we can even specialize  
functions for the actual dictionaries we pass in---so that if our  
function is only ever used on IORefs, we plug in readIORef and  
writeIORef directly.  By contrast, we need to do some *very*  
sophisticated pointer analysis to learn that our record of type Ref  
IO IORef happens to always have a readRef field of the form  
(readIORef ioRef) for some ioRef.  This sort of analysis usually  
requires having the entire program on hand, and tends to be brittle  
even then since any heap analysis is flirting with undecidability.   
Jhc is the only current compiler which does this, but it also uses a  
very different dictionary representation which might well be more  
efficient than the standard one anyhow.

This is really the flip side of the "flexibility" argument above.  If  
your representation is flexible, the implementation has to assume you  
are going to use that flexibility until it can prove otherwise.

-Jan-Willem Maessen

> -- 
> Ashley Yakeley, Seattle WA
> WWEWDD? http://www.cs.utexas.edu/users/EWD/
>
> _______________________________________________
> Libraries mailing list
> Libraries at haskell.org
> http://www.haskell.org/mailman/listinfo/libraries