[Haskell-cafe] type and data constructors in CT
wren ng thornton
wren at freegeek.org
Mon Feb 2 07:29:45 EST 2009
Gregg Reynolds wrote:
> On Sat, Jan 31, 2009 at 4:26 PM, wren ng thornton <wren at freegeek.org> wrote:
> > > But a data constructor Dcon a is an /element/ mapping taking elements
> > > (values) of one type to elements of another type. So it too can be
> > > construed as a functor, if each type itself is construed as a
> > > category.
> >
> > Actually no, it's not a functor. It's a (collection of) morphism(s). Let's
> > again assume a single-argument Dcon for simplicity. The Haskell type |Dcon
> > :: forall a. a -> Tcon a| is represented by a collection of morphisms
> > |Dcon_{X} : X -> Tcon X| for each X in Ob(Hask).
>
> Ok, I see I elided that step. So my question is about the relation
> between the individual (specialized) Dcon and the associated Tcon.
> I.e. Dcon 3 is a value of type Tcon Int, inferred by the type system.
> So it looks to me like the relation between the Tcon functor and the
> Dcon functions is basically ad hoc.
Pretty much. Though it's not so much ad hoc as algebraic. Data
constructors, by definition, don't perform any sort of introspection
on the values they're given. Looking at it sideways, they're
'polymorphic' in the values of the type, and they preserve the
structure of the value (namely, they just store it). This makes data
constructors better behaved than arbitrary functions. The possible
definitions of Haskell types are formed by the algebra consisting of
that id-like storing function, products of algebras, coproducts of
algebras, and equi-recursion (though removing this and using
iso-recursion makes the CT treatment easier). And this
well-behavedness holds before we even look at polymorphic data
constructors.
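To make the algebra concrete (an illustrative decomposition of my
own, not from the thread):

    -- Lists exhibit all three pieces of the algebra: coproduct
    -- (two constructors), product (Cons has two fields), and
    -- recursion (List a occurs in its own definition).
    data List a = Nil | Cons a (List a)

    -- That is, up to iso-recursion:
    --     List a  ~  Either () (a, List a)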
To take a different tack, consider the free monoid. Like all free
things it is the monoid which consists of no more than what is necessary
to satisfy the monoid laws (that is, it doesn't cost or require anything
special; hence "free"). If you're familiar with Prolog, this is like the
idea of uninterpreted predicates. The value of foo(bar) is: foo(bar).
Since foo is uninterpreted, applying foo to bar means returning a value
that reifies the application of foo to bar. Uninterpreted predicates are
"free application"[1]. Data constructors in Haskell are just like this.
Applying (:) to x and xs results in the value (x:xs).
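In Haskell notation (a toy illustration; Foo and Bar are my names,
not anything from the thread):

    -- An uninterpreted symbol, Prolog-style: applying Foo
    -- computes nothing, it merely reifies the application.
    data Term = Bar | Foo Term

    value :: Term
    value = Foo Bar   -- the result *is* "Foo applied to Bar"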
It is precisely because Dcons cannot inspect the values they are given
that it is so easy for Tcons to be functors. Since the only thing
Dcons can do to values is id them up, this is definitionally structure
preserving. Note that it's possible to have functors which do inspect
the values but still preserve structure by transforming the values in
some consistent manner.
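For instance, a hand-written Functor instance (standard stuff,
spelled out here only for illustration):

    data Pair a = Pair a a

    -- fmap can only apply f under the constructor; the Pair
    -- shape itself is untouched, which is exactly the structure
    -- preservation the functor laws demand.
    instance Functor Pair where
        fmap f (Pair x y) = Pair (f x) (f y)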
Because Dcons cannot do anything, people often define "smart"
constructors for dealing with abstract types, or for maintaining
invariants in types which have more structure than can be encoded in
ADTs (or GADTs). For example, the bytestring-trie package defines a
Trie type for Patricia trees mapping ByteStrings to some value type.
The basic type for Trie a has Empty, Arc :: ByteString -> Maybe a ->
Trie a -> Trie a, and Branch :: Trie a -> Trie a -> Trie a. However,
the actual abstract type has a number of invariants which restrict
the domain of the type to be much smaller than what you would get
from that specification alone (e.g. an Arc with no value can't have
another Arc directly beneath it; the two arcs should be collapsed
into one).
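A sketch of what such a smart constructor looks like (the real
package's internals differ; this is only to illustrate the idea):

    import Data.ByteString (ByteString)
    import qualified Data.ByteString as BS

    data Trie a = Empty
                | Arc ByteString (Maybe a) (Trie a)
                | Branch (Trie a) (Trie a)

    -- Maintain the invariant that a valueless Arc never sits
    -- directly above another Arc: collapse the two prefixes.
    arc :: ByteString -> Maybe a -> Trie a -> Trie a
    arc k Nothing (Arc k' mv t) = Arc (BS.append k k') mv t
    arc k mv      t             = Arc k mv t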
Smart constructors are an important technique, and they can capture
many interesting types which ADTs cannot; however, when using smart
constructors you've stepped outside of the warm confines of what the
type system enforces for you. The Haskell type Trie a is not the same
thing as the abstract type it's used to represent. By stepping outside
of the type system, it's up to the programmer to prove that their
abstract type does adhere to the laws for functors, monoids, or
whatever. For ADTs, those proofs are already done for you (for
functors at least) due to their regular nature. And I do mean
"regular": ADTs are coproducts, products, and recursion, just like
the choice, concatenation, and Kleene star of regular expressions.
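Spelling that correspondence out (informally, and in my own
notation):

    -- choice        ~  coproduct:  Either a b
    -- concatenation ~  product:    (a, b)
    -- Kleene star   ~  recursion over both:
    data Star a = Done | More a (Star a)   -- isomorphic to [a]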
[1] Easing off from the free monoid, the free associative operator is
free (binary) application extended by an equivalence relation for
associativity. Thus, values generated by the free assoc-op represent
equivalence classes over the application trees which could generate
them. The free monoid is the same thing, further extended by an
equivalence relation for the identity element (which is often
"written" by writing nothing at all; hence it is also an equivalence
relation over all possible insertions of the identity element).
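To make the footnote concrete (a sketch; the names are mine):

    -- Free binary application: bare trees, no laws at all.
    data Magma a = Leaf a | App (Magma a) (Magma a)

    -- Quotient App by associativity and the trees flatten out:
    -- the free semigroup is non-empty lists. Add an identity
    -- element as well and you get the free monoid, i.e.
    -- ordinary lists with (++) and [].
    type FreeMonoid a = [a]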
> > It's important to remember that Tcon is the object half of an *endo*functor
> > |Tcon : Hask -> Hask| and not just any functor. We can view the object half
> > of an endofunctor as a collection of morphisms on a category; not
> > necessarily real morphisms that exist within that category, but more like an
> > overlay on a graph. In some cases, this overlay forms a subcategory (that
> > is, they are all indeed morphisms in the original category). And this is
> > what we have with data constructors: they are (pieces of) the image of the
> > endofunctor from within the category itself.
>
> (unknotting arms...) Uh, er, hmm. I'm still having abstraction
> vertigo, since we have (I think) data constructors qua generic
> thingees that work at the level of the category, and the same,
> specialized to a type, qua functions that operate on the internals of
> the categorical objects. It's the moving back and forth from type and
> value that escapes me, and I'm not sure I'm even describing the issue
properly. I shall go away and think about this and then write the
> answer down.
It won't help the vertigo any, but we can tune the level of abstraction
in a few more ways. With some hand waving:
* A type constructor is a bundle of polymorphic functions.
* A polymorphic function is a bundle of monomorphic functions.
* A monomorphic function is a bundle of unit functions (that is,
functions from one specific value of the domain's "value bundle" (aka
"type") to one specific value of the codomain's "value bundle").
In some ways each of these levels is similar to the others in terms
of how they bundle things together so as to be "structure preserving"
in the necessary ways. But in some ways, each of the levels differs
in what it means to be "bundled". As you say, category theory does
its best to ignore what goes on inside of objects; instead it's all
about what goes on between objects (and anything inside of objects is
merely "structure" to be preserved). Set theory and domain theory
tend to focus more on what goes on inside.
Perhaps the simplest way to think of the relation is that the Tcon
perspective yields a functor, whereas the Dcon perspective yields a
natural transformation (eta : Id -> F, for some endofunctor F). I
started writing up a post about this earlier, in the context of
polymorphic functions being natural transformations, but general
polymorphic functions have some extra fiddly details that obscure the
message (hence my not sending the mail). For type constructors,
though, it's straightforward.
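Concretely, here is the naturality square for a data constructor,
written as a Haskell law (Wrap is an illustrative type of my own):

    newtype Wrap a = Wrap a

    instance Functor Wrap where
        fmap f (Wrap x) = Wrap (f x)

    -- Wrap's data constructor gives the components of a natural
    -- transformation eta : Id -> Wrap.  For any f :: a -> b,
    --     fmap f . Wrap  =  Wrap . f
    -- Both sides send x to Wrap (f x).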
I'm guessing that your intuition that there's stuff happening on
multiple levels comes from the natural transformation. The first few
times you run into them, they tend to have that disorienting effect.
Part of this disorientation, I think, stems from the fact that
natural transformations are structure-preserving transformations from
structure-preserving transformations to structure-preserving
transformations (say what?). This shouldn't be that mind-boggling,
since we frequently talk about functions from functions to functions
(e.g. fmap_{F} :: (a -> b) -> (F a -> F b)), but the level of
abstraction can make it difficult to pin down.
Personally, I think the confusion is compounded by the way most CT
tutorials introduce natural transforms. Usually they start with
something like "this is a category", then move onto "this is a functor",
then they say that functors are also morphisms in the category, Cat, of
small categories; which is all good. (And already we have the
morphism/functor and object/category vertigo.) But then, when explaining
natural transformations they say something about them being "like
functors, but more so", and then everything comes crashing down,
because that
often gets people visualizing functors on categories of categories, but
those are just functors too. If we're looking at Cat, and we pick two
objects X and Y (aka categories C and D), and then pick two morphisms f
and g (aka functors F and G), a natural transformation is a structure
preserving transformation from f into g. A key difference between this
and a functor on Cat is that X and Y are fixed; we do not say "for all
X, Y, and f : X->Y" like we would for a functor.
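A stock example of a natural transformation between two such fixed
functors (the function is from Data.Maybe; the law as written is
mine):

    import Data.Maybe (maybeToList)

    -- maybeToList :: Maybe a -> [a] transforms the *fixed*
    -- functor Maybe into the fixed functor []; we quantify only
    -- over the type its components act at, never over the
    -- functors themselves.
    --
    -- Naturality: for any f :: a -> b,
    --     map f . maybeToList  =  maybeToList . fmap f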
YMMV, and all that.
--
Live well,
~wren