Constructors in GHC

Simon Peyton-Jones simonpj@microsoft.com
Wed, 11 Dec 2002 11:40:39 -0000


This message is to air a possible change in the way GHC handles
constructors.  Before I make the change I want to check that it isn't
going to mess anyone up.  There's some background in
	http://www.cse.unsw.edu.au/~chak/haskell/ghc/comm/the-beast/data-types.html

Consider the following
	data T = MkT !(Int,Int)
	f x y = MkT (x,y)
	g (MkT (x,y)) = x

If you compile this with -funbox-strict-fields, GHC will unbox the
strict pair, to give effectively this
	data T = MkT Int Int
	f x y = MkT x y
	g t = case t of MkT x y -> x
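
For anyone who wants to try this, here is a minimal, self-contained version of the source program. It is only a sketch: the type signatures, the OPTIONS_GHC pragma (standing in for passing -funbox-strict-fields on the command line) and the `main` driver are additions for illustration, not part of the example above.

```haskell
{-# OPTIONS_GHC -funbox-strict-fields #-}
module Main where

-- A strict field holding a pair. With -funbox-strict-fields GHC may
-- store the two components directly in the constructor.
data T = MkT !(Int, Int)

-- Source-level construction still takes a pair.
f :: Int -> Int -> T
f x y = MkT (x, y)

-- Source-level matching still sees a pair.
g :: T -> Int
g (MkT (x, _)) = x

-- Illustrative driver (an addition, not in the original example).
main :: IO ()
main = print (g (f 3 4))
```

Compiling with -ddump-simpl should show the Core that GHC produces, though the exact output depends on the GHC version.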

Rather than find all the applications of MkT, it actually *defines* MkT
like this

	MkT p = case p of (x,y) -> $wMkT x y

where $wMkT is the "real" constructor.  So the "real" data type is

	data T = $wMkT Int Int

and MkT is just a "wrapper function".  However in Core-language case
expressions we still print 'MkT':
	g t = case t of
	  MkT a b -> ...

even though the "real" constructor is $wMkT.  (On the face of it, MkT a
b isn't even well typed.)
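
The wrapper arrangement can be modelled in ordinary Haskell, as a rough sketch. Since `$` is not legal in a source-level constructor name, the hypothetical name `WMkT` stands in for $wMkT, and the hypothetical function `mkT` stands in for the wrapper that GHC defines under the name MkT.

```haskell
module Main where

-- Hypothetical stand-in for the "real" constructor $wMkT.
data T = WMkT Int Int

-- Hypothetical stand-in for the wrapper function MkT: it takes the
-- pair apart and applies the real two-field constructor.
mkT :: (Int, Int) -> T
mkT p = case p of (x, y) -> WMkT x y

-- Construction goes through the wrapper.
f :: Int -> Int -> T
f x y = mkT (x, y)

-- Matching works directly on the two-field representation.
g :: T -> Int
g t = case t of WMkT x _ -> x

-- Illustrative driver.
main :: IO ()
main = print (g (f 3 4))
```

In this model the pattern in `g` has to name `WMkT`, which is why a case alternative printed as `MkT a b` does not look well typed against MkT's source type.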

This is a bit of a mess, especially when we print External Core.  Then
we get
	data T = MkT Int Int
	MkT p = case p of (x,y) -> $wMkT x y
	f x y = $wMkT x y
	g t = case t of MkT x y -> x

Strange!  MkT looks like the constructor for T, but is also given a
definition; and $wMkT doesn't seem to be defined at all.  This gives
rise to difficulties when reading External Core back in.


We could make this more consistent in two ways.  Alternative (A): One
way would be to make it clearer that $wMkT was the real constructor:

	data T = $wMkT Int Int
	MkT p = case p of (x,y) -> $wMkT x y
	f x y = $wMkT x y
	g t = case t of $wMkT x y -> x

This is consistent, but it makes External Core a bit funny.  (The real
constructors are always $w things.)

Alternative (B): The other alternative would be to make the original
Haskell constructors into the $w things:

	data T = MkT Int Int
	$wMkT p = case p of (x,y) -> MkT x y
	f x y = MkT x y
	g t = case t of MkT x y -> x

This makes Core (and External Core) nice and consistent, with
traditional upper-case constructors, but MkT now has a different type
than in the original program.  We'd need to take care, when printing type
errors etc., that we didn't print $wMkT when the programmer expected MkT.
(That isn't too hard.)
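
To illustrate the kind of care meant here, a purely hypothetical sketch follows. It treats names as plain strings and strips a leading "$w" before display. The helper `displayName` is invented for this example and is not how GHC represents or prints names.

```haskell
module Main where

import Data.List (stripPrefix)
import Data.Maybe (fromMaybe)

-- Hypothetical helper: given an internal name, return the name the
-- programmer wrote, by dropping a leading "$w" if there is one.
displayName :: String -> String
displayName name = fromMaybe name (stripPrefix "$w" name)

-- Illustrative driver: both lines print the source-level name.
main :: IO ()
main = mapM_ (putStrLn . displayName) ["$wMkT", "MkT"]
```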


Personally, I'm inclined to alternative (B).  Do any of you have an
opinion?  Especially folk using External Core?

Simon