[Haskell-cafe] computational overhead of the "data" declarations

Tue Oct 28 16:08:27 EDT 2008

On Tue, Oct 28, 2008 at 12:13:11PM -0700, Daryoush Mehrtash wrote:
> I Haskell School of Expression  (p172), it says:
> 
> A newtype declaration is just like a data declaration, except that it can
> > only be used to defined data types with single constructor.  The new data
> > type is different from the analogous one created by a data declaration, in
> > that there is no computational overhead in having the constructor.... You
> > can think of a newtype as defining  a "new type" with exactly the same
> > structure, behaviour, and performance as the underlying type.
> >
> 
> 
> What is (or where do you see) the computational overhead of the "data"
> declrations?

Consider the data declaration

  data Foo = Bar Int | Baz Char String

When a value of type Foo is stored in memory, there will actually be
some memory set aside for the tag (either Bar or Baz) since pattern
matching may need to be performed at runtime.  This is different from
a C union type, where if you want such a tag you must store it
yourself.

However, consider

  data Bad = Bad Int

Some memory will still be set aside to hold the 'Bad' tag, although in
this case we can see that isn't necessary: a value of type Bad can
only have one possible constructor.  The solution is to use a newtype:

  newtype Good = Good Int

Now, when a value of type Bad is stored at runtime, it will simply
store an Int, with no tag.  The benefit of declaring a newtype like
this is that the typechecker will ensure that you cannot mix up Goods
and Ints, although at runtime they will have the same representation.

So, to answer your question, the only computational overhead with a
data declaration is the extra memory and time to store and process the
constructor tags.  It usually isn't noticeable, but sometimes the
difference can be important.

-Brent