[Haskell-cafe] type inference and named fields

Fri Jun 24 06:44:23 EDT 2005

Christian Maeder <maeder at tzi.de> writes:

> Even consecutive updates (that could be recognized by
> the desugarer) don't work:
> 
voidcast v@VariantWithTwo{} = v { field1 = Void} {field2 = Void }

Yes, I find it interesting that consecutive updates are not equivalent
to a combined update.  I believe this is largely because named fields
are defined as sugar - they behave in some sense like textual macros
in other languages, which can often turn out to have unexpected
properties.
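
To make the difference concrete, here is a minimal sketch.  The type
Pair and the definition of Void are assumptions made up for this
illustration; they are not taken from the thread.  A combined update
may change the type parameter because every field mentioning it is
replaced at once, whereas the first step of a consecutive update
leaves one field at the old type.

```haskell
-- Assumed definition; the thread's own Void is not shown in this message.
data Void = Void deriving (Show, Eq)

-- Hypothetical record in which both fields mention the type parameter.
data Pair a = Pair { left :: a, right :: a } deriving (Show, Eq)

-- Accepted: both fields of type `a` are replaced in a single update,
-- so the result is free to have a different type argument.
voidBoth :: Pair a -> Pair Void
voidBoth p = p { left = Void, right = Void }

-- Rejected by the type checker if uncommented: the intermediate value
-- p { left = Void } would need left :: Void and right :: a at once.
-- voidBothStepwise :: Pair a -> Pair Void
-- voidBothStepwise p = p { left = Void } { right = Void }
```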

Here is a strawman proposal, which does not fix the consecutive update
problem, but which might address the original typing problem.

The Report (section 3.15) defines the following conditions on a named
field update:

    * all labels must be taken from the same datatype
    * at least one constructor must define all of the labels mentioned
      in the update
    * no label may be mentioned more than once
    * an execution error occurs when the value being updated does
      not contain all of the specified labels.
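
The last condition can be observed directly.  The sketch below uses
the two-variant type from the example later in this message; the
helper names setField1 and updateFails are hypothetical, and the
Show/Eq instances are added only for convenience.  The update
compiles because one constructor defines field1, but it fails when
applied to the other constructor.

```haskell
import Control.Exception (SomeException, evaluate, try)

data Fields a
  = VariantWithOne { field1 :: a }
  | VariantWithTwo { field2 :: a }
  deriving (Show, Eq)

-- Statically legal: VariantWithOne defines field1.
setField1 :: a -> Fields a -> Fields a
setField1 x v = v { field1 = x }

-- Returns True when forcing the updated value raises an exception.
updateFails :: Fields Int -> IO Bool
updateFails v = do
  r <- try (evaluate (setField1 0 v))
         :: IO (Either SomeException (Fields Int))
  return (either (const True) (const False) r)
```

Under these assumptions, updateFails (VariantWithTwo 1) should report
a failure, while updateFails (VariantWithOne 1) should not.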

Now, this last condition is rather nasty - a runtime error - and I
would like to eliminate it.  I think we can do so by also eliminating
the second condition.  That is, I would like to be able to use a set of
labels that might not be contained solely within a single constructor.
The obvious semantics is that if the value being updated does not
contain one of the named labels, that part of the update is simply
discarded.  Hence,
I could write a single expression that updated all possible variants
of a type, simply by unioning all their possible labels.

e.g.

> data Fields a =
>     VariantWithOne { field1 :: a }
>   | VariantWithTwo { field2 :: a }
>
> voidcast :: Fields a -> Fields Void
> voidcast v = v { field1 = Void , field2 = Void }
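
As far as I know, current compilers reject the update above, since
no single constructor defines both labels.  The intended result can
be written by hand with one alternative per constructor, which may
help pin down the semantics.  This is only a sketch: it assumes
data Void = Void (the definition is not shown in this message) and
adds Show/Eq instances for convenience.

```haskell
-- Assumed definition of Void; not given in this message.
data Void = Void deriving (Show, Eq)

data Fields a
  = VariantWithOne { field1 :: a }
  | VariantWithTwo { field2 :: a }
  deriving (Show, Eq)

-- Hand-written equivalent of the proposed
--   v { field1 = Void, field2 = Void }
-- Each constructor takes whichever of the labels it defines.
voidcast :: Fields a -> Fields Void
voidcast v = case v of
  VariantWithOne _ -> VariantWithOne Void
  VariantWithTwo _ -> VariantWithTwo Void
```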

The only change required in the desugaring translation specified in
the Report is to replace the default case match from
    _ -> error "Update error"
to
    v -> v
(The desugaring does not currently seem to implement condition 2,
and the original case default only partially addresses condition 4.)
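
To illustrate the changed default, here is a sketch of the case
expression for the single-label update v { field1 = x }, written out
by hand in both forms.  The function names are hypothetical, and the
code only approximates the shape of the Report's translation rather
than reproducing it.

```haskell
data Fields a
  = VariantWithOne { field1 :: a }
  | VariantWithTwo { field2 :: a }
  deriving (Show, Eq)

-- Roughly the current translation of  v { field1 = x }:
-- constructors lacking the label fall through to a runtime error.
setField1Report :: a -> Fields a -> Fields a
setField1Report x v = case v of
  VariantWithOne _ -> VariantWithOne x
  _                -> error "Update error"

-- The proposed translation: constructors lacking the label
-- are returned unchanged.
setField1Proposed :: a -> Fields a -> Fields a
setField1Proposed x v = case v of
  VariantWithOne _ -> VariantWithOne x
  w                -> w
```

With these definitions, setField1Proposed leaves a VariantWithTwo
value untouched, whereas setField1Report fails on it.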

What might be the downside of such a change?  Well, an update
expression (such as in my example above) could confuse the programmer
into thinking that the value contains at least two fields, when it
in fact contains only one.  Likewise, it may strongly suggest that
the value associated with a label will afterwards be stored within
the result (as now), whereas it is possible that the value appears
nowhere within the result (because the label was not a member of
its constructor).

How do these problems weigh against the increased convenience of
the proposal?  I think they are very slight, and the benefit is
potentially larger.  It would break no old programs.  It would permit
some current programs to be more succinct.

Regards,
    Malcolm