[Haskell-cafe] Some thoughts on Type-Directed Name Resolution -record update

Sat Feb 11 01:13:07 CET 2012

On 10/02/2012 03:22, Donn Cave wrote:
>    modifyRecord :: RecordType r =>  (a ->  a) ->  (r ->  a) ->  r ->  r

>    data Config { tempo :: Int, ...}
>    f = modifyRecord tempo (+20)

I'm hoping I missed something, and that you don't intend the "(r -> a)" 
part of this in particular to be taken literally.

If you intend something to be used as a field identifier, you should 
give it a type that says that. A function (r -> a) could be anything - 
even a use of const that ignores the value for r. Having a type class 
for field ids, parameterized by the record type and the field type, 
would make more sense. Having that, you could treat a record a bit like 
an IORef - use polymorphic read, write and modify functions. "Like an 
IORef" doesn't mean mutability, of course.

Personally, though, I still quite like the existing update notation, at 
least for updating multiple fields at once. Some tweaks could be nice - 
particularly for taking an (a -> a) function rather than an (a) value 
for each field.

What I might do (note - spelling choices not sanity checked).

data MyRecordT = MyRec {= field01 :: Int, field02 :: Int =}
   --  New notation to avoid breaking changes where old notation is used.
   --  This notation should not provide module-level field access functions.

example_read_1 :: FieldID MyRecordT Int -> MyRecordT -> Int
example_read_1 fid r = readRec fid r

example_read_2 :: MyRecordT -> Int
example_read_2 = readRec MyRecordT.field01
   --  This differs from my earlier <typename>.<fieldname> gives field
   --  access function idea. More clutter here, but for a reason.

example_read_3 :: MyRecordT -> Int
example_read_3 r = readRec MyRec.field01 r
   --  This would only match field01 in the MyRec constructor case, even
   --  if there were other constructors for the same type. The same type as
   --  MyRecordT.field01, but a different value.

example_read_4 :: MyRecordT -> Int
example_read_4 r = r.field01
   --  Direct read of a field in a known record value is a common case,
   --  and avoiding the explicit readRec avoids clutter.

example_modify_1 :: FieldID MyRecordT Int -> (Int -> Int) -> MyRecordT 
-> MyRecordT
example_modify_1 fid fun r = modifyRec fid fun r
   --  For longhand, support modifyRec and writeRec functions

example_modify_2 :: MyRecordT-> (Int -> Int)-> MyRecordT
example_modify_2 r fun = r { field01 fun, field02 (+field01) }
   --  First item of each pair indicates field. Lack of = indicates use 
a function.
   --  Within the braces, all the functions (second item each pair) see 
an environment
   --  containing all field names, referring to the original value of 
the record.
   --  Pair doesn't mean tuple here - just two subexpressions separated 
by whitespace.
   --  Maybe <- or <-> separator would be better instead.

example_modify_3 :: MyRecordT -> FieldID MyRecordT Int -> MyRecordT
example_modify_3 r fid = r { fid (+1) }
   --  First item of each pair still accepts arbitrary field IDs. Also, 
could
   --  use MyRec.field01 - to only allow matching that field in that 
constructor.
   --  Probably require parens for anything other than a single 
identifier in the
   --  field-id subexpression.

example_modify_4 :: MyRecordT -> MyRecordT
example_modify_4 r = r { field01 (\_ -> readRec r.field02 r) }
   --  The dot still supplies field IDs, overriding the 
names-for-initial-values
   --  environment, for record types, data constructors, and record 
values. Using
   --  the shorthand for any reads of original field values is not 
compulsory.

example_modify_5 :: MyRecordT -> MyRecordT
example_modify_5 r = r { field01 = field02, field02 (+1) }
   --  The "=" is still available too - mixing of write and modify cases 
supported

example_modify_6 :: MyRecordT -> MyRecordT
example_modify_6 = MyRecordT { field01 = field02, field02 = field01 }
   --  Allow update shorthand for the type and for the data constructor too,
   --  giving a function as normal.

On scope, the braces imply certain environment-defining rules. Also, the 
dot introduces a very short lived environment providing the field names. 
If the record type and/or data constructor is in scope, and the field 
names are made public by the module that defines them, these notations 
should just work - can't (and no need to) explicitly import the field 
names, which would import them into the wrong scope anyway. Importing 
the type name brings the field names along with it for type- and 
value-related scopes (such as <typename>.<fieldname> and 
<value>.<fieldname>) and importing the data constructor name brings the 
field names along with it for constructor-related scopes (such as 
<datacons>.<fieldname>).

Ban having type name, data constructor name or field name the same for 
the same type, except that a field name can occur within several data 
constructors for the same type - but only when using this record syntax. 
This is in part to avoid confusion within the braces notation for 
update. There is no ambiguity in principle, though, because the new 
environments hide any conflicting identifiers (type/constructor names 
included) by making the field names visible.

On the use of dots - I only have around 100 keys on my keyboard and most 
or all available symbols from those keys seem to be used. This use of 
dot still seems to me like just an extension of dot for 
<modulename>.<name> for accessing names in a module-based namespace. 
Further, if the left hand item is a type or value expression, there is 
no ambiguity with dot for function composition - though this is likely a 
fatal issue for my idea of allowing dots with data constructor names. 
Still, we make do without within-particular-constructor field selection 
now, so that's not such a big deal.