thoughts on the record update problem
Barney Hilken
b.hilken at ntlworld.com
Mon Mar 5 19:36:23 CET 2012
There are actually four problems with overloaded record update, not three as mentioned on the SORF page. This is an attempt to solve them.
The SORF update mechanism.
------------------------------
SORF suggests adding a member set to the class Has which does the actual updating just as get does the selecting. So
set :: Has r f t => t -> r -> r
and r {n1 = x1, n2 = x2} is translated as
set @ "n2" x2 (set @ "n1" x1)
The Problems.
-----------------
1. It's not clear how to define set for virtual record selectors. For example, we might define
data Complex = Complex {re :: Float, im :: Float}
instance Has Complex "arg" Float where
get r = atan2 r.im r.re
but if we want to set "arg", what should be kept constant? The obvious answer is "mod", but we haven't even defined it, and there are plenty of cases where there is no obvious answer.
2. If the data type has one or more parameters, updates can change the type of the record. Set can never do this, because of its type. What is more, if several fields depend on the parameter, for example
data Twice a = Twice {first :: a, second :: a}
any update of "first" which changes the type must also update "second" at the same time to keep the type correct. No hacked version of set can do this.
3. The Haskel implementation of impredicative polymorphism (from the Boxy Types paper) isn't strong enough to cope with higher rank field types in instances of set.
4. The translation of multiple updates into multiple applications of set is not the same as the definition of updates in the Haskel report, where updates are simultaneous not sequential. This would be less efficient, and in the case of virtual record selectors, it wouldn't be equal, and is arguably incorrect.
Point 3 could possibly be fixed by improving the strength of the type system, but SPJ says this is a hard problem, and no-one else seems ready to tackle it. Points 1, 2 & 4 suggest that any solution must deal not with individual fields but with sets of fields that can sensibly be updated together.
The Proposed Solution.
--------------------------
This is an extension to SORF. I don't know if the same approach could be applied to other label systems.
1. Introduce a new form of class declaration:
class Rcls r where
r {n1 :: t1, n2 :: t2}
is translated as
class (Has r n1 t1, Has r n2 t2) => Rcls r where
setRcls :: t1 -> t2 -> r -> r
setRcls is used internally but hidden from the user.
2. Instances of record classes can use a special form of default. So
data Rec = Rec {n1 :: t1, n2 :: t2}
instance Rcls Rec
is translated as
instance Rcls Rec where
setRcls x1 y1 (Rec _ _) = Rec x1 y1
provided all the fields in the class occur in the data type with the correct types. In general, the definition of the update function is the same as the Haskel98 translation of update, solving problem 4.
3. The syntax of record updates must be changed to include the class:
r {Rcls| n1 = x1, n2 = x2}
is translated as
setRcls x1 x2 r
Updating a subset of the fields is allowed, so
r {Rcls| n1 = x1}
is translated as
setRcls x1 (r.n2) r
4. Non default instances use the syntax:
instance Rcls Rec where
r {Rcls| n1 = x1, n2 = x2} = ...x1..x2..
which is translated as
instance Rcls Rec where
setRcls x1 y1 r = ...x1..x2..
in order to allow virtual selectors. This solves problem 1, because updates are grouped together in a meaningful way. An extended example is given below.
5. Record classes can have parameters, so
class TwiceClass r where
r a {first :: a, second :: a}
data Twice a = Twice {first :: a, second :: a}
instance TwiceClass Twice
translates as
class TwiceClass r where
setTwiceClass :: a -> a -> r b -> r a
data Twice a = Twice {first :: a, second :: a}
instance TwiceClass Twice where
setTwiceClass x y (Twice _ _) = Twice x y
which allows updates to change the type correctly. This solves problem 2.
6. Problem 3 *almost* works. The translation of
class HRClass r where
r {rev :: forall a. [a] -> [a]}
is
class Has r "rev" (forall a. [a] -> [a]) => HRClass r where
setHRClass :: (forall a.[a] -> [a]) -> r -> r
which is fine as far as updating is concerned, but the context is not (currently) allowed by ghc. I have no idea whether allowing polymorphic types in contexts would be a hard problem for ghc or not. None of my attempted work-rounds have been entirely satisfactory, but I might have missed something.
Comments
-------------
1. This makes the "special syntax for Has" pretty useless. When you have a set of labels you want to use together, you usually want to use update as well as selection, so it's better to define a record class, and use that.
2. The record classes can also be used for controlling the scope of polymorphic functions. For example, if you want to use a label "name" with the assumption that it refers to the name of a person, define a class
class Person r where
r {name :: String}
and only create instances where the assumption is correct. Any functions polymorphic over the class Person can only be applied to instances you have declared. You can later use the same label for the name of a product
class Product r where
r {name :: String}
but it's a different class with its own instances and the type checker will complain if you apply Person code to Product types.
3. It feels a bit odd to have the class which controls selection functions (Has) automatically defined, once for all, but the classes which control update functions must be defined by the programmer, and instances declared manually. However, I haven't found any way to make any kind of multiple Has class work.
Example
--------------
The following example illustrates some of the things that are possible with this approach. We want to represent complex numbers as pairs of Floats:
data Complex1 = Complex1 {real :: Float, imag :: Float}
in order to update records, we define a class:
class Cartesian c where
c {real :: Float, imag :: Float}
instance Cartesian Complex1
but we also want to access complex numbers by modulus and argument, so we define virtual selectors:
class Polar c where
c {mod :: Float, arg :: Float}
instance Has Complex1 "mod" Float where
get (Complex1 x y) = sqrt (x * x + y * y)
instance Has Complex1 "arg" Float where
get (Complex1 x y) = atan2 y x
instance Polar Complex1 where
_ {Polar| mod = r, arg = th} = Complex1 (r * cos th) (r * sin th)
Note that we can update x and y by {Cartesian| real = x, imag = y} or r and theta by {Polar| mod = r, arg = theta} but we cant mix them: there is no way to simultaneously update x and theta, unless we define a new class to do that.
We can change the representation to cache mod and arg without changing the classes:
data Complex2 = Complex2 {real :: Float, imag :: Float, mod :: Float, arg :: Float}
now both update functions are virtual, though none of the selectors are:
instance Cartesian Complex2 where
_ {Cartesian| real = x, imag = y} = Complex2 x y (sqrt (x * x + y * y)) (atan2 y x)
instance Polar Complex2 where
_ {Polar| mod = r, arg = th} = Complex2 (r * cos th) (r * sin th) r th
Alternatively, we might want to use whichever representation was last updated:
data Complex3 = Complex3a {real :: Float, imag :: Float}
| Complex3b {mod :: Float, arg :: Float}
now everything is virtual:
instance Has Complex3 "real" Float where
get (Complex3a x y) = x
get (Complex3b r th) = r * cos th
instance Has Complex3 "imag" Float where
get (Complex3a x y) = y
get (Complex3b r th) = r * sin th
instance Cartesian Complex3 where
_ {Cartesian| real = x, imag = y} = Complex3a x y
instance Has Complex3 "mod" Float where
get (Complex3a x y) = sqrt (x * x + y * y)
get (Complex3b r th) = r
instance Has Complex3 "arg" Float where
get (Complex3a x y) = atan2 y x
get (Complex3b r th) = th
instance Polar Complex3 where
_ {Polar| mod = r, arg = th} = Complex3b r th
Sorry this is so long!
Barney.
More information about the Glasgow-haskell-users
mailing list