Records in Haskell

Sun Jan 15 21:33:03 CET 2012

On 13/01/2012, Simon Peyton-Jones <simonpj at microsoft.com> wrote:
> Thanks to Greg for leading the records debate.  I apologise that I
> don't have enough bandwidth to make more than an occasional
> contribution.  Greg's new wiki page, and the discussion so far has
> clarified my thinking, and this message tries to express that new
> clarity.  I put a conclusion at the end.
>
> Simon
>
> Overview
> ~~~~~~~~
> It has become clear that there are two elements to pretty much all the
> proposals we have on the table.  Suppose we have two types, 'S' and 'T',
> both with a field 'f', and you want to select field 'f' from a record 'r'.
> Somehow you have to disambiguate which 'f' you mean.
>
> (Plan A) Disambiguate using qualified names.  To select field f, say
>         (S.f r) or (T.f r) respectively.
>
> (Plan B) Disambiguate using types. This approach usually implies
> dot-notation.
>          If  (r::S), then (r.f) uses the 'f' from 'S', and similarly if
> (r::T).
>
> Note that
>
> * The Frege-derived records proposal (FDR), uses both (A) and (B)
>   http://hackage.haskell.org/trac/ghc/wiki/Records/NameSpacing
>
> * The Simple Overloaded Record Fields (SORF) proposal uses only (B)
>   http://hackage.haskell.org/trac/ghc/wiki/Records/OverloadedRecordFields
>
> * The Type Directed Name Resolution proposal (TDNR) uses only (B)
>
> http://hackage.haskell.org/trac/haskell-prime/wiki/TypeDirectedNameResolution
>
> I know of no proposal that advocates only (A).  It seems that we are agreed
> that we must make use of types to disambigute common cases.
>
> Complexities of (Plan B)
> ~~~~~~~~~~~~~~~~~~~~~~~~
> Proposal (Plan B) sounds innocent enough.  But I promise you, it isn't.
> There has ben some mention of the "left-to-right" bias of Frege type
> inference engine; indeed the wohle explanation of which programs are
> accepted and which are rejected, inherently involves an understanding
> of the type inference algorithm.  This is a Very Bad Thing when the
> type inference algorithm gets complicated, and GHC's is certainly
> complicated.
>
> Here's an example:
>
>    type family F a b
>    data instance F Int [a] = Mk { f :: Int }
>
>    g :: F Int b  -> ()
>    h :: F a [Bool] -> ()
>
>    k x = (g x, x.f, h x)
>
> Consider type inference on k.  Initially we know nothing about the
> type of x.
>  * From the application (g x) we learn that x's type has
>    shape (F Int <something>).
>  * From the application (h x) we learn that x's type has
>    shape (F <something else> [Bool])
>  * Hence x's type must be (F Int [Bool])
>  * And hence, using the data family we can see which field
>    f is intended.
>
> Notice that
>  a) Neither left to right nor right to left would suffice
>  b) There is significant interaction with type/data families
>     (and I can give you more examples with classes and GADTs)
>  c) In passing we note that it is totally unclear how (Plan A)
>     would deal with data families
>
> This looks like a swamp.  In a simple Hindley-Milner typed language
> you might get away with some informal heuristics, but Haskell is far
> too complicated.
>
> Fortunately we know exactly what to do; it is described in some detail
> in our paper "Modular type inference with local assumptions"
> http://www.haskell.org/haskellwiki/Simonpj/Talk:OutsideIn
>
> The trick is to *defer* all these decisions by generating *type constraints*
> and solving them later.  We express it like this:
>
>    G, r:t1  |-  r.f : t2,  (Has t1 "f" t2)
>
> This says that if r is in scope with type t1, then (r.f) has type t2,
> plus the constraint (Has t1 "f" t2), which we read as saying
>
>    Type t1 must have a field "f" of type t2
>
> We gather up all the constraints and solve them.  In solving them
> we may figure out t1 from some *other* constraint (to the left or
> right, it's immaterial. That allow us to solve *this* constraint.
>
> So it's all quite simple, uniform, and beautiful.  It'll fit right
> into GHC's type-constraint solver.
>
> But note what has happened: we have simply re-invented SORF.  So the
> conclusion is this:
>
>    the only sensible way to implement FDR is using SORF.
>
> What about overloading?
> ~~~~~~~~~~~~~~~~~~~~~~~
> A feature of SORF is that you can write functions like this
>
>    k :: Has r "f" Int => r -> Int
>    k r = r.f + 1
>
> Function 'k' works on any record that has a field 'f'.  This may be
> cool, but it wasn't part of our original goal.  And indeed neither FDR
> nor TDNR offer it.
>
> But, the Has constraints MUST exist, in full glory, in the constraint
> solver.  The only question is whether you can *abstract* over them.
> Imagine having a Num class that you could not abstract over. So you
> could write
>
>    k1 x = x + x :: Float
>    k2 x = x + x :: Integer
>    k3 x = x + x :: Int
>
> using the same '+' every time, which generates a Num constraint. The
> type signature fixes the type to Float, Integer, Int respectively, and
> tells you which '+' to use.  And that is exactly what ML does!
>
> But Haskell doesn't.  The Coolest Thing about Haskell is that you get
> to *abstract* over those Num constraints, so you can write
>
>   k :: Num a => a -> a
>   k x = x + x
>
> and now it works over *any* Num type.
>
> On reflection, it would be absurd not to do ths same thing for Has
> constraints.  If we are forced to have Has constraints internally, it
> woudl be criminal not to abstract over them.  And that is precisely
> what SORF is.
>
>
> Is (Plan A) worth it?
> ~~~~~~~~~~~~~~~~
>
> Once you have (Plan B), and SORF in full glory, plus of course the
> existing ability to name fields T_f, S_f, if you want, I think it is
> highly questionable whether we need the additional complexities of
> (Plan A)?
>
> And I do think (Plan A) has lots of additional complexities that we
> have not begun to explore yet.  The data-family thing above is an
> example, and I can think of some others.
>
> But even if it was simple, we still have to ask: does *any* additional
> complexity give enough payoff, if you already have SORF?  I suspect
> not.
>
>
> Extensions to SORF
> ~~~~~~~~~~~~~~~~~~
> Frege lets you add "virtual fields" to a record type, using an extra
> RExtension mechanism that I do not yet understand.  But SORF lets you
> do so with no additional mechanism.  See "Virtual record selectors"
> on http://hackage.haskell.org/trac/ghc/wiki/Records/OverloadedRecordFields
> The point is that the existing type-class instance mechanisms do just
> what we want.
>
>
> Syntax
> ~~~~~~
> The debate on the mailing list has moved sharply towards discussing
> lexical syntax.  I'm not going to engage in that discussion because
> while it is useful to air opinions, it's very hard to get agreement.
> But for the record:
>
>   * I don't mind having Unicode alternatives, but there must be
>     ASCII syntax too
>
>   * I think we must use ordinary dot for field selection.
>
>   * I think it's fine to insist on no spaces; we are already
>     doing this for qualified names, as someone pointed out
>
>   * I think we should not introduce new syntax unless we are
>     absolutely forced into it.  Haskell's record update syntax
>     isn't great, but we have it.
>
>
> Conclusion
> ~~~~~~~~~~
> I am driven to the conclusion that SORF is the way to go.
>  - Every other proposal on the table requires SORF (either
>    visibly or invisibly)
>  - When you have SORF, you don't really need anything else
>
> The blocking issues are described on
>   http://hackage.haskell.org/trac/ghc/wiki/Records/OverloadedRecordFields
>
>   a) "Representation hiding" (look for that heading)
>
>   b) "Record update" (ditto), most especially for records whose
>      fields have polymorphic types

I posted to the wiki a possible solution to (b):
http://hackage.haskell.org/trac/ghc/wiki/Records/OverloadedRecordFields#AlternativeProposal

> If we fix these we can move forward.
>
>
>
> _______________________________________________
> Glasgow-haskell-users mailing list
> Glasgow-haskell-users at haskell.org
> http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
>