Records in Haskell
Matthew Farkas-Dyck
strake888 at gmail.com
Sat Jan 14 04:31:40 CET 2012
On 13/01/2012, Simon Peyton-Jones <simonpj at microsoft.com> wrote:
> Thanks to Greg for leading the records debate. I apologise that I
> don't have enough bandwidth to make more than an occasional
> contribution. Greg's new wiki page, and the discussion so far has
> clarified my thinking, and this message tries to express that new
> clarity. I put a conclusion at the end.
>
> Simon
>
> Overview
> ~~~~~~~~
> It has become clear that there are two elements to pretty much all the
> proposals we have on the table. Suppose we have two types, 'S' and 'T',
> both with a field 'f', and you want to select field 'f' from a record 'r'.
> Somehow you have to disambiguate which 'f' you mean.
>
> (Plan A) Disambiguate using qualified names. To select field f, say
> (S.f r) or (T.f r) respectively.
>
> (Plan B) Disambiguate using types. This approach usually implies
> dot-notation.
> If (r::S), then (r.f) uses the 'f' from 'S', and similarly if
> (r::T).
>
> Note that
>
> * The Frege-derived records proposal (FDR), uses both (A) and (B)
> http://hackage.haskell.org/trac/ghc/wiki/Records/NameSpacing
>
> * The Simple Overloaded Record Fields (SORF) proposal uses only (B)
> http://hackage.haskell.org/trac/ghc/wiki/Records/OverloadedRecordFields
>
> * The Type Directed Name Resolution proposal (TDNR) uses only (B)
>
> http://hackage.haskell.org/trac/haskell-prime/wiki/TypeDirectedNameResolution
>
> I know of no proposal that advocates only (A). It seems that we are agreed
> that we must make use of types to disambigute common cases.
>
> Complexities of (Plan B)
> ~~~~~~~~~~~~~~~~~~~~~~~~
> Proposal (Plan B) sounds innocent enough. But I promise you, it isn't.
> There has ben some mention of the "left-to-right" bias of Frege type
> inference engine; indeed the wohle explanation of which programs are
> accepted and which are rejected, inherently involves an understanding
> of the type inference algorithm. This is a Very Bad Thing when the
> type inference algorithm gets complicated, and GHC's is certainly
> complicated.
>
> Here's an example:
>
> type family F a b
> data instance F Int [a] = Mk { f :: Int }
>
> g :: F Int b -> ()
> h :: F a [Bool] -> ()
>
> k x = (g x, x.f, h x)
>
> Consider type inference on k. Initially we know nothing about the
> type of x.
> * From the application (g x) we learn that x's type has
> shape (F Int <something>).
> * From the application (h x) we learn that x's type has
> shape (F <something else> [Bool])
> * Hence x's type must be (F Int [Bool])
> * And hence, using the data family we can see which field
> f is intended.
>
> Notice that
> a) Neither left to right nor right to left would suffice
> b) There is significant interaction with type/data families
> (and I can give you more examples with classes and GADTs)
> c) In passing we note that it is totally unclear how (Plan A)
> would deal with data families
>
> This looks like a swamp. In a simple Hindley-Milner typed language
> you might get away with some informal heuristics, but Haskell is far
> too complicated.
>
> Fortunately we know exactly what to do; it is described in some detail
> in our paper "Modular type inference with local assumptions"
> http://www.haskell.org/haskellwiki/Simonpj/Talk:OutsideIn
>
> The trick is to *defer* all these decisions by generating *type constraints*
> and solving them later. We express it like this:
>
> G, r:t1 |- r.f : t2, (Has t1 "f" t2)
>
> This says that if r is in scope with type t1, then (r.f) has type t2,
> plus the constraint (Has t1 "f" t2), which we read as saying
>
> Type t1 must have a field "f" of type t2
>
> We gather up all the constraints and solve them. In solving them
> we may figure out t1 from some *other* constraint (to the left or
> right, it's immaterial. That allow us to solve *this* constraint.
>
> So it's all quite simple, uniform, and beautiful. It'll fit right
> into GHC's type-constraint solver.
>
> But note what has happened: we have simply re-invented SORF. So the
> conclusion is this:
>
> the only sensible way to implement FDR is using SORF.
>
> What about overloading?
> ~~~~~~~~~~~~~~~~~~~~~~~
> A feature of SORF is that you can write functions like this
>
> k :: Has r "f" Int => r -> Int
> k r = r.f + 1
>
> Function 'k' works on any record that has a field 'f'. This may be
> cool, but it wasn't part of our original goal. And indeed neither FDR
> nor TDNR offer it.
>
> But, the Has constraints MUST exist, in full glory, in the constraint
> solver. The only question is whether you can *abstract* over them.
> Imagine having a Num class that you could not abstract over. So you
> could write
>
> k1 x = x + x :: Float
> k2 x = x + x :: Integer
> k3 x = x + x :: Int
>
> using the same '+' every time, which generates a Num constraint. The
> type signature fixes the type to Float, Integer, Int respectively, and
> tells you which '+' to use. And that is exactly what ML does!
>
> But Haskell doesn't. The Coolest Thing about Haskell is that you get
> to *abstract* over those Num constraints, so you can write
>
> k :: Num a => a -> a
> k x = x + x
>
> and now it works over *any* Num type.
>
> On reflection, it would be absurd not to do ths same thing for Has
> constraints. If we are forced to have Has constraints internally, it
> woudl be criminal not to abstract over them. And that is precisely
> what SORF is.
>
>
> Is (Plan A) worth it?
> ~~~~~~~~~~~~~~~~
>
> Once you have (Plan B), and SORF in full glory, plus of course the
> existing ability to name fields T_f, S_f, if you want, I think it is
> highly questionable whether we need the additional complexities of
> (Plan A)?
>
> And I do think (Plan A) has lots of additional complexities that we
> have not begun to explore yet. The data-family thing above is an
> example, and I can think of some others.
>
> But even if it was simple, we still have to ask: does *any* additional
> complexity give enough payoff, if you already have SORF? I suspect
> not.
>
>
> Extensions to SORF
> ~~~~~~~~~~~~~~~~~~
> Frege lets you add "virtual fields" to a record type, using an extra
> RExtension mechanism that I do not yet understand. But SORF lets you
> do so with no additional mechanism. See "Virtual record selectors"
> on http://hackage.haskell.org/trac/ghc/wiki/Records/OverloadedRecordFields
> The point is that the existing type-class instance mechanisms do just
> what we want.
>
>
> Syntax
> ~~~~~~
> The debate on the mailing list has moved sharply towards discussing
> lexical syntax. I'm not going to engage in that discussion because
> while it is useful to air opinions, it's very hard to get agreement.
> But for the record:
>
> * I don't mind having Unicode alternatives, but there must be
> ASCII syntax too
>
> * I think we must use ordinary dot for field selection.
>
> * I think it's fine to insist on no spaces; we are already
> doing this for qualified names, as someone pointed out
>
> * I think we should not introduce new syntax unless we are
> absolutely forced into it. Haskell's record update syntax
> isn't great, but we have it.
Honestly, I would be glad to find that syntax deprecated/removed. It
is awkward and in my opinion not very haskellish.
The let-syntax is clean, but quite verbose, and likely suboptimal for
other reasons, such as Greg Weber posted to the wiki.
Nevertheless, the dot-let-syntax seems (to me at least) the rational
convergence of plain let-syntax and the new dot-selection syntax.
Perhaps we might:
* Define old record update syntax as syntactic sugar for let-syntax.
* Allow language extensions, such as Frege-syntax, as sugar for the same.
>
> Conclusion
> ~~~~~~~~~~
> I am driven to the conclusion that SORF is the way to go.
> - Every other proposal on the table requires SORF (either
> visibly or invisibly)
> - When you have SORF, you don't really need anything else
I fully agree. Mind, I'm not a GHC God, but I fully agree nonetheless.
A powerful record system (not solely label namespaces) would be a
great win for Haskell. This is that record system.
> The blocking issues are described on
> http://hackage.haskell.org/trac/ghc/wiki/Records/OverloadedRecordFields
>
> a) "Representation hiding" (look for that heading)
>
> b) "Record update" (ditto), most especially for records whose
> fields have polymorphic types
>
> If we fix these we can move forward.
>
>
>
> _______________________________________________
> Glasgow-haskell-users mailing list
> Glasgow-haskell-users at haskell.org
> http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
>
More information about the Glasgow-haskell-users
mailing list