[ghc-steering-committee] Record dot notation

Mon Feb 10 19:07:18 UTC 2020

This is somewhat off-topic for the thread, but I thought I'd mention it
anyway. Lately I've been thinking that my ideal design for record
field selectors would involve promoted GADTs, used to describe the
universe of fields that a record is composed of. That is, the 'k'
in HasField (x :: k) r a becomes f a, for some GADT f, whose indices
describe the types of the potential fields that a variety of records
might have.

There's nothing preventing this at present apart from the need to have
promoted GADTs, but I'm not sure how it would fit in at all with this
syntax (especially as "field names" potentially become more
complicated terms with actual structure to them rather than simply
strings).
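
A rough sketch of the idea, for concreteness. Everything here is made up for
illustration: the Field universe, the Person and Pet records, and the
two-parameter HasField class are hypothetical and are not GHC.Records. It
assumes a GHC recent enough for DataKinds to promote GADT constructors, and
it passes a Proxy rather than relying on type applications. Note that the
field's type comes from the kind of the field name, so no functional
dependency is needed.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE PolyKinds #-}

module Main where

import Data.Kind (Type)
import Data.Proxy (Proxy (..))

-- Hypothetical universe of fields; the index is the type of the field.
data Field :: Type -> Type where
  Name :: Field String
  Age  :: Field Int

-- Hypothetical variant of HasField: the field name x has kind (f a),
-- so the result type a is read off the kind of the name.
class HasField (x :: f a) r where
  getField :: Proxy x -> r -> a

-- Two records drawing their fields from the same universe.
data Person = Person String Int
data Pet    = Pet String

instance HasField 'Name Person where
  getField _ (Person n _) = n

instance HasField 'Age Person where
  getField _ (Person _ a) = a

instance HasField 'Name Pet where
  getField _ (Pet n) = n

main :: IO ()
main = do
  let alice = Person "Alice" 42
  putStrLn (getField (Proxy :: Proxy 'Name) alice)
  print    (getField (Proxy :: Proxy 'Age) alice)
  putStrLn (getField (Proxy :: Proxy 'Name) (Pet "Rex"))
```

The sketch says nothing about how a surface syntax like r.x would name a
field such as 'Name, which is exactly the part I'm unsure about.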

On Mon, 10 Feb 2020 at 13:55, Iavor Diatchki <iavor.diatchki at gmail.com> wrote:
>
> I think the nice thing about 4 and 5 is that we can say that `.x` is just a new token---a record selector---much like `M.x` is a token in its own right (the `.` there is not really an "infix" operator). We don't need any discussion of the "looseness" of operators, etc.
> The difference between 4 and 5 is what happens when there are 3 tokens in a row like this:
>
> x y .z    -- identifier, identifier, selector
>
> with 4 (the OCaml way) we have that this means:  x (y.z)
> with 5 (the JS way?) we have that this means:  (x y) .z
>
> I still think that 4 makes the most sense for Haskell because I would like `f x .y` and `f x.y` to both mean `f (x.y)`. Selection having higher precedence than application also matches the design choice we've already made for record update, which binds tighter than application too, as in `f x { y = True }`.
>
> Having thought a bit more about it, I'd update my vote to also include 5:
>
> 4 > 5 > 2 > 1 > 3
>
> (i.e., I like the token based approach, and favor "OCaml" style over "JS" style).
>
> -Iavor
>
>
> On Mon, Feb 10, 2020 at 9:20 AM Simon Peyton Jones via ghc-steering-committee <ghc-steering-committee at haskell.org> wrote:
>>
>> My point here is that option (5) is no more or less whitespace sensitive than option (3). Both need the same cases to figure what the period character in your code means. I think this is why Simon PJ has keyed this part of the debate to module qualification: that existing feature (not under debate) essentially breaks the symmetry here, meaning that we have more room to work with without breaking symmetry further.
>>
>>
>>
>> You’ve put it very well.  Indeed, we could key it even more tightly to module qualification, by making the lexical rule exactly the same: just as M.x is treated as binding super-tightly, so is r.x.   If you like, M.x is a lexeme, and so is r.x.  They even have the same connotation (the x component of M or r respectively).
>>
>>
>>
>> What about (f e).y, or f{-blah-}.y?  Well, you can’t write qualified modules that way; it becomes two or more lexemes.  I’d be perfectly content to do the same for record selections, using Joachim’s rule 5 for every case *except* the case that parses exactly like module qualifiers (modulo upper case vs lower case).
>>
>>
>>
>> TL;DR: I’m arguing that (f r.x) means (f (r.x)) just as (f M.x) means (f (M.x)).
>>
>>
>>
>> But I don’t care about
>>
>>                f (blah blah blah).x
>>
>> I’d be quite content with that meaning
>>
>>                f (blah blah blah) .x
>>
>> as Joachim suggests.
>>
>>
>>
>> Would that help to resolve this debate? I had not thought of it in that way before, but Joachim and Richard have helped me to do so, thank you.
>>
>>
>>
>> Simon
>>
>>
>>
>> From: ghc-steering-committee <ghc-steering-committee-bounces at haskell.org> On Behalf Of Richard Eisenberg
>> Sent: 10 February 2020 12:14
>> To: Simon Marlow <marlowsd at gmail.com>
>> Cc: ghc-steering-committee <ghc-steering-committee at haskell.org>; Joachim Breitner <mail at joachim-breitner.de>
>> Subject: Re: [ghc-steering-committee] Record dot notation
>>
>>
>>
>> Upon careful consideration, I think the whitespace concerns here are somewhat ill-founded.
>>
>>
>>
>> First, please see https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0229-whitespace-bang-patterns.rst#proposed-change-specification, where (among other points) a careful description of "loose infix" vs "prefix" vs "suffix" vs "tight infix" is given. Here is a set of examples:
>>
>> a ! b   -- a loose infix occurrence
>>
>> a!b     -- a tight infix occurrence
>>
>> a !b    -- a prefix occurrence
>>
>> a! b    -- a suffix occurrence
>>
>> This distinction is *not* just made by example; that proposal (which has been accepted) defines these terms precisely. So, the comments on this thread about what counts as a naked selector are addressed: a naked selector is one where the dot is a prefix occurrence.
>>
>>
>>
>> Other whitespace-wariness comes from worrying about the distinction between prefix and tight infix occurrences. That is, should we differentiate between the interpretations of `f r.x` and `f r .x`? Yet in all versions of any of this, we differentiate between loose infix and the others. Thus there is *always* whitespace-sensitivity around dot. Note that this is true, as Simon PJ pointed out, regardless of this proposal, where a tight-infix usage of a dot with a capitalized identifier on the left is taken as a module qualification. In all of its versions, this proposal *increases* the whitespace sensitivity, by further distinguishing between prefix occurrences of dot and other usages.
>>
>>
>>
>> Let's compare options 3 and 5 with this analysis then:
>>
>>
>>
>> Option 3:
>>
>> loose-infix: whatever (.) is in scope
>>
>> tight-infix:
>>
>>   - if left-hand is a capitalized identifier: module qualification
>>
>>   - otherwise: record selection, binding tighter than function application
>>
>> prefix: postfix record selection, binding like function application
>>
>> suffix: presumably, whatever (.) is in scope
>>
>>
>>
>> Option 5:
>>
>> loose-infix: whatever (.) is in scope
>>
>> tight-infix:
>>
>>  - if left-hand is a capitalized identifier: module qualification
>>
>>  - otherwise: postfix record selection, binding like function application
>>
>> prefix: postfix record selection, binding like function application
>>
>> suffix: presumably, whatever (.) is in scope
>>
>>
>>
>> My point here is that option (5) is no more or less whitespace sensitive than option (3). Both need the same cases to figure what the period character in your code means. I think this is why Simon PJ has keyed this part of the debate to module qualification: that existing feature (not under debate) essentially breaks the symmetry here, meaning that we have more room to work with without breaking symmetry further.
>>
>>
>>
>> My vote is thus:
>>
>>
>>
>> 3 > 5 > 2 > 4 > 1
>>
>>
>>
>> Other points of motivation:
>>
>> - Despite my argument above, I see the merit in (5). I just think that an argument "we don't want dot to be whitespace-sensitive" isn't really effective.
>>
>> - I want to accept this proposal. We're not going to get another go at this.
>>
>> - I really don't like the way record-update binds, and (4) reminds me too much of that.
>>
>>
>>
>> Richard
>>
>>
>>
>> On Feb 10, 2020, at 9:58 AM, Simon Marlow <marlowsd at gmail.com> wrote:
>>
>>
>>
>> On Fri, 7 Feb 2020 at 22:37, Joachim Breitner <mail at joachim-breitner.de> wrote:
>>
>>
>> I really would prefer a design where all these questions do not even
>> need to be asked…
>>
>>
>>
>> Me too. Also what about (.x) vs. ( .x), are those the same?
>>
>>
>>
>> So I think to have the full picture, we need the following option as
>> well on the ballot:
>>
>>  5. .x is a postfix operator, binding exactly like application,
>>     whether it is naked or not.
>>     (This is option 3, but without the whitespace-sensitivity.)
>>
>>
>>
>> [...]
>>
>>
>>
>> Anyways, now for my opinion: Assuming no more options are added, my
>> ranking will be
>>
>>   5 > 4 > 2 > 1 > 3
>>
>> This puts first the two variants where .x behaves like an existing
>> language feature (either like function application or like record
>> updates), has no whitespace sensitivity, and follows existing languages
>> precedence (JS and OCaml, resp.).
>> Then the compromise solution that simply forbids putting spaces before
>> .x (so at least the program doesn't change semantics silently).
>> I dislike variant 3, which adds a _new_ special rule, and where adding
>> a single space can change the meaning of the program, so I rank that
>> last.
>>
>>
>>
>> I'm also against whitespace-sensitivity and I lean towards this ordering too.
>>
>> But I'm going with:
>>
>>
>>
>> 5 > 2 > 1 > 4 > 3
>>
>>
>>
>> Rationale: (5) seems the easiest to explain and has the fewest special cases, yet covers the use-cases we're interested in. Beyond that I want to be conservative because I find it hard to predict the ramifications of the more-complex alternatives 4/3, so I've put 2/1 ahead of those. I've made my peace with the current record selection syntax binding more tightly than application, and indeed I often rely on it to avoid a $, so I'm OK with 4 over 3.
>>
>>
>>
>> Cheers
>>
>> Simon
>>
>>
>>
>>
>>
>>
>>
>> Cheers,
>> Joachim
>>
>>
>> PS, because it's on my mind, and just for fun:
>>
>> Under variant 3, both foo1 and foo2 typecheck, but they do quite
>> different things (well, one loops).
>>
>>   data Stream a = Stream { val :: a, next :: Stream a }
>>
>>   foo1 f s = Stream (s.val) (foo1 (fmap f s).next)
>>   foo2 f s = Stream (s.val) (foo2 (fmap f s) .next)
>>
>>
>> --
>> Joachim Breitner
>>   mail at joachim-breitner.de
>>   http://www.joachim-breitner.de/
>>
>>
>> _______________________________________________
>> ghc-steering-committee mailing list
>> ghc-steering-committee at haskell.org
>> https://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-steering-committee