[ghc-steering-committee] Record dot notation

Iavor Diatchki iavor.diatchki at gmail.com
Mon Feb 10 18:55:09 UTC 2020


I think the nice thing about 4 and 5 is that we can say that `.x` is just
a new token (a record selector), much like `M.x` is a token in its own
right (the `.` there is not really an "infix" operator).  We don't need
any discussion of the "looseness" of operators, etc.
The difference between 4 and 5 is what happens when there are 3 tokens in a
row like this:

x y .z    -- identifier, identifier, selector

with 4 (the OCaml way) we have that this means:  x (y.z)
with 5 (the JS way?) we have that this means:  (x y) .z

I still think that 4 makes the most sense for Haskell because I would like
`f x .y` and `f x.y` to both mean `f (x.y)`.  The fact that selection has
higher precedence than application also matches the design choice we've
already made that record update has higher precedence too, so it matches
`f x { y = True }`.
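
For reference, the existing record-update behaviour can be seen in plain
Haskell today.  The `Config` type and the `describe` function below are
made-up names for this sketch only.

```haskell
data Config = Config { verbose :: Bool, level :: Int }
  deriving (Show, Eq)

describe :: Config -> String
describe c = (if verbose c then "verbose" else "quiet") ++ "@" ++ show (level c)

defaults :: Config
defaults = Config { verbose = False, level = 1 }

-- Record update binds tighter than application, so this parses as
--   describe (defaults { verbose = True })
-- and not as
--   (describe defaults) { verbose = True }
-- which would be a type error, since a String has no `verbose` field.
tight :: String
tight = describe defaults { verbose = True }
```

Here `tight` evaluates to "verbose@1".  Option 4 would give `f x.y` the same
shape of grouping.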

Having thought a bit more about it, I'd update my vote to also include 5:

4 > 5 > 2 > 1 > 3

(i.e., I like the token-based approach, and favor "OCaml" style over "JS"
style).

-Iavor







On Mon, Feb 10, 2020 at 9:20 AM Simon Peyton Jones via
ghc-steering-committee <ghc-steering-committee at haskell.org> wrote:

> My point here is that option (5) is no more or less whitespace sensitive
> than option (3). Both need the same cases to figure what the period
> character in your code means. I think this is why Simon PJ has keyed this
> part of the debate to module qualification: that existing feature (not
> under debate) essentially breaks the symmetry here, meaning that we have
> more room to work with without breaking symmetry further.
>
>
>
> You’ve put it very well.  Indeed, we could key it even more tightly to
> module qualification, by making the lexical rule exactly the same: just as
> M.x is treated as binding super-tightly, so is r.x.   If you like, M.x is a
> lexeme, and so is r.x.  They even have the same connotation (the x
> component of M or r respectively).
>
>
>
> What about (f e).y, or f{-blah-}.y?  Well, you can’t write qualified
> modules that way; it becomes two or more lexemes.  I’d be perfectly content
> to do the same for record selections, using Joachim’s rule 5 for every case
> **except** the case that parses exactly like module qualifiers (modulo
> upper case vs lower case).
>
>
>
> TL;DR: I’m arguing that (f r.x) means (f (r.x)) just as (f M.x) means (f
> (M.x)).
>
>
>
> But I don’t care about
>
>                f (blah blah blah).x
>
> I’d be quite content with that meaning
>
>                f (blah blah blah) .x
>
> as Joachim suggests.
>
>
>
> Would that help to resolve this debate?   I had not thought of it in that
> way before, but Joachim and Richard have helped me to do so, thank you.
>
>
>
> Simon
>
>
>
> *From:* ghc-steering-committee <ghc-steering-committee-bounces at haskell.org>
> *On Behalf Of *Richard Eisenberg
> *Sent:* 10 February 2020 12:14
> *To:* Simon Marlow <marlowsd at gmail.com>
> *Cc:* ghc-steering-committee <ghc-steering-committee at haskell.org>;
> Joachim Breitner <mail at joachim-breitner.de>
> *Subject:* Re: [ghc-steering-committee] Record dot notation
>
>
>
> Upon careful consideration, I think the whitespace concerns here are
> somewhat ill-founded.
>
>
>
> First, please see
> https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0229-whitespace-bang-patterns.rst#proposed-change-specification,
> which (among other points) carefully describes "loose infix" vs
> "prefix" vs "suffix" vs "tight infix" occurrences. Here is a set of
> examples:
>
> a ! b   -- a loose infix occurrence
>
> a!b     -- a tight infix occurrence
>
> a !b    -- a prefix occurrence
>
> a! b    -- a suffix occurrence
>
> This distinction is *not* just made by example: that proposal (which
> has been accepted) defines these precisely. So, the comments on this thread
> about what counts as a naked selector are addressed: a naked selector is
> one where the dot is a prefix occurrence.
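
The four kinds of occurrence can be approximated by a small classifier.  This
is only a sketch: the `Occurrence` type and `classifyAt` are hypothetical
names, and the check below looks at plain whitespace on either side of the
operator character, which is a simplifying assumption rather than the precise
definition in the accepted proposal.

```haskell
import Data.Char (isSpace)

data Occurrence = LooseInfix | TightInfix | Prefix | Suffix
  deriving (Show, Eq)

-- Classify the first occurrence of `op` in the input by the whitespace
-- directly around it.  Start and end of input count as whitespace.
classifyAt :: Char -> String -> Maybe Occurrence
classifyAt op s = case break (== op) s of
  (_, [])             -> Nothing   -- the operator does not occur
  (before, _ : after) ->
    let spaceBefore = null before || isSpace (last before)
        spaceAfter  = case after of
                        []      -> True
                        (c : _) -> isSpace c
    in Just $ case (spaceBefore, spaceAfter) of
         (True,  True)  -> LooseInfix
         (False, False) -> TightInfix
         (True,  False) -> Prefix
         (False, True)  -> Suffix
```

With this, `classifyAt '!'` maps "a ! b", "a!b", "a !b" and "a! b" to
`LooseInfix`, `TightInfix`, `Prefix` and `Suffix` respectively, and
`classifyAt '.' "f r .x"` gives `Just Prefix`.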
>
>
>
> Other whitespace-wariness comes from worrying about the distinction
> between prefix and tight infix occurrences. That is, should we
> differentiate between the interpretations of `f r.x` and `f r .x`? Yet in
> all versions of any of this, we differentiate between loose infix and the
> others. Thus there is *always* whitespace-sensitivity around dot. Note that
> this is true, as Simon PJ pointed out, regardless of this proposal, where a
> tight-infix usage of a dot with a capitalized identifier on the left is
> taken as a module qualification. In all of its versions, this proposal
> *increases* the whitespace sensitivity, by further distinguishing between
> prefix occurrences of dot and other usages.
>
>
>
> Let's compare options 3 and 5 with this analysis then:
>
>
>
> Option 3:
>
> loose-infix: whatever (.) is in scope
>
> tight-infix:
>
>   - if left-hand is a capitalized identifier: module qualification
>
>   - otherwise: record selection, binding tighter than function application
>
> prefix: postfix record selection, binding like function application
>
> suffix: presumably, whatever (.) is in scope
>
>
>
> Option 5:
>
> loose-infix: whatever (.) is in scope
>
> tight-infix:
>
>  - if left-hand is a capitalized identifier: module qualification
>
>  - otherwise: postfix record selection, binding like function application
>
> prefix: postfix record selection, binding like function application
>
> suffix: presumably, whatever (.) is in scope
>
>
>
> My point here is that option (5) is no more or less whitespace sensitive
> than option (3). Both need the same cases to figure what the period
> character in your code means. I think this is why Simon PJ has keyed this
> part of the debate to module qualification: that existing feature (not
> under debate) essentially breaks the symmetry here, meaning that we have
> more room to work with without breaking symmetry further.
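
The comparison of options 3 and 5 can be written down as two lookup functions
over the same set of cases.  This is a sketch of the two tables only; the
type and constructor names are invented for the illustration.

```haskell
data Occurrence = LooseInfix | TightInfix | Prefix | Suffix
  deriving (Show, Eq, Enum, Bounded)

data Meaning
  = OperatorInScope       -- whatever (.) is in scope
  | ModuleQualification
  | SelectTighterThanApp  -- record selection, binding tighter than application
  | SelectLikeApp         -- postfix selection, binding like application
  deriving (Show, Eq)

-- The Bool says whether the left-hand side is a capitalized identifier.
option3 :: Occurrence -> Bool -> Meaning
option3 LooseInfix _     = OperatorInScope
option3 TightInfix True  = ModuleQualification
option3 TightInfix False = SelectTighterThanApp
option3 Prefix     _     = SelectLikeApp
option3 Suffix     _     = OperatorInScope

option5 :: Occurrence -> Bool -> Meaning
option5 LooseInfix _     = OperatorInScope
option5 TightInfix True  = ModuleQualification
option5 TightInfix False = SelectLikeApp
option5 Prefix     _     = SelectLikeApp
option5 Suffix     _     = OperatorInScope

-- Every case in which the two options disagree.
differences :: [(Occurrence, Bool)]
differences =
  [ (o, cap)
  | o   <- [minBound .. maxBound]
  , cap <- [False, True]
  , option3 o cap /= option5 o cap
  ]
```

`differences` evaluates to `[(TightInfix,False)]`: both functions have to
inspect the same cases, and they disagree only on a tight-infix dot after a
non-capitalized left-hand side.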
>
>
>
> My vote is thus:
>
>
>
> 3 > 5 > 2 > 4 > 1
>
>
>
> Other points of motivation:
>
> - Despite my argument above, I see the merit in (5). I just think that an
> argument "we don't want dot to be whitespace-sensitive" isn't really
> effective.
>
> - I want to accept this proposal. We're not going to get another go at
> this.
>
> - I really don't like the way record-update binds, and (4) reminds me too
> much of that.
>
>
>
> Richard
>
>
>
> On Feb 10, 2020, at 9:58 AM, Simon Marlow <marlowsd at gmail.com> wrote:
>
>
>
> On Fri, 7 Feb 2020 at 22:37, Joachim Breitner <mail at joachim-breitner.de>
> wrote:
>
>
> I really would prefer a design where all these questions do not even
> need to be asked…
>
>
>
> Me too. Also what about (.x) vs. ( .x), are those the same?
>
>
>
> So I think to have the full picture, we need the following option as
> well on the ballot:
>
>  5. .x is a postfix operator, binding exactly like application,
>     whether it is naked or not.
>     (This is option 3, but without the whitespace-sensitivity.)
>
>
>
> [...]
>
>
>
> Anyways, now for my opinion: Assuming no more options are added, my
> ranking will be
>
>   5 > 4 > 2 > 1 > 3
>
> This puts first the two variants where .x behaves like an existing
> language feature (either like function application or like record
> updates), has no whitespace sensitivity, and follows existing languages
> precedence (JS and OCaml, resp.).
> Then the compromise solution that simply forbids putting spaces before
> .x (so at least the program doesn't change semantics silently).
> I dislike variant 3, which adds a _new_ special rule, and where adding
> a single space can change the meaning of the program, so I rank that
> last.
>
>
>
> I'm also against whitespace-sensitivity and I lean towards this ordering
> too.
>
> But I'm going with:
>
>
>
> 5 > 2 > 1 > 4 > 3
>
>
>
> Rationale: (5) seems the easiest to explain and has the fewest special
> cases, yet covers the use-cases we're interested in. Beyond that I want to
> be conservative because I find it hard to predict the ramifications of the
> more-complex alternatives 4/3, so I've put 2/1 ahead of those. I've made my
> peace with the current record selection syntax binding more tightly than
> application, and indeed I often rely on it to avoid a $, so I'm OK with 4
> over 3.
>
>
>
> Cheers
>
> Simon
>
>
>
>
>
>
>
> Cheers,
> Joachim
>
>
> PS, because it's on my mind, and just for fun:
>
> Under variant 3, both foo1 and foo2 typecheck, but they do quite different
> things (well, one loops).
>
>   data Stream a = Stream { val :: a, next :: Stream a }
>
>   foo1 f s = Stream (s.val) (foo1 (fmap f s).next)
>   foo2 f s = Stream (s.val) (foo2 (fmap f s) .next)
>
>
> --
> Joachim Breitner
>   mail at joachim-breitner.de
>   http://www.joachim-breitner.de/
>
>
> _______________________________________________
> ghc-steering-committee mailing list
> ghc-steering-committee at haskell.org
> https://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-steering-committee