[ghc-steering-committee] Record dot notation

Simon Peyton Jones simonpj at microsoft.com
Mon Feb 10 17:19:58 UTC 2020


My point here is that option (5) is no more or less whitespace sensitive than option (3). Both need the same cases to figure out what the period character in your code means. I think this is why Simon PJ has keyed this part of the debate to module qualification: that existing feature (not under debate) essentially breaks the symmetry here, meaning that we have more room to work with without breaking symmetry further.

You’ve put it very well.  Indeed, we could key it even more tightly to module qualification, by making the lexical rule exactly the same: just as M.x is treated as binding super-tightly, so is r.x.   If you like, M.x is a lexeme, and so is r.x.  They even have the same connotation (the x component of M or r respectively).

What about (f e).y, or f{-blah-}.y?  Well, you can’t write qualified modules that way; it becomes two or more lexemes.  I’d be perfectly content to do the same for record selections, using Joachim’s rule 5 for every case *except* the case that parses exactly like module qualifiers (modulo upper case vs lower case).

TL;DR: I’m arguing that (f r.x) means (f (r.x)) just as (f M.x) means (f (M.x)).
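
The module-qualification half of that analogy can be checked in today's Haskell. The following is only a sketch: `applyTo` is a hypothetical helper invented for the illustration, and nothing here uses the proposed record syntax.

```haskell
import qualified Data.List as L

-- Hypothetical helper (not from the thread): applies a list
-- transformer to a fixed list.
applyTo :: ([Int] -> [Int]) -> [Int]
applyTo g = g [3, 1, 2]

-- `L.sort` is a single lexeme, so this parses as `applyTo (L.sort)`.
qualifiedArg :: [Int]
qualifiedArg = applyTo L.sort

-- With whitespace around it, the dot is ordinary function composition.
composed :: [Int]
composed = applyTo (reverse . L.sort)
```

The argument above is that `f r.x` should group the same way as `applyTo L.sort` does here.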

But I don’t care about
               f (blah blah blah).x
I’d be quite content with that meaning
               f (blah blah blah) .x
as Joachim suggests.
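
To make the two groupings concrete, here is a sketch written with ordinary selector functions rather than the proposed dot syntax. The `Point` type and the `shift` and `mirror` helpers are hypothetical, made up for this illustration; the comments state how the rule described above would read the dotted forms.

```haskell
-- Hypothetical record and helpers, for illustration only.
data Point = Point { x :: Int, y :: Int }
  deriving (Eq, Show)

shift :: Point -> Point
shift (Point a b) = Point (a + 1) b

mirror :: Point -> Point
mirror (Point a b) = Point b a

-- `negate p.x` would mean `negate (p.x)`: the selection binds
-- first, just as a qualified name would.
tightReading :: Point -> Int
tightReading p = negate (x p)

-- `mirror (shift p).x` would mean `mirror (shift p) .x`: the
-- selection applies to the result of the whole application.
postfixReading :: Point -> Int
postfixReading p = x (mirror (shift p))
```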

Would that help to resolve this debate? I had not thought of it in that way before, but Joachim and Richard have helped me to do so, thank you.

Simon

From: ghc-steering-committee <ghc-steering-committee-bounces at haskell.org> On Behalf Of Richard Eisenberg
Sent: 10 February 2020 12:14
To: Simon Marlow <marlowsd at gmail.com>
Cc: ghc-steering-committee <ghc-steering-committee at haskell.org>; Joachim Breitner <mail at joachim-breitner.de>
Subject: Re: [ghc-steering-committee] Record dot notation

Upon careful consideration, I think the whitespace concerns here are somewhat ill-founded.

First, please see https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0229-whitespace-bang-patterns.rst#proposed-change-specification, which (among other points) carefully describes the distinction between "loose infix", "prefix", "suffix", and "tight infix" occurrences. Here is a set of examples:

a ! b   -- a loose infix occurrence
a!b     -- a tight infix occurrence
a !b    -- a prefix occurrence
a! b    -- a suffix occurrence

This distinction is *not* made just by example: that proposal (which has been accepted) defines each case precisely. So, the comments on this thread about what counts as a naked selector are addressed: a naked selector is one where the dot is a prefix occurrence.
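
As a sketch of how those occurrence classes already matter for `!`, assuming the BangPatterns extension is enabled: the `(!)` operator below is a hypothetical list-indexing operator defined only for this illustration, not a library function.

```haskell
{-# LANGUAGE BangPatterns #-}

-- Hypothetical indexing operator, defined in prefix form.
(!) :: [a] -> Int -> a
(!) xs i = xs !! i

-- Prefix occurrence: `!acc` is a bang pattern that forces the accumulator.
sumStrict :: Int -> [Int] -> Int
sumStrict !acc []       = acc
sumStrict !acc (n : ns) = sumStrict (acc + n) ns

-- Loose infix occurrence: `!` is the operator in scope.
looseIndex :: [Int] -> Int
looseIndex xs = xs ! 1

-- Tight infix occurrence: still the operator in scope.
tightIndex :: [Int] -> Int
tightIndex xs = xs!1
```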

Other whitespace-wariness comes from worrying about the distinction between prefix and tight infix occurrences. That is, should we differentiate between the interpretations of `f r.x` and `f r .x`? Yet in every version of this proposal, we differentiate between loose infix and the others. Thus there is *always* whitespace-sensitivity around dot. Note that this is true, as Simon PJ pointed out, regardless of this proposal: a tight-infix usage of a dot with a capitalized identifier on the left is already taken as a module qualification. In all of its versions, this proposal *increases* the whitespace sensitivity, by further distinguishing between prefix occurrences of dot and other usages.

Let's compare options 3 and 5 with this analysis then:

Option 3:
loose-infix: whatever (.) is in scope
tight-infix:
  - if left-hand is a capitalized identifier: module qualification
  - otherwise: record selection, binding tighter than function application
prefix: postfix record selection, binding like function application
suffix: presumably, whatever (.) is in scope

Option 5:
loose-infix: whatever (.) is in scope
tight-infix:
 - if left-hand is a capitalized identifier: module qualification
 - otherwise: postfix record selection, binding like function application
prefix: postfix record selection, binding like function application
suffix: presumably, whatever (.) is in scope

My point here is that option (5) is no more or less whitespace sensitive than option (3). Both need the same cases to figure out what the period character in your code means. I think this is why Simon PJ has keyed this part of the debate to module qualification: that existing feature (not under debate) essentially breaks the symmetry here, meaning that we have more room to work with without breaking symmetry further.
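
The two tables can be written down as a small decision function, which makes it visible that both options branch on exactly the same cases. This is only a sketch: all type and function names are invented for the illustration, and `classify` simplifies the accepted proposal's definitions to "is there whitespace on each side of the dot".

```haskell
data Occurrence = LooseInfix | TightInfix | Prefix | Suffix
  deriving (Eq, Show)

-- Simplified classification: whitespace before the dot? after it?
classify :: Bool -> Bool -> Occurrence
classify True  True  = LooseInfix
classify False False = TightInfix
classify True  False = Prefix
classify False True  = Suffix

data Meaning
  = ScopedOperator       -- whatever (.) is in scope
  | ModuleQualification
  | TightSelection       -- binds tighter than function application
  | PostfixSelection     -- binds like function application
  deriving (Eq, Show)

data Option = Option3 | Option5
  deriving (Eq, Show)

-- The Bool says whether the left-hand side is a capitalized identifier.
meaning :: Option -> Bool -> Occurrence -> Meaning
meaning _       _     LooseInfix = ScopedOperator
meaning _       _     Suffix     = ScopedOperator
meaning _       _     Prefix     = PostfixSelection
meaning _       True  TightInfix = ModuleQualification
meaning Option3 False TightInfix = TightSelection
meaning Option5 False TightInfix = PostfixSelection
```

Only the last two equations differ between the options; every other case, and the set of cases itself, is shared.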

My vote is thus:

3 > 5 > 2 > 4 > 1

Other points of motivation:
- Despite my argument above, I see the merit in (5). I just think that an argument "we don't want dot to be whitespace-sensitive" isn't really effective.
- I want to accept this proposal. We're not going to get another go at this.
- I really don't like the way record-update binds, and (4) reminds me too much of that.

Richard


On Feb 10, 2020, at 9:58 AM, Simon Marlow <marlowsd at gmail.com> wrote:

On Fri, 7 Feb 2020 at 22:37, Joachim Breitner <mail at joachim-breitner.de> wrote:

I really would prefer a design where all these questions do not even
need to be asked…

Me too. Also what about (.x) vs. ( .x), are those the same?

So I think to have the full picture, we need the following option as
well on the ballot:

 5. .x is a postfix operator, binding exactly like application,
    whether it is naked or not.
    (This is option 3, but without the whitespace-sensitivity.)

[...]

Anyways, now for my opinion: Assuming no more options are added, my
ranking will be

  5 > 4 > 2 > 1 > 3

This puts first the two variants where .x behaves like an existing
language feature (either like function application or like record
updates), has no whitespace sensitivity, and follows the precedence of
existing languages (JS and OCaml, resp.).
Then the compromise solution that simply forbids putting spaces before
.x (so at least the program doesn't change semantics silently).
I dislike variant 3, which adds a _new_ special rule, and where adding
a single space can change the meaning of the program, so I rank that
last.

I'm also against whitespace-sensitivity and I lean towards this ordering too.
But I'm going with:

5 > 2 > 1 > 4 > 3

Rationale: (5) seems the easiest to explain and has the fewest special cases, yet covers the use-cases we're interested in. Beyond that I want to be conservative because I find it hard to predict the ramifications of the more-complex alternatives 4/3, so I've put 2/1 ahead of those. I've made my peace with the current record selection syntax binding more tightly than application, and indeed I often rely on it to avoid a $, so I'm OK with 4 over 3.

Cheers
Simon




Cheers,
Joachim


PS, because it's on my mind, and just for fun:

Under variant 3, both foo1 and foo2 typecheck, but they do quite
different things (well, one loops).

  data Stream a = Stream { val :: a, next :: Stream a }

  foo1 f s = Stream (s.val) (foo1 (fmap f s).next)
  foo2 f s = Stream (s.val) (foo2 (fmap f s) .next)


--
Joachim Breitner
  mail at joachim-breitner.de
  http://www.joachim-breitner.de/


_______________________________________________
ghc-steering-committee mailing list
ghc-steering-committee at haskell.org
https://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-steering-committee