Records in Haskell

Thu Jan 12 23:27:20 CET 2012

On 12/01/2012, Greg Weber <greg at gregweber.info> wrote:
> I added this and your Control.Category.<<< to the wiki.

Thanks.

> I am not sure about the tuple proposal - tuples normally imply an ordering,
> which would imply that all record fields must be accounted for at least
> with an empty comma or an underscore, particularly if updating the last
> field in a record. For records we want a syntax where we can pick out one
> or many fields to update and ignore the rest.

Sorry, my proposal was unclear. This is not what I meant; rather, I
meant that one could write
let r.(x, y) = (x', y')
whatever other fields might be in r. I clarify further on the wiki.

That said, I notice now that this syntax is quite verbose, far more so
than the .{} syntax, which is a loss. I think the brevity worth the
added complexity.

> My feeling on <<< is that <~ is slightly more intuitive than <<< because it
> looks like an arrow which I equate with functions, and <<< is more
> difficult to parse because I have to recognize three in a row of the same
> character . However, if everyone likes using the unicode dot, then it
> doesn't matter what the non-unicode symbol is, and re-using existing
> symbols is certainly advantageous.
>
>
>
> On Thu, Jan 12, 2012 at 10:02 AM, Matthew Farkas-Dyck
> <strake888 at gmail.com>wrote:
>
>> On 09/01/2012, Isaac Dupree <ml at isaac.cedarswampstudios.org> wrote:
>> > You mean this wiki page, right?:
>> > http://hackage.haskell.org/trac/ghc/wiki/Records/NameSpacing
>> >
>> >> That is, there are no fundamental
>> >> objections to the implementation of this records implementation.
>> >
>> > I think that might be overly optimistic... I think there's a risk that
>> > SPJ finds an irritating complication to type inference & the rest of us
>> > aren't type-system-savvy enough to continue trying to guess at that :)
>> > But I think you're referring to whether we object to ad-hoc overloading
>> > of record field names (neither parametric nor class-based polymorphism),
>> > if no difficulties crop up.  Some of the concerns on
>> > http://www.haskell.org/haskellwiki/TypeDirectedNameResolution apply --
>> > I'm not sure to what extent, but address those concerns rather than
>> > require those people to repeat themselves again! :)
>> >
>> > (If we dodge all those obstacles, well, a better record system is
>> better!)
>> >
>> >
>> > Regardless, I think the proposal needs more precision, so I tried for
>> > syntax.  And got this thousand word email just about syntax, in several
>> > sections of reasoning, sorry!
>> >
>> > --so here are my conclusions/suggestions up front in case you prefer,
>> > most complicated first - look later for details in a different order,
>> > referenced by [number].
>> >
>> >
>> > Given that type inference for expr.field decides between several
>> > different possible meanings of "field", which "field"s in scope count as
>> > possibilities?  I concluded "All identifiers 'field' in top-level scope
>> > (i.e. declared in this module or imported unqualified from another
>> > module), but no local let/lambda bindings." [1]
>> >
>> > I have an unrelated record/module system tweak suggestion to prevent
>> > leaks of the field-ness of exports meant only as functions. [2]
>> >
>> > ".field", for any identifier "field", is a postfix operator that binds
>> > more tightly than function application. [1]
>> >
>> > I don't care whether the expression and ".field" are permitted to be
>> > separated by whitespace or not. [4]
>> >
>> > "(.field)" as a section: should it always require parentheses? (assuming
>> > there is a way to type-inference the unapplied dot operator usefully at
>> > all). [3]
>> >
>> > The Type.{...} / expr.{...} variations look probably fine ("Syntax for
>> > updates" in the wiki). [5]
>> >
>> > Allow "TyCon.field" for selecting a specific version of "field", but
>> > only if there isn't a module in scope as "TyCon" in which case it's
>> > interpreted as "Module.field". [7]
>> >
>> > Allow "expr.TyCon.field" / "expr.Module.field" wherever "expr.field" is
>> > allowed. [8]
>> >
>> > I bikesheddily suggest "field at expr" rather than "expr.field" (I'd be
>> > alright with either/both). [6]
>> >
>> >
>> >
>> >
>> > ===== Defining the basic dot syntax for record field selection =====
>> >
>> > [1]
>> > .x, for any identifier x, is a postfix operator that binds more tightly
>> > than function application.  This operator does not constrain the type of
>> > its operand or its result.  After regular type inference, the type
>> > system collects in-scope "x"s and does something to try to choose one.
>> >
>> > (As a non-type-system-expert, I'm not addressing what "does something"
>> > means in this email, besides capitalizing TYPE INFERENCE where I suspect
>> > there might (or might not) be extra problems.  Anyway, I wish the
>> > algorithm do the right thing for "expr.x.y.z".)
>> >
>> >
>> > But which in-scope "x"es does it collect?  Regular Haskell scope would
>> > completely break "let x = var.x" if we wished to allow "let x = var.x".
>> >
>> > How about: all record fields 'x' declared in this module or imported
>> > (qualified[??] or unqualified) from another module.
>> >
>> > [[[
>> > Should qualified ones be allowed?
>> > Pro: if you import Data.Map qualified, as is common (let's just pretend
>> > that Maps have members), then someMap.field works.
>> >
>> > Pro: it's like allowing qualified imports for instance declaration
>> > class-member-definitions.
>> >
>> > Con: it's not really like that. It makes qualified imports a weaker
>> > protection, as the class/instance exception can lead to no ambiguity,
>> > but this can lead to ambiguity.  The PVP would make a sad face.
>> >
>> > Con: using unqualified import with (..) would easily bring the field
>> > names into scope.  Fictitiously, "import qualified Data.Map as Map;
>> > import Data.Map(Map(..))".
>> >
>> > Observation: allowing qualified imports, but not following the
>> > class/instance system's style of including everything in the transitive
>> > closure of imported modules, still prevents you (Pro) from breaking
>> > intentional abstraction barriers, but (Con?) requires you to import the
>> > operators for types you receive but don't import.
>> >
>> > Opinion: only unqualified imports should be part of the selection
>> process.
>> > ]]]
>> >
>> > [[[
>> > Problem: Restricting the selection to only record fields further
>> > compromises an existing imperfect property of Haskell:
>> > module Library (Type, constructor, deconstructor) where
>> > data Type = Constructor { deconstructor :: Int }
>> > -- let's pretend it's a bounds-limited int or such.
>> > constructor int | int >= 3 && int < 17 = Constructor int
>> >
>> > Currently, importers of the module can observe that 'deconstructor' is a
>> > record-field by importing Library(Type(..)) and getting 'deconstructor'
>> > (see [2]--can we change that.).  This makes it slightly harder for the
>> > library implementer to change that name to a non-record-field.  In the
>> > proposal, the users might also have gotten used to "expr.deconstructor",
>> > and there would be no way to replace that syntax.
>> >
>> > Possible fix: also require all the type's data-constructors to be in
>> > scope.  I think that's too big a hammer though.  We could punt.  We
>> > could change the selection to "all top-level names 'x' declared in this
>> > module or imported unqualified from another module."
>> >
>> > Opinion: "all top-level names 'x' declared in this module or imported
>> > unqualified from another module." is better and not worse than
>> > restricting it to record-fields (provided that it does not burden the
>> > type inferencer complexity).
>> >
>> > Problem: Given that, it's annoying that you can't bind a
>> > record-field-ish-selector in a let/lambda-binding.
>> > Possible solution: allow "let .deconstruct = \x -> x - 100", possibly
>> > with sugary variations.  Possibly require a type-signature.
>> > Possibility: also allow ".deconstruct = " at top level.  (If the dot
>> > notation does funny things with TYPE INFERENCE, this might be a dubious
>> > idea.)
>> > Possible solution: consider *both* local bindings *and* the top-level
>> > names that they would normally shadow.  (That sounds rather odd; it
>> > might work since most of the local bindings with those names will be
>> > non-functions and thus not eligible; is it worth it?)
>> > Opinion: Just let it be annoying that you can't bind a
>> > record-field-ish-selector in a let/lambda-binding.  (You can't declare
>> > data or classes in let bindings either.  Any of these can be improved
>> > but I don't think we need to just to have a record system.)
>> >
>> > [2]
>> > We could make this leak fixable thus:
>> >
>> > "module Library (Type, deconstructor)"
>> > does not make "import Library(Type(..))" import "deconstructor"
>> > but
>> > "module Library (Type(deconstructor))" or "module Library (Type(..))"
>> > do make "import Library(Type(..))" import "deconstructor" (and make
>> > "import Library(Type(deconstructor))" valid).
>> >
>> > The difference could even extend to not making "deconstructor" usable in
>> > any record syntax (construction, pattern matching, and record update)
>> > unless it's imported from somewhere that it's exported syntactically
>> > under its type.
>> >
>> > This might make existing code break.
>> > Does anyone think this change might be a good idea?
>> > ]]]
>> >
>> > Suggestion: select all identifiers 'x' in top-level scope (i.e. declared
>> > in this module or imported unqualified from another module).
>> >
>> >
>> > ===== Miscellaneous dot-syntax observations =====
>> >
>> > [3]
>> > Observation: The point-free thought "b . .a" would be exactly the same
>> > as the "b . a" we have now with record fields -- except that it behaves
>> > a bit differently regarding scope/TYPE INFERENCE.  It implies that (.a)
>> > is a section of the unary postfix record-field-selection operator
>> > (similar to (1 +) being an operator section).  "b . .a" also suggests
>> > not requiring parentheses for that section when parsing precedence would
>> > not require it... for example, parsing precedence would require it in
>> > "map (.a) list" if "map .a list" meant "(map.a) list".
>> >
>> > [4]
>> > Should "identifier .field" be disallowed because it's almost certainly a
>> > mistake?  But
>> > "
>> >          (some long expression here)
>> >                   .field
>> > "
>> > probably isn't a mistake, so, shrugs.  I think it would be equally
>> > plausible to require the non-section version of dot to have no spaces on
>> > either side, or only require no-space on the right hand side of the dot.
>> >
>> > [5]
>> > The "Syntax for updates" from Frege in the wiki looks fine and
>> > syntactically unambiguous to me ("identifier.{" without spaces) - I'd
>> > want to think about it later but it seems unlikely to me to go terribly
>> > wrong.  When you say "If a::T then a.{x=} and a.{x=42} are valid", okay
>> > they're valid but as what? as "T.{x=} a" and "T.{x=42} a", or without
>> > the "a" argument? e.g. say
>> > "If a::T then a.{x=} and a.{x=42} are equivalent to (T.{x=} a) and
>> > (T.{x=42} a)".  Does TYPE INFERENCE have any more trouble with those
>> > than with regular field selection (I don't know)?
>>
>> On January 8th, 2012 CE, I wrote:
>> > Perhaps we could use let-syntax, thus:
>> > let { r.x = x'; r.y = y'; r.z = z'; } in r
>> >
>> > If we allow tuples of selectors, thus:
>> > r.(x, y, z) = (r.x, r.y, r.z)
>> > then one can simply write
>> > let r.(x, y, z) = (x', y', z') in r
>>
>> I once more propose this syntax (or the like).
>> Thus the language would be simpler (little/no new syntax to define),
>> and it would keep to the principle of Least Surprise (little/no new
>> syntax to learn).
>> I have not seen any comments on this – is there any consent? dissent?
>>
>> On 09/01/2012, Isaac Dupree <ml at isaac.cedarswampstudios.org> wrote:
>> > [6]
>> > If we want to bikeshed about what the operator should be ("." or other)
>> > :
>> > As
>> >
>> http://hackage.haskell.org/trac/haskell-prime/wiki/TypeDirectedNameResolution
>> > says about using dot, "It's standard practice, and that counts for a
>> lot."
>> > But if we want to bikeshed and look for an operator that orders things
>> > the same way as function application and composition...
>> > record.field
>> > vs
>> > field at record
>> > .
>> > "@" is already reserved.  It could bind more tightly than function
>> > application and type-inference differently.  Because it's reserved, it
>> > doesn't need to require no-spaces-around-it.
>> > Here's example from that TDNR page:
>> > typical OO: "x.f(3).g(v,w).h"
>> > how it would have to be in Haskell+TDNR-dot-syntax: "((x.f 3).g v w).h"
>> > or with @ instead of .: "h@(g@(f at x 3) v w)"
>> > Which is IMHO only minorly better... but then again, the Frege-ish
>> > record proposal might only be doing TDNR for records specifically.
>> > (more syntax and semantic discussion at
>> > http://www.haskell.org/haskellwiki/TypeDirectedNameResolution ).
>> >
>> > ...I kind of like @, but think it's a bikeshed: I claim that, if we work
>> > out the semantics (big "if"!), that not enough of us are going to say
>> > "it's better to have no records than dot-syntax records" [ditto for
>> > every other syntax] that the best decision would be "no records!".  If
>> > this becomes a popular bikeshed, we might just try straw-polling and
>> > picking what's popular (rather than get distracted and waste a year -- a
>> > common discussion outcome!).  (Though, feelings might be stronger than
>> > most bikesheds, for such a core language + syntax change...hmm.)
>> >
>> >
>> >
>> > ===== Module-related stuff =====
>> >
>> > It would be permitted to declare two records in the same scope with the
>> > same record field name.
>> >
>> > GHC would not warn if you shadow record field names with
>> > lambda/let-bound variables (It currently does, quite reasonably, warn,
>> > iff you enabled -fwarn-name-shadowing && not -XNamedFieldPuns.)  This,
>> > along with all the other changes, would be contingent on a new flag like
>> > -XNamedFieldDots.
>> >
>> > [7]
>> > We would add a syntax "TyCon.name".
>> >
>> > Thorough option for TyCon.name:
>> > (1) If there is a module in scope as TyCon containing a type TyCon with
>> > a field "name", it would choose that field.
>> > (2) Else if there is a module import in scope as TyCon containing an
>> > unqualified, unambiguous* identifier "name", it would choose that.
>> > *["unambiguous" meaning it doesn't have two different records with
>> > exported fields named "name".]
>> > (3) Else if there is a type-constructor in scope TyCon with a
>> > named-field "name", it would choose that.
>> > (4) Else it would be an error.
>> >
>> > I think rule (1) can be deleted without changing anything.
>> >
>> > People usually use module names with dots in them, and type-constructors
>> > cannot have dots in them.  If a single module decides to import one
>> > module "as" the exact name of a type imported from an entirely different
>> > module, and the two happen to have some of the same identifiers, perhaps
>> > it's okay for silliness to ensue.  In fact, given the unlikeliness of
>> > inconsistent overlaps like that, I suspect that:
>> >
>> > Simple option:
>> > (1) if there's a module in scope of that name, it means module scope
>> > even if that means the lookup fails
>> > (2) if there's not a module, see if it can be a type name
>> >
>> > is equally fine and better because it's simpler.  IMHO we shouldn't put
>> > these two rules in the other order because it has the ability to break
>> > existing code only for the benefit of something that hardly matters
>> > either way at all.
>> >
>> > None of these changes can break existing code.  The only breaking change
>> > that "-XNamedFieldDots" would introduce is a different meaning of a dot
>> > followed without spaces by a lowercase letter.
>> >
>> > [8]
>> > I suggest we should allow expr.TyCon.field (and expr.Module.field I
>> > guess) for field-selection too.  It's irritating when a syntax can't be
>> > qualified without rearranging things (to e.g. "(TyCon.field expr)" or
>> > e.g. "(expr::TyCon Int).field" [for single-parameter TyCons like
>> > Maybe]).
>> >
>> >
>> > There is no specific interaction with type-classes because Haskell
>> > type-classes do not behave like Frege type-classes (as best I can tell
>> > from this discussion).
>> >
>> >
>> > _______________________________________________
>> > Glasgow-haskell-users mailing list
>> > Glasgow-haskell-users at haskell.org
>> > http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
>> >
>>
>> _______________________________________________
>> Glasgow-haskell-users mailing list
>> Glasgow-haskell-users at haskell.org
>> http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
>>
>