Records in Haskell

Matthew Farkas-Dyck strake888 at gmail.com
Thu Jan 12 14:02:07 CET 2012


On 09/01/2012, Isaac Dupree <ml at isaac.cedarswampstudios.org> wrote:
> You mean this wiki page, right?:
> http://hackage.haskell.org/trac/ghc/wiki/Records/NameSpacing
>
>> That is, there are no fundamental
>> objections to the implementation of this records implementation.
>
> I think that might be overly optimistic... I think there's a risk that
> SPJ finds an irritating complication to type inference & the rest of us
> aren't type-system-savvy enough to continue trying to guess at that :)
> But I think you're referring to whether we object to ad-hoc overloading
> of record field names (neither parametric nor class-based polymorphism),
> if no difficulties crop up.  Some of the concerns on
> http://www.haskell.org/haskellwiki/TypeDirectedNameResolution apply --
> I'm not sure to what extent, but address those concerns rather than
> require those people to repeat themselves again! :)
>
> (If we dodge all those obstacles, well, a better record system is better!)
>
>
> Regardless, I think the proposal needs more precision, so I tried for
> syntax.  And got this thousand word email just about syntax, in several
> sections of reasoning, sorry!
>
> --so here are my conclusions/suggestions up front in case you prefer,
> most complicated first - look later for details in a different order,
> referenced by [number].
>
>
> Given that type inference for expr.field decides between several
> different possible meanings of "field", which "field"s in scope count as
> possibilities?  I concluded "All identifiers 'field' in top-level scope
> (i.e. declared in this module or imported unqualified from another
> module), but no local let/lambda bindings." [1]
>
> I have an unrelated record/module system tweak suggestion to prevent
> leaks of the field-ness of exports meant only as functions. [2]
>
> ".field", for any identifier "field", is a postfix operator that binds
> more tightly than function application. [1]
>
> I don't care whether the expression and ".field" are permitted to be
> separated by whitespace or not. [4]
>
> "(.field)" as a section: should it always require parentheses? (assuming
> there is a way to type-inference the unapplied dot operator usefully at
> all). [3]
>
> The Type.{...} / expr.{...} variations look probably fine ("Syntax for
> updates" in the wiki). [5]
>
> Allow "TyCon.field" for selecting a specific version of "field", but
> only if there isn't a module in scope as "TyCon" in which case it's
> interpreted as "Module.field". [7]
>
> Allow "expr.TyCon.field" / "expr.Module.field" wherever "expr.field" is
> allowed. [8]
>
> I bikesheddily suggest "field at expr" rather than "expr.field" (I'd be
> alright with either/both). [6]
>
>
>
>
> ===== Defining the basic dot syntax for record field selection =====
>
> [1]
> .x, for any identifier x, is a postfix operator that binds more tightly
> than function application.  This operator does not constrain the type of
> its operand or its result.  After regular type inference, the type
> system collects in-scope "x"s and does something to try to choose one.
>
> (As a non-type-system-expert, I'm not addressing what "does something"
> means in this email, besides capitalizing TYPE INFERENCE where I suspect
> there might (or might not) be extra problems.  Anyway, I wish the
> algorithm do the right thing for "expr.x.y.z".)
>
>
> But which in-scope "x"es does it collect?  Regular Haskell scope would
> completely break "let x = var.x" if we wished to allow "let x = var.x".
>
> How about: all record fields 'x' declared in this module or imported
> (qualified[??] or unqualified) from another module.
>
> [[[
> Should qualified ones be allowed?
> Pro: if you import Data.Map qualified, as is common (let's just pretend
> that Maps have members), then someMap.field works.
>
> Pro: it's like allowing qualified imports for instance declaration
> class-member-definitions.
>
> Con: it's not really like that. It makes qualified imports a weaker
> protection, as the class/instance exception can lead to no ambiguity,
> but this can lead to ambiguity.  The PVP would make a sad face.
>
> Con: using unqualified import with (..) would easily bring the field
> names into scope.  Fictitiously, "import qualified Data.Map as Map;
> import Data.Map(Map(..))".
>
> Observation: allowing qualified imports, but not following the
> class/instance system's style of including everything in the transitive
> closure of imported modules, still prevents you (Pro) from breaking
> intentional abstraction barriers, but (Con?) requires you to import the
> operators for types you receive but don't import.
>
> Opinion: only unqualified imports should be part of the selection process.
> ]]]
>
> [[[
> Problem: Restricting the selection to only record fields further
> compromises an existing imperfect property of Haskell:
> module Library (Type, constructor, deconstructor) where
> data Type = Constructor { deconstructor :: Int }
> -- let's pretend it's a bounds-limited int or such.
> constructor int | int >= 3 && int < 17 = Constructor int
>
> Currently, importers of the module can observe that 'deconstructor' is a
> record-field by importing Library(Type(..)) and getting 'deconstructor'
> (see [2]--can we change that.).  This makes it slightly harder for the
> library implementer to change that name to a non-record-field.  In the
> proposal, the users might also have gotten used to "expr.deconstructor",
> and there would be no way to replace that syntax.
>
> Possible fix: also require all the type's data-constructors to be in
> scope.  I think that's too big a hammer though.  We could punt.  We
> could change the selection to "all top-level names 'x' declared in this
> module or imported unqualified from another module."
>
> Opinion: "all top-level names 'x' declared in this module or imported
> unqualified from another module." is better and not worse than
> restricting it to record-fields (provided that it does not burden the
> type inferencer complexity).
>
> Problem: Given that, it's annoying that you can't bind a
> record-field-ish-selector in a let/lambda-binding.
> Possible solution: allow "let .deconstruct = \x -> x - 100", possibly
> with sugary variations.  Possibly require a type-signature.
> Possibility: also allow ".deconstruct = " at top level.  (If the dot
> notation does funny things with TYPE INFERENCE, this might be a dubious
> idea.)
> Possible solution: consider *both* local bindings *and* the top-level
> names that they would normally shadow.  (That sounds rather odd; it
> might work since most of the local bindings with those names will be
> non-functions and thus not eligible; is it worth it?)
> Opinion: Just let it be annoying that you can't bind a
> record-field-ish-selector in a let/lambda-binding.  (You can't declare
> data or classes in let bindings either.  Any of these can be improved
> but I don't think we need to just to have a record system.)
>
> [2]
> We could make this leak fixable thus:
>
> "module Library (Type, deconstructor)"
> does not make "import Library(Type(..))" import "deconstructor"
> but
> "module Library (Type(deconstructor))" or "module Library (Type(..))"
> do make "import Library(Type(..))" import "deconstructor" (and make
> "import Library(Type(deconstructor))" valid).
>
> The difference could even extend to not making "deconstructor" usable in
> any record syntax (construction, pattern matching, and record update)
> unless it's imported from somewhere that it's exported syntactically
> under its type.
>
> This might make existing code break.
> Does anyone think this change might be a good idea?
> ]]]
>
> Suggestion: select all identifiers 'x' in top-level scope (i.e. declared
> in this module or imported unqualified from another module).
>
>
> ===== Miscellaneous dot-syntax observations =====
>
> [3]
> Observation: The point-free thought "b . .a" would be exactly the same
> as the "b . a" we have now with record fields -- except that it behaves
> a bit differently regarding scope/TYPE INFERENCE.  It implies that (.a)
> is a section of the unary postfix record-field-selection operator
> (similar to (1 +) being an operator section).  "b . .a" also suggests
> not requiring parentheses for that section when parsing precedence would
> not require it... for example, parsing precedence would require it in
> "map (.a) list" if "map .a list" meant "(map.a) list".
>
> [4]
> Should "identifier .field" be disallowed because it's almost certainly a
> mistake?  But
> "
>          (some long expression here)
>                   .field
> "
> probably isn't a mistake, so, shrugs.  I think it would be equally
> plausible to require the non-section version of dot to have no spaces on
> either side, or only require no-space on the right hand side of the dot.
>
> [5]
> The "Syntax for updates" from Frege in the wiki looks fine and
> syntactically unambiguous to me ("identifier.{" without spaces) - I'd
> want to think about it later but it seems unlikely to me to go terribly
> wrong.  When you say "If a::T then a.{x=} and a.{x=42} are valid", okay
> they're valid but as what? as "T.{x=} a" and "T.{x=42} a", or without
> the "a" argument? e.g. say
> "If a::T then a.{x=} and a.{x=42} are equivalent to (T.{x=} a) and
> (T.{x=42} a)".  Does TYPE INFERENCE have any more trouble with those
> than with regular field selection (I don't know)?

On January 8th, 2012 CE, I wrote:
> Perhaps we could use let-syntax, thus:
> let { r.x = x'; r.y = y'; r.z = z'; } in r
>
> If we allow tuples of selectors, thus:
> r.(x, y, z) = (r.x, r.y, r.z)
> then one can simply write
> let r.(x, y, z) = (x', y', z') in r

I once more propose this syntax (or the like).
Thus the language would be simpler (little/no new syntax to define),
and it would keep to the principle of Least Surprise (little/no new
syntax to learn).
I have not seen any comments on this – is there any consent? dissent?

On 09/01/2012, Isaac Dupree <ml at isaac.cedarswampstudios.org> wrote:
> [6]
> If we want to bikeshed about what the operator should be ("." or other) :
> As
> http://hackage.haskell.org/trac/haskell-prime/wiki/TypeDirectedNameResolution
> says about using dot, "It's standard practice, and that counts for a lot."
> But if we want to bikeshed and look for an operator that orders things
> the same way as function application and composition...
> record.field
> vs
> field at record
> .
> "@" is already reserved.  It could bind more tightly than function
> application and type-inference differently.  Because it's reserved, it
> doesn't need to require no-spaces-around-it.
> Here's example from that TDNR page:
> typical OO: "x.f(3).g(v,w).h"
> how it would have to be in Haskell+TDNR-dot-syntax: "((x.f 3).g v w).h"
> or with @ instead of .: "h@(g@(f at x 3) v w)"
> Which is IMHO only minorly better... but then again, the Frege-ish
> record proposal might only be doing TDNR for records specifically.
> (more syntax and semantic discussion at
> http://www.haskell.org/haskellwiki/TypeDirectedNameResolution ).
>
> ...I kind of like @, but think it's a bikeshed: I claim that, if we work
> out the semantics (big "if"!), that not enough of us are going to say
> "it's better to have no records than dot-syntax records" [ditto for
> every other syntax] that the best decision would be "no records!".  If
> this becomes a popular bikeshed, we might just try straw-polling and
> picking what's popular (rather than get distracted and waste a year -- a
> common discussion outcome!).  (Though, feelings might be stronger than
> most bikesheds, for such a core language + syntax change...hmm.)
>
>
>
> ===== Module-related stuff =====
>
> It would be permitted to declare two records in the same scope with the
> same record field name.
>
> GHC would not warn if you shadow record field names with
> lambda/let-bound variables (It currently does, quite reasonably, warn,
> iff you enabled -fwarn-name-shadowing && not -XNamedFieldPuns.)  This,
> along with all the other changes, would be contingent on a new flag like
> -XNamedFieldDots.
>
> [7]
> We would add a syntax "TyCon.name".
>
> Thorough option for TyCon.name:
> (1) If there is a module in scope as TyCon containing a type TyCon with
> a field "name", it would choose that field.
> (2) Else if there is a module import in scope as TyCon containing an
> unqualified, unambiguous* identifier "name", it would choose that.
> *["unambiguous" meaning it doesn't have two different records with
> exported fields named "name".]
> (3) Else if there is a type-constructor in scope TyCon with a
> named-field "name", it would choose that.
> (4) Else it would be an error.
>
> I think rule (1) can be deleted without changing anything.
>
> People usually use module names with dots in them, and type-constructors
> cannot have dots in them.  If a single module decides to import one
> module "as" the exact name of a type imported from an entirely different
> module, and the two happen to have some of the same identifiers, perhaps
> it's okay for silliness to ensue.  In fact, given the unlikeliness of
> inconsistent overlaps like that, I suspect that:
>
> Simple option:
> (1) if there's a module in scope of that name, it means module scope
> even if that means the lookup fails
> (2) if there's not a module, see if it can be a type name
>
> is equally fine and better because it's simpler.  IMHO we shouldn't put
> these two rules in the other order because it has the ability to break
> existing code only for the benefit of something that hardly matters
> either way at all.
>
> None of these changes can break existing code.  The only breaking change
> that "-XNamedFieldDots" would introduce is a different meaning of a dot
> followed without spaces by a lowercase letter.
>
> [8]
> I suggest we should allow expr.TyCon.field (and expr.Module.field I
> guess) for field-selection too.  It's irritating when a syntax can't be
> qualified without rearranging things (to e.g. "(TyCon.field expr)" or
> e.g. "(expr::TyCon Int).field" [for single-parameter TyCons like Maybe]).
>
>
> There is no specific interaction with type-classes because Haskell
> type-classes do not behave like Frege type-classes (as best I can tell
> from this discussion).
>
>
> _______________________________________________
> Glasgow-haskell-users mailing list
> Glasgow-haskell-users at haskell.org
> http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
>



More information about the Glasgow-haskell-users mailing list