[Haskell-cafe] Some thoughts on Type-Directed Name Resolution
Steve Horne
sh006d3592 at blueyonder.co.uk
Wed Feb 8 15:16:37 CET 2012
On 07/02/2012 22:56, Richard O'Keefe wrote:
> On 8/02/2012, at 2:11 AM, Steve Horne wrote:
>
>> To be fair, "field OF record" isn't bad in that sense. However, it would defeat the purpose of TDNR - the record isn't first, and therefore cannot be used (given a left-to-right typing direction) as a context to offer member name suggestions.
> Yes, but why SHOULD there be a specific typing direction?
> ML manages perfectly fine without it.
For the only reason that any language feature should exist - because it
is useful. In any language with a rich library, it is useful to get
hints as to which names are available in a particular context. It saves
on the need to memorize thousands - sometimes tens or even hundreds of
thousands - of context-sensitive names and their spellings, and saves on
getting distracted needing to hunt through manuals.
> - #1;
> stdIn:1.1-1.3 Error: unresolved flex record
> (can't tell what fields there are besides #1)
> - #1 (true,3);
> val it = true : bool
> - #1 (42,"stuff",false);
> val it = 42 : int
>
> If a right-to-left "typing direction" works well for #field record
> in one language with constrained Hindley-Milner types, why would it
> not work well for field¶ record in another language with constrained
> Hindley-Milner types?
Parsers don't need to care much about left-to-right vs. right-to-left.
There can be stack size issues in principle, but that hasn't stopped
Haskell offering both left-associative and right-associative infix
operators. The ordering has significance in certain ways in functional
languages WRT e.g. currying, but that isn't really relevant here. In any
case, currying is left-to-right anyway - the left-most argument is
curried first.
The point here is for intellisense-like features to work effectively in
text editors. The context must come to the left for that to work because...
1. Searching for all possible names within a particular context is
easier, and generally more likely to be what is needed, than
searching for all possible contexts that contain a particular name.
2. It's easier to type the context, then the marker, then select/type
the name than it is to type the marker then the context, then cursor
back to before the marker, *then* select the name, then cursor back
to after the context.
> Why sacrifice readability (field name precedes record) for the sake
> of, well, for the sake of what exactly escapes me.
It doesn't sacrifice readability. The left-to-right order has been
chosen by most programming languages, and also used in many other
contexts, because many people find it very natural to start from the
most general and step down to the more specific in a left-to-right
direction. For example, <chapter>.<section>.<subsection>, or
<foldername>/<foldername>/<filename>.
The left-to-right order isn't especially important in general - but for
intellisense it is.
>> Also, even when I used COBOL (late eighties, early nineties) I'm pretty sure it supported "record.field".
> That certainly wasn't the case up to COBOL-85. I don't have a copy of COBOL 2002,
> so I can't speak for that, but COBOL 74 and COBOL 85 are the only candidates for those
> dates, and they definitely did NOT support record.field. Since '.' is the statement
> terminator in COBOL, it's intrinsically unlikely.
> (You did *check* a COBOL syntax summary, easily found on the web, before posting? Which?)
If I had checked, I wouldn't have said "pretty sure", would I? Those words
are generally acknowledged as indicating that someone is working from
fallible memory.
That said, I did take a look in an old COBOL book. I didn't find either
the dot or the "OF". I vaguely remember that the original COBOL textbook
I had actually considered the SORT command so intrinsically difficult
that it was outside of the scope of the book. For various reasons, I
find it difficult to take COBOL seriously at all, though I wish I had
kept that original textbook that didn't cover SORT - it was a much
better joke than the book I kept.
>> On the more general point of choosing an alternative operator, I agree to a point, but familiarity does count for something. Others will point out that Haskell dares to be different, but it's possible to be too daring and too different. Being different for the sake of being different is for those teenagers who go on about being "random" and whatever else they go on about these days. The success of languages like Java, C# and C++ is based on familiarity.
> Using pointy brackets for generic parameters and :: for name scope were not familiar
> when C++ introduced them. And there was prior art in other languages for *both* of those.
>
> One common prior practice, relevantly enough, was '.' for name scope.
Yes, but C++ also dares to be different, and there's a principled reason
for having multiple selection operators in C++. There are multiple
namespaces involved. For a smart pointer, for example, . and -> access
different namespaces. :: accesses a different namespace too - containing
mostly the same names, but treating them a different way so that
arguably you access different things via those names (or perhaps
different aspects of the same things).
Haskell already has a . for selecting a name through a context - we call
that context a module. According to Bertrand Meyer of Eiffel fame, a
class is both a module and a type. Therefore, logically, a type can be a
module. "Module" can be considered a general term for a container of
named items, basically. But I've already raised the point about simply
deciding that record types and record values are modules before.
>> I think Haskell should dare to be different when there's a point to that - where necessary based on a principle. We have type classes rather than OOP classes for a principled reason. We have the IO monad rather than effectful functions for a principled reason.
> And if C++ can break with prior practice for a practical reason, Haskell can break with prior practice
> for the same reason: not breaking existing code, fitting into the existing language structure as well
> as practical.
On not breaking existing code, I'm with you. I once got beaten up rather
badly for arguing against PEP 238 (changing the semantics of integer
division in Python). Although Python seems to have survived, I still
believe it was the wrong thing to do.
It would be nice to have some lexical disambiguation in this case - I
might prefer some other spelling, so long as the context is on the left
and the name is on the right. I was going to propose "?", but that's
taken already for implicit parameters - which I don't know the first
thing about so I can't guess possible conflicts.
I even wondered about adapting the field update syntax so that it no
longer does updates, giving something like...
<value> { <fieldname> }
This might even be extended as a shorthand for tuples of fields. Perhaps
its biggest problems are that it's suggestive of [] for array indexing,
which of course Haskell doesn't support, and that it's too close to the
record-update syntax for an expression that doesn't return a record, and
therefore may be confusing.
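For comparison, here is a minimal sketch of the existing record-update syntax that the brace notation above would sit next to. The Person type and the function names are hypothetical, chosen only for illustration.

```haskell
-- Hypothetical record, used only for illustration.
data Person = Person { name :: String, age :: Int }
  deriving (Show, Eq)

-- Existing record-update syntax: the braces hold field assignments and
-- the whole expression yields a new record of the same type.
birthday :: Person -> Person
birthday p = p { age = age p + 1 }

-- Several fields can be updated in a single expression.
rename :: String -> Int -> Person -> Person
rename n a p = p { name = n, age = a }

main :: IO ()
main = do
  let p = Person { name = "Ann", age = 30 }
  print (birthday p)
  print (rename "Bob" 41 p)
```

In both functions the braces produce a record, which is why reusing them for an expression that returns a field value could mislead readers.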
One question, though, is... does using the dot in itself break existing
code?
Take my view about record types and values being modules and I'm not
convinced it does. Function composition has functions to the left (and
right, of course). Neither a record type nor a record value is a function.
Provided the dot has a record type or record value to the left and a
field name to the right, there shouldn't be any conflict with existing
function composition.
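As a rough illustration of the two roles the dot already plays today, the sketch below uses it once for function composition (a function on each side) and once for module qualification (a module name on the left, no spaces). The names shout and firstUpper are made up for this example.

```haskell
import qualified Data.Char

-- Dot as function composition: a function on each side of the operator.
shout :: String -> String
shout = map Data.Char.toUpper . filter Data.Char.isAlpha

-- Dot as module qualification: a module name on the left, written
-- without surrounding spaces.
firstUpper :: Char
firstUpper = Data.Char.toUpper 'a'

main :: IO ()
main = do
  putStrLn (shout "tdnr, please!")
  print firstUpper
```

In neither use is the left operand a record type or a record value, which is the gap the argument above relies on.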
>> If we don't have traditional field-selection for a principled reason
> We don't have it because we don't need it. And we don't need it because traditional field selection
> serves two roles: *selecting* one field and *updating* one field. It's a poor way to handle the
> latter use case, because one often needs to update more than one field. It's not _that_ good for
> the former use case either, if you need to access more than two fields from the same record.
Actually, traditional field selection only selects the field. The update
is handled by an assignment operator - it really doesn't make sense to
try to tie those two operators together. Arguably, the field selection
doesn't even read the field - it returns a field identifying thingy
(possibly not the correct technical term) which will convert (causing
the read) later, when it's found to be an rvalue rather than an lvalue.
In principle, Haskell *could* try something similar, using first-class
values for field-ids that are statically typed for the record and field
types. For modelling imperative code in a pure functional way, it would
be possible to do something like this in Haskell now for use with the
State monad, though the notation wouldn't be so clean.
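A minimal sketch of what such first-class field ids might look like, assuming a simple getter/setter pair. The names Field, getF, setF and modifyF and the Person type are invented for this example and are not part of any proposal; the State monad is left out to keep the sketch within the base library.

```haskell
-- Hypothetical record, used only for illustration.
data Person = Person { name :: String, age :: Int }
  deriving (Show, Eq)

-- A field identifier as an ordinary value, statically typed by the
-- record type r and the field type a.
data Field r a = Field
  { getF :: r -> a       -- read the field
  , setF :: a -> r -> r  -- return a copy of the record with the field replaced
  }

nameField :: Field Person String
nameField = Field name (\n p -> p { name = n })

ageField :: Field Person Int
ageField = Field age (\a p -> p { age = a })

-- Apply a function to one field, leaving the rest of the record alone.
modifyF :: Field r a -> (a -> a) -> r -> r
modifyF f g r = setF f (g (getF f r)) r

main :: IO ()
main = do
  let p = Person { name = "Ann", age = 30 }
  print (getF ageField p)
  print (modifyF ageField (+ 1) p)
  print (setF nameField "Bob" p)
```

The field id is only turned into a read or a write at the point of use, roughly mirroring the lvalue/rvalue distinction described above.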
On accessing more than two fields at once, pattern matching isn't going
away.
On selecting one field for a read, one of the reasons for the proposal
is because quite a few people find that defining a function in a global
(or whole-module) scope is a poor way to handle field selection - in
particular because you can't have the same field name in several types
due to the naming conflict.
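To make the naming conflict concrete, here is a small sketch with two hypothetical record types. In a Haskell 2010 module, giving both types a field called name is rejected as a duplicate definition, so the usual workaround is a type-name prefix on every field.

```haskell
-- Hypothetical types, used only for illustration.
-- Each field declaration also defines a function at module scope, so the
-- two records cannot both have a field called 'name'. The prefixes below
-- are the conventional workaround.
data Person = Person { personName :: String, personAge :: Int }
  deriving (Show, Eq)

data Company = Company { companyName :: String, companyStaff :: [Person] }
  deriving (Show, Eq)

-- The field access functions are ordinary module-level functions.
staffNames :: Company -> [String]
staffNames = map personName . companyStaff

main :: IO ()
main = do
  let acme = Company "Acme" [Person "Ann" 30, Person "Bob" 41]
  putStrLn (companyName acme)
  print (staffNames acme)
```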
> In another functional language that I use, I've noticed what seems to me a marked increase in
> readability by switching _away_ from field selection to pattern matching.
Personally, I don't believe in absolutes, especially WRT readability.
Pattern matching is good, but using a case expression just to extract a
single field from a record is excessive, causing clutter. That's
presumably why we already have the existing functions for reading single
fields - but as already mentioned, these cause name-conflict and
namespace-pollution issues.
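A short sketch of the trade-off, using a hypothetical Person record: a case expression to read one field, the existing field access function for the same job, and pattern matching where it earns its keep by binding several fields at once.

```haskell
-- Hypothetical record, used only for illustration.
data Person = Person { name :: String, age :: Int }

-- A case expression used only to read a single field.
ageByCase :: Person -> Int
ageByCase p = case p of
  Person { age = a } -> a

-- The existing field access function does the same with less clutter.
ageByField :: Person -> Int
ageByField p = age p

-- Pattern matching reads well when several fields are needed together.
describe :: Person -> String
describe (Person { name = n, age = a }) = n ++ " is " ++ show a

main :: IO ()
main = do
  let p = Person { name = "Ann", age = 30 }
  print (ageByCase p)
  print (ageByField p)
  putStrLn (describe p)
```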
>
>> I think that principle is a very weak one. If names can be scoped to modules, to case expressions, to let expressions etc, why not to records? Of course there's a difference, but IMO it's not an important one.
> Nobody is arguing against names being scoped to records.
> The argument is against using dot for it because dot has too many other uses.
> We have already seen quite enough horribly confusing examples in this thread.
Ah - so we're possibly in violent agreement, at least on some points.
I very much dislike the one-sided dots, for example, irrespective of
whether they are dots or spelled some other way. I'd rather see...
<type> <op> <name> : expression for field access function
<value> <op> <name> : as above, but with the record argument curried in
Using "<type> <op> <name>" may seem verbose when we currently have
"<name>", but with a new record notation that doesn't provide
module-wide field access functions, field names themselves can be made
shorter - at the moment, they often need a type-name prefix anyway for
readability and namespace collision avoidance, just as a convention
instead of using an operator.
A new record notation would be needed, with the same semantics as the
existing one except not making the existing field access functions
available with a module-level scope. However, if one of the goals of
TDNR is to avoid field-name conflicts and namespace pollution, this is I
think unavoidable whatever approach is taken. The only alternative would
be to deprecate the existing field access functions completely, which is
I think unrealistic. A weaker deprecation of the existing notation might
work (warn if it's used, but allow the warning to be disabled, and
don't threaten to ever remove the notation) but the
I haven't given a lot of thought to updates. Partly, I suspect that the
existing field update notation would support an intellisense-like
feature anyway - treat the braces in much the same way as parens for
function calls in other languages.