[Haskell-cafe] Some thoughts on Type-Directed Name Resolution

Wed Feb 8 15:16:37 CET 2012

On 07/02/2012 22:56, Richard O'Keefe wrote:
> On 8/02/2012, at 2:11 AM, Steve Horne wrote:
>
>> To be fair, "field OF record" isn't bad in that sense. However, it would defeat the purpose of TDNR - the record isn't first, and therefore cannot be used (given a left-to-right typing direction) as a context to offer member name suggestions.
> Yes, but why SHOULD there be a specific typing direction?
> ML manages perfectly fine without it.
For the only reason that any language feature should exist - because it 
is useful. In any language with a rich library, it is useful to get 
hints as to which names are available in a particular context. It saves 
on the need to memorize thousands - sometimes tens or even hundreds of 
thousands - of context-sensitive names and their spellings, and saves on 
getting distracted needing to hunt through manuals.

> - #1;
> stdIn:1.1-1.3 Error: unresolved flex record
>     (can't tell what fields there are besides #1)
> - #1 (true,3);
> val it = true : bool
> - #1 (42,"stuff",false);
> val it = 42 : int
>
> If a right-to-left "typing direction" works well for #field record
> in one language with constrained Hindley-Milner types, why would it
> not work well for field¶ record in another language with constrained
> Hindley-Milner types?
Parsers don't need to care much about left-to-right vs. right-to-left. 
There can be stack size issues in principle, but that hasn't stopped 
Haskell offering both left-associative and right-associative infix 
operators. The ordering has significance in certain ways in functional 
languages WRT e.g. currying, but that isn't really relevant here. In any 
case, currying is left-to-right anyway - the left-most argument is 
curried first.

The point here is for intellisense-like features to work effectively in 
text editors. The context must come to the left for that to work because...

 1. Searching for all possible names within a particular context is
    easier, and generally more likely to be what is needed, than
    searching for all possible contexts that contain a particular name.
 2. It's easier to type the context, then the marker, then select/type
    the name than it is to type the marker then the context, then cursor
    back to before the marker, *then* select the name, then cursor back
    to after the context.

> Why sacrifice readability (field name precedes record) for the sake
> of, well, for the sake of what exactly escapes me.
It doesn't sacrifice readability. The left-to-right order has been 
chosen by most programming languages, and also used in many other 
contexts, because many people find it very natural to start from the 
most general and step down to the more specific in a left-to-right 
direction. For example, <chapter>.<section>.<subsection>, or 
<foldername>/<foldername>/<filename>.

The left-to-right order isn't especially important in general - but for 
intellisense it is.

>> Also, even when I used COBOL (late eightees, early nineties) I'm pretty sure it supported "record.field".
> That certainly wasn't the case up to COBOL-85.  I don't have a copy of COBOL 2002,
> so I can't speak for that, but COBOL 74 and COBOL 85 are the only candidates for those
> dates, and they definitely did NOT support record.field.  Since '.' is the statement
> terminator in COBOL, it's intrinsically unlikely.
> (You did *check* a COBOL syntax summary, easily found on the web, before posting?  Which?)
If I checked, I wouldn't have said "pretty sure" would I? Those words 
are generally acknowledged as indicating that someone is working from 
fallible memory.

That said, I did take a look in an old COBOL book. I didn't find either 
the dot or the "OF". I vaguely remember that the original COBOL textbook 
I had actually considered the SORT command so intrinsically difficult 
that it was outside of the scope of the book. For various reasons, I 
find it difficult to take COBOL seriously at all, though I wish I had 
kept that original textbook that didn't cover SORT - it was a much 
better joke than the book I kept.
>> On the more general point of choosing an alternative operator, I agree to a point, but familiarity does count for something. Others will point out that Haskell dares to be different, but it's possible to be too daring and too different. Being different for the sake of being different is for those teenagers who go on about being "random" and whatever else they go on about these days. The success of languages like Java, C# and C++ is based on familiarity.
> Using pointy brackets for generic parameters and :: for name scope were not familiar
> when C++ introduced them.  And there was prior art in other languages for *both* of those.
>
> One common prior practice, relevantly enough, was '.' for name scope.
Yes, but C++ also dares to be different, and there's a principled reason 
for having multiple selection operators in C++. There are multiple 
namespaces involved. For a smart pointer, for example, . and -> access 
different namespaces. :: accesses a different namespace too - containing 
the mostly the same names, but treating them a different way so that 
arguably you access different things via those names (or perhaps 
different aspects of the same things).

Haskell already has a . for selecting a name through a context - we call 
that context a module. According to Bertrand Meyer of Eiffel fame, a 
class is both a module and a type. Therefore, logically, a type can be a 
module. "Module" can be considered a general term for a container of 
named items, basically. But I've already raised the point about simply 
deciding that record types and record values are modules before.

>> I think Haskell should dare to be different when there's a point to that - where necessary based on a principle. We have type classes rather than OOP classes for a principled reason. We have the IO monad rather than effectful functions for a principled reason.
> And if C++ can break with prior practice for a practical reason, Haskell can break with prior practice
> for the same reason:  not breaking existing code, fitting into the existing language structure as well
> as practical.
On not breaking existing code, I'm with you. I once got beaten up rather 
badly for arguing against PEP 238 (changing the semantics of integer 
division in Python). Although Python seems to have survived, I still 
believe it was the wrong thing to do.

It would be nice to have some lexical disambiguation in this case - I 
might prefer some other spelling, so long as the context is on the left 
and the name is on the right. I was going to propose "?", but that's 
taken already for implicit parameters - which I don't know the first 
thing about so I can't guess possible conflicts.

I even wondered about adapting the field update syntax so that it no 
longer does updates, giving something like...

<value> { <fieldname> }

This might even be extended as a shorthand for tuples of fields. Perhaps 
it's biggest problem are that it's suggestive of [] for array indexing, 
which of course Haskell doesn't support, and that it's too close the the 
record-update syntax for an expression that doesn't return a record, and 
therefore maybe confusing.

One question, though, is... does using the dot in itself break existing 
code?

Take my view about record types and values being modules and I'm not 
convinced it does. Function composition has functions to the left (and 
right, of course). Neither a record type nor a record value is a function.

Provided the dot has a record type or record value to the left and a 
field name to the right, there shouldn't be any conflict with existing 
function composition.

>> If we don't have traditional field-selection for a principled reason
> We don't have it because we don't need it.  And we don't need it because traditional field selection
> serves two roles:  *selecting* one field and *updating* one field.  It's a poor way to handle the
> latter use case, because one often needs to update more than one field.  It's not _that_ good for
> the former use case either, if you need to access more than two fields from the same record.
Actually, traditional field selection only selects the field. The update 
is handled by an assignment operator - it really doesn't make sense to 
try to tie those two operators together. Arguably, the field selection 
doesn't even read the field - it returns a field identifying thingy 
(possibly not the correct technical term) which will convert (causing 
the read) later, when it's found to be an rvalue rather than an lvalue.

In principle, Haskell *could* try something similar, using first-class 
values for field-ids that are statically typed for the record and field 
types. For modelling imperative code in a pure functional way, it would 
be possible to do something like this in Haskell now for use with the 
State monad, though the notation wouldn't be so clean.

On accessing more than two fields at once, pattern matching isn't going 
away.

On selecting one field for a read, one of the reasons for the proposal 
is because quite a few people find that defining a function in a global 
(or whole-module) scope is a poor way to handle field selection - in 
particular because you can't have the same field name in several types 
due to the naming conflict.

> In another functional language that I use, I've noticed what seems to me a marked increase in
> readability by switching _away_ from field selection to pattern matching.
Personally, I don't believe in absolutes, especially WRT readability. 
Pattern matching is good, but using a case expression just to extract a 
single field from a record is excessive, causing clutter. That's 
presumably why we already have the existing functions for reading single 
fields - but as already mentioned, these cause name-conflict and 
namespace-pollution issues.

>
>> I think that principle is a very weak one. If names can be scoped to modules, to case expressions, to let expressions etc, why not to records? Of course there's a difference, but IMO it's not an important one.
> Nobody is arguing against names being scoped to records.
> The argument is against using dot for it because dot has too many other uses.
> We have already seen quite enough horribly confusing examples in this thread.

Ah - so we're possibly in violent agreement, at least on some points.

I very much dislike the one-sided dots, for example, irrespective of 
whether they are dots or spelled some other way. I'd rather see...

<type> <op> <name> : expression for field access function
<value> <op> <name> : as above, but with the record argument curried in

Using "<type> <op> <name>" may seem verbose when we currently have 
"<name>", but with a new record notation that doesn't provide 
module-wide field access functions, field names themselves can be made 
shorter - at the moment, they often need a type-name prefix anyway for 
readability and namespace collision avoidance, just as a convention 
instead of using an operator.

A new record notation would be needed, with the same semantics as the 
existing one except not making the existing field access functions 
available with a module-level scope. However, if one of the goals of 
TDNR is to avoid field-name conflicts and namespace pollution, this is I 
think unavoidable whatever approach is taken. The only alternative would 
be to deprecate the existing field access functions completely, which is 
I think unrealistic. A weaker deprecation of the existing notation might 
work - warn if it's used, but allow the warning to be disabled, and 
don't threaten to ever remove the notation) but the

I haven't given a lot of thought to updates. Partly, I suspect that the 
existing field update notation would support an intellisense-like 
feature anyway - treat the braces in much the same way as parens for 
function calls in other languages.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/haskell-cafe/attachments/20120208/4a60b4b2/attachment.htm>