[Haskell-cafe] Some thoughts on Type-Directed Name Resolution

Thu Feb 2 03:42:19 CET 2012

 <quick <at> sparq.org> writes:

> 
> Fair deuce.  With all due respect now included, my same concern still seems to
> apply although I believe I poorly stated it originally.  Allow me to retry:

OK, thank you.

> 
>   By declaring partial application an invalid parse, it introduces an exception
> to point-free style that is at odds with the normal intuition of the uses of
> "f x".

I'm not (and I don't think any of the other proposals are) trying to declare 
partial application an invalid parse. I'm saying that if you want to 
part-apply function composition (in point-free style), you need to be careful 
with your syntax, because it's easily confused.

Here is a piece of background which has perhaps been implicit in the 
discussions up to now. Currently under H98:
       f.g    -- (both lower case, no space around the dot)
              -- is taken as function composition, same as (f . g)
       f.  g  -- is taken as function composition (f . g)
       f  .g  -- is taken as function composition (f . g)
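
A minimal sketch of that behaviour, assuming plain Haskell 2010 lexing with no 
record-dot extension switched on. The names inc and dbl are made up for 
illustration only:

```haskell
-- inc and dbl are hypothetical names, used only for illustration.
inc :: Int -> Int
inc = (+ 1)

dbl :: Int -> Int
dbl = (* 2)

-- Each spelling lexes as three tokens: a name, the dot operator, a name.
c1, c2, c3, c4 :: Int -> Int
c1 = inc.dbl       -- no spaces
c2 = inc.  dbl     -- space after the dot only
c3 = inc  .dbl     -- space before the dot only
c4 = inc . dbl     -- the recommended spelling
```

All four should behave identically, for example c1 5 and c4 5 both give 11.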

I believe all three forms are deprecated these days, but as Donn points out 
there may be code still using them. Part of the reason for deprecating is the 
qualified name syntax, which _mustn't_ have spaces around the dot. So:
       M.f     -- is qualified f from module M
       M. f    -- is dubious, did you mean M.f?
               -- or function composition (M . f)?
               -- with M being a data constructor
       M .f    -- similarly dubious between M.f vs (M . f)
The reason those are dubious is that it's relatively unusual to part-apply a 
data constructor in combination with function composition. More likely you've 
made a typo. Nevertheless, if that's what you want to do, be careful to code 
it as (M . f), with a space on each side of the dot.
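
To make the two readings concrete, here is a small sketch. The alias C and the 
names shout, Wrapped and wrapLen are invented for illustration; only 
Data.Char.toUpper is a real library function:

```haskell
import qualified Data.Char as C

-- C.toUpper is a qualified name: module alias, dot, name, with no spaces.
shout :: String -> String
shout = map C.toUpper

-- Wrapped is a hypothetical data constructor, here composed with a function.
newtype Wrapped = Wrapped Int deriving (Eq, Show)

-- Spaces on both sides of the dot make it unmistakably composition.
wrapLen :: String -> Wrapped
wrapLen = Wrapped . length
```

Here shout "abc" gives "ABC" and wrapLen "abcd" gives Wrapped 4.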

All the proposals in play are going to change the meaning of f.g. Some of the 
proposals (not mine) are going to change the meaning of f. and/or .g -- as 
Donn points out, any/all of these changes may break code. I say it's better to 
be conservative: reject f. and .g as invalid syntax. (If existing code has f.g 
as function composition, changing the meaning to field extraction is going to 
give a type failure, so it'll be 'in your face'.)

All proposals are saying that if you want to use dot as function composition 
you must always put the spaces round the dot (or at least between the dot and 
any name) -- even if you're part-applying. So:
      (f .)   -- part-apply function composition on f
      (. g)   -- part-apply function composition on g
{- as an exercise for the reader: what does that second one mean? How is it 
different to the first one? Give an example of use with functions head, tail 
and a list. -}
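
As a hint towards that exercise, one possible sketch of the two sections in 
use. The names second, second', dropTwo and negFirst are made up; head, tail 
and negate are the usual Prelude functions:

```haskell
-- (f .) fixes the LEFT operand of composition:   (f .) g  ==  f . g
-- (. g) fixes the RIGHT operand of composition:  (. g) f  ==  f . g

second :: [a] -> a
second = (head .) tail        -- head . tail

second' :: [a] -> a
second' = (. tail) head       -- also head . tail, built from the other side

dropTwo :: [a] -> [a]
dropTwo = (tail .) tail       -- tail . tail

negFirst :: [Int] -> Int
negFirst = (. head) negate    -- negate . head: head runs first, then negate
```

With the list [1,2,3], second and second' both give 2, dropTwo gives [3] and 
negFirst gives -1.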

       (f.)   -- I say is ambiguous: did you mean (f .),
              -- or did you miss out something after the dot?
       (.f)   -- I say is ambiguous: did you mean (. f),
              -- or did you miss out something before the dot?

I'm saying that for both of the above, it's safer to treat them as an invalid 
parse, and require a space between the dot and the name.

> 
> SPJ's SOPR raises it as an issue and indicates he's inclined to disallow it;
> my concern above would still apply.

"SOPR"? SPJ's current proposal is abbreviated as "SORF" (Simple Overloaded 
Record Fields). His older proposal as "TDNR" (Type-Directed Name Resolution).
http://hackage.haskell.org/trac/ghc/wiki/Records

I don't think either of those disallows partial application of function 
composition. I do think they discuss how the syntax could be confusing, and so 
require you to be careful.

Another piece of background which the discussion is probably not being 
explicit about (so thank you for forcing me to think through the explanation): 
under H98, from a record declaration such as
      data Customer = Customer { customer_id :: Int }
you immediately get a function:
      customer_id :: Customer -> Int
Then you can apply customer_id to a record, to extract the field. Because the 
type of customer_id is restricted to exactly one record type, this strengthens 
type inference. (Whatever customer_id is applied to must be type Customer, the 
result must be type Int.)
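
A small sketch of that inference at work. The function nextId is a made-up 
name, deliberately written without a type signature:

```haskell
data Customer = Customer { customer_id :: Int }

-- No signature is given on purpose. Because customer_id belongs to exactly
-- one record type, the compiler can infer  nextId :: Customer -> Int.
nextId c = customer_id c + 1
```

Asking GHCi for the type of nextId should report Customer -> Int.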

For my proposal, I'm trying very hard to be consistent with the H98 style, 
except to say that field extractor function f can apply to any record type 
(providing it has field f). Specifically, if the f field is always a String, 
we can help type inference. The type of f is (approximately speaking):
       f :: (Has r Proxy_f String) => r -> String
Or I prefer SPJ's suggested syntactic sugar:
       f :: r{ f :: String} => r -> String

But type inference for r is now harder: we have to gather information about r 
from the type environment where f is applied to r, enough to figure out which 
record type it is; then look up the instance declaration (generated from the 
data decl) to know how to extract the f field. That much isn't too hard. The 
really difficult part is how to do that in such a way that we can also update 
f to produce a new r, and cope with all the possible things f might be - 
including if f is polymorphic or higher-ranked.
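
The extraction half of this can be sketched roughly as follows. This is only 
an illustration under assumptions, not the proposal's actual machinery: the 
class method get, the record types Person and Company and their underlying 
fields are all invented here, and update is not modelled at all.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances, FlexibleContexts #-}

-- Hypothetical sketch: field extraction only, no update.

-- A proxy type standing for the field name f.
data Proxy_f = Proxy_f

-- "Record type r has a field, named by the proxy fld, of type t."
class Has r fld t where
  get :: r -> fld -> t

-- Two made-up record types, each with a String field playing the role of f.
data Person  = Person  { personF  :: String }
data Company = Company { companyF :: String }

-- Under the proposal, instances like these would be generated from the
-- data decls; here they are written out by hand.
instance Has Person Proxy_f String where
  get r _ = personF r

instance Has Company Proxy_f String where
  get r _ = companyF r

-- The overloaded extractor: works on any r that has a field f :: String.
f :: Has r Proxy_f String => r -> String
f r = get r Proxy_f
```

With these definitions f (Person "Ann") gives "Ann" and f (Company "Acme") 
gives "Acme", with the instance chosen from the type of the argument.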

(The "trying hard" is why my Record Update for Higher-ranked or Changing Types 
contained such ugly code.)

So I'm trying to support mixing H98 record fields with new-style poly-record 
field extractors. If you see in code:
       f r
(and you already know that r is a record and f is a field -- perhaps you're 
working in a database application), then you know we're extracting a field 
from a record, whether it's an H98 record or a new-style record. Similarly:
       r.f   -- desugars to f r, so you know just as much

What's more, perhaps you've got new-style records in your module, but you're 
importing a H98 record definition from some other module. Then:
       customer_id customer  -- extracts the customer_id from the record
       customer.customer_id  -- means just the same
Wow!! We've just used dot-notation on H98-style records, and we didn't need to 
change any code at all in the imported module.
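
Today's Haskell can only simulate this. A rough sketch, assuming an operator 
.$ defined as plain reverse function application (the definition and fixity 
below are my guess at a stand-in, and unlike the proposed dot it still binds 
less tightly than function application):

```haskell
infixl 9 .$

-- A stand-in for the proposed dot: reverse function application.
(.$) :: r -> (r -> a) -> a
r .$ fld = fld r

-- An ordinary H98 record, as if imported unchanged from another module.
data Customer = Customer { customer_id :: Int }

viaFunction :: Customer -> Int
viaFunction customer = customer_id customer

viaDot :: Customer -> Int
viaDot customer = customer.$customer_id
```

Both viaFunction and viaDot return the same field from the same record.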

> 
> As I surely mis-understand it (referencing your proposal as RHCT since I
> haven't seen another reference):

You're right that there isn't a name for my proposal, and I definitely need 
one. (I take it "RHCT" comes from Record Update for Higher-ranked or Changing 
Types. Doesn't quite trip off the tongue, I'd say.) I'm thinking:

    "DORF" -- Declared Overloaded Record Fields
           -- The "ORF" part is similar to SPJ's SORF.
           -- The "Declared" means you have to declare a field,
           -- before using it in a data decl.
           -- Or the "D" might mean "Dictionary-based" as in data dictionary
           --   (not "dictionary" in the sense of "dictionary-passing")

In these examples you're giving, I assume recs is a list of records(?). I 
don't understand what you're doing with the "SOPR" items, so I've cut them.
> 
> ...

In the "RHCT" examples, I assume that r is a record; that f is a field 
(selector function), though perhaps you mean it as 'just some function'; that 
rev_ is a field selector for a higher-ranked function (to reverse lists of 
arbitrary type); and that .$ is the 'fake' operator I used to simulate 
dot-as-field-selector. Thank you for reading all that so closely.

> RHCT:      map (\r -> f r) recs
is the same as:  map f recs                -- by eta reduction
So map f takes a list of records and returns a list of the f field from each.
This also works under H98 record fields, with type enforcement that the 
records must be of the single type f comes from.
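
For instance, with an H98 record (idsLambda and idsEta are made-up names):

```haskell
data Customer = Customer { customer_id :: Int }

-- The lambda form from the question ...
idsLambda :: [Customer] -> [Int]
idsLambda recs = map (\r -> customer_id r) recs

-- ... and its eta-reduced equivalent.
idsEta :: [Customer] -> [Int]
idsEta recs = map customer_id recs
```

Both give [1,2] for the list [Customer 1, Customer 2].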

> RHCT:      map (\r -> r.$rev_ f) recs
Beware that (.$) is an operator, so it binds less tightly than function 
application, so it's a poor 'fake' syntactically. Did you mean .$ to simulate 
dot-notation to extract field rev_ from r? Then put:
             map (\r -> (r.$rev_) f) recs
This takes the Higher-Ranked reversing function from each record in recs, and 
(on the face of it) returns a list obtained by applying it to f. I've assumed 
above that f is a field selector function (or 'just some function'). So it's 
not a list. So you'll get a type error trying to apply (r.$rev_) to f.

If you meant to apply (r.$rev_) to the f field in r, put:
             map (\r -> (r.$rev_) (r.$f)) recs
For the type to work, this requires the f field to be a list. The map returns 
a list of reversed lists from the f field of each record.

> RHCT:      map ((.$)f) recs
If you mean this to return a list of the f fields from recs, put:
             map f recs
I don't know what else you could be trying to do.

> 
> If partial application is allowed (against SPJ's inclination and explicitly
> disallowed in your scheme), I could have:
> 
>     map .f recs

If you mean this to return a list of the f fields from recs, put:
DORF:          map f recs        -- are you beginning to see how easy this is?

I'm saying the ".f" should be rejected as too confusing.
(That is, under DORF aka RHCT. Under SORF or TDNR I'm not sure how ".f" would 
be treated. That uncertainty is why I don't like their proposals for dot 
notation, and why I re-engineered it so that dot notation is tight-binding 
reverse function application **and nothing more**.)

I don't know what else you could be trying to do, but if you're trying to use 
dot as function composition (part-applied), put:
             map (. f) recs
But this won't extract an f field from recs (exercise for the reader).
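
As a hint, a sketch of what mapping such a section actually does. The names 
composed and results are made up, and length stands in for f: the list 
elements have to be functions, and each one gets composed after f.

```haskell
-- Each list element is a function; (. length) composes it after length.
composed :: [String -> Int]
composed = map (. length) [(+ 1), (* 2)]

-- Apply every composed function to the same argument.
results :: [Int]
results = map ($ "abc") composed
```

Here results is [4,6]: nothing is extracted from a record anywhere.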

> ... my intent was to attempt to assist in trying
> to clarify a what I perceived as a conceptual gap in the discussion.  I am most
> grateful for the significant time and effort contributed by yourself, SPJ, and
> all other parties, and I fear I've mostly wasted people's time on syntactic
> trivialities already well discussed and dismissed.  Please do carry on, it's all
> good stuff.
> 
> -KQ
> 

Thank you Kevin, we got there in the end. Your questions did help me clarify 
and explain what was implicit.

I think in general that syntax is trivial, but for one thing we've got very 
complex syntax already in Haskell. Our 'syntax engineering' has got to be 
careful to 'fit in', and not use up too many of the options that are still 
available.

What's special about dot syntax is that it's well-established, with a 
well-established range of meanings, in other programming paradigms. If we 
introduce dot-notation into Haskell, we have to try to make it behave like 
those paradigms, but in a 'Haskelly' way.

[To go a little off-topic/out of scope: my gold standard is 
polymorphic/anonymous records with concatenation, merge, projection, 
extension, everything you get in relational algebra. I don't want to use up 
all the design options just getting through the current namespace 
restrictions -- infuriating though they are.]

AntC