Discussion: Static Safety via Distinct Interfaces for HsSyn ASTs

Simon Peyton Jones simonpj at microsoft.com
Fri Aug 25 09:28:59 UTC 2017


Hmm, I see. I still prefer the concrete form (no intermediate layer). Many constructors use record fields, so you can just omit the ones that aren’t valid.
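
A minimal sketch of what the concrete, record-field style might look like on a toy AST. Everything here is a simplified assumption rather than GHC's actual definitions: the field names `multiIfType` and `multiIfAlts` are hypothetical, `PostTc` is written as a closed type family, and `Type` and `PlaceHolder` are stand-ins. The pattern names only the field it needs, so the placeholder never appears on the left-hand side; construction still has to supply it in this sketch.

```haskell
{-# LANGUAGE TypeFamilies #-}

-- Phase indices, named after GHC's but defined locally for this sketch.
data GhcPs
data GhcRn
data GhcTc

-- Simplified stand-ins for GHC's placeholder and type representations.
data PlaceHolder = PlaceHolder
newtype Type = Type String

-- The field holds a real type only after type checking.
type family PostTc p ty where
  PostTc GhcPs ty = PlaceHolder
  PostTc GhcRn ty = PlaceHolder
  PostTc GhcTc ty = ty

-- Toy AST with a record-style constructor (field names are hypothetical).
data HsExpr p
  = HsLit Int
  | HsMultiIf
      { multiIfType :: PostTc p Type
      , multiIfAlts :: [HsExpr p]
      }

-- The pattern omits the placeholder field; only the alternatives are bound.
rnExpr :: HsExpr GhcPs -> HsExpr GhcRn
rnExpr (HsLit n) = HsLit n
rnExpr (HsMultiIf { multiIfAlts = alts })
  = HsMultiIf { multiIfType = PlaceHolder
              , multiIfAlts = map rnExpr alts }
```

No extra layer is involved here: the same constructor name is used in every phase, and the placeholder stays visible wherever a value is built.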

Simon


From: Shayan Najd [mailto:sh.najd at gmail.com]
Sent: 24 August 2017 15:39
To: Simon Peyton Jones <simonpj at microsoft.com>
Cc: Alan & Kim Zimmerman <alan.zimm at gmail.com>; ghc-devs at haskell.org
Subject: Re: Discussion: Static Safety via Distinct Interfaces for HsSyn ASTs

(1)-(3) appear to be three different approaches, but I don’t think that’s what you intend. I think there are only two: add the indirection layer or not?

(1)-(3) are just the steps we take if we choose to add the indirection layer: add the layer, then make the changes where desired.
If we choose not to add the indirection layer, nothing needs to be changed and the internals of the encoding (`PostTc`, placeholders, etc.) remain visible in the code.

Can you give an example function or two, and what it would look like under the different approaches?

For example, the function clause

> rnExpr (HsMultiIf _ty alts)
>  = do { (alts', fvs) <- mapFvRn (rnGRHS IfAlt rnLExpr) alts
>       ; return (HsMultiIf placeHolderType alts', fvs) }

becomes

> rnExpr (PsMultiIf alts)
>  = do { (alts', fvs) <- mapFvRn (rnGRHS IfAlt rnLExpr) alts
>       ; return (RnMultiIf alts', fvs) }

I hope it clarifies what I mean a bit.

There is always a choice between how distinct we want the phases to be.
The more distinct they are, the stronger the static guarantees. The code also becomes clearer in a way: `RnMultiIf` is talking about a renamed expression and `PsMultiIf` about a parsed expression, while `HsMultiIf` is talking about an expression of any phase.
At the same time, distinctness means more work for the programmer.
Such a distinction also sometimes implies a pedagogical burden, as readers must now learn about more than one AST. However, this burden is very low here thanks to the prefixing convention: `PsMultiIf` and `RnMultiIf` are easily understood to represent the same thing in different phases.
Finally, such distinctions often lead to code duplication. But in our case, the Trees that Grow machinery saves us from such duplication: we have the same base ASTs, and we can write generic programs over the base ASTs anytime we want (point/step (3) above).

Thanks,
  Shayan

On Thu, Aug 24, 2017 at 3:35 PM, Simon Peyton Jones <simonpj at microsoft.com> wrote:
I’m keen NOT to introduce these layers of indirection. I think they make the code harder to understand.

Can you give an example function or two, and what it would look like under the different approaches?

(1)-(3) appear to be three different approaches, but I don’t think that’s what you intend. I think there are only two: add the indirection layer or not?

S

From: Shayan Najd [mailto:sh.najd at gmail.com]
Sent: 23 August 2017 13:26
To: ghc-devs at haskell.org
Cc: Simon Peyton Jones <simonpj at microsoft.com>; Alan & Kim Zimmerman <alan.zimm at gmail.com>
Subject: Discussion: Static Safety via Distinct Interfaces for HsSyn ASTs

In this thread, I am going to raise a topic for discussion. Please share your opinions and related experiences.

Evaluating the type families within HsSyn ASTs, such as `PostTc`, at a fixed phase index, such as `GhcPs`, gives us distinct ASTs at *compile time*.
However, when programming with these ASTs, we use patterns, such as `HsMultiIf :: PostTc p Type -> [LGRHS p (LHsExpr p)] -> HsExpr p`, that are shared among phases.
We can
(1) introduce a layer of abstraction providing a set of type and pattern synonyms specific to each phase, such as `PsMultiIf :: [LPsGRHS LPsExpr] -> PsExpr`; and
(2) update code working on ASTs of a specific phase to use the interface specific to that phase, such as by changing prefixes from `Hs` to `Ps` and by removing unused variables and placeholders; and
(3) leave untouched the code working uniformly on ASTs of different phases (i.e., the generic functions in Trees that Grow terminology), such as the existing functions whose types are polymorphic in the phase index.
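
Steps (1)-(3) can be sketched on a toy AST as follows. This is a simplified illustration under stated assumptions, not GHC's real HsSyn: `PostTc` is written as a closed type family, `Type` and `PlaceHolder` are stand-ins, the alternatives are plain expressions rather than `LGRHS` values, and `countAlts` is a hypothetical generic function added only to illustrate step (3).

```haskell
{-# LANGUAGE PatternSynonyms #-}
{-# LANGUAGE TypeFamilies #-}

-- Phase indices, named after GHC's but defined locally for this sketch.
data GhcPs
data GhcRn
data GhcTc

-- Simplified stand-ins for GHC's placeholder and type representations.
data PlaceHolder = PlaceHolder
newtype Type = Type String

-- The field holds a real type only after type checking.
type family PostTc p ty where
  PostTc GhcPs ty = PlaceHolder
  PostTc GhcRn ty = PlaceHolder
  PostTc GhcTc ty = ty

-- Toy base AST shared by all phases.
data HsExpr p
  = HsLit Int
  | HsMultiIf (PostTc p Type) [HsExpr p]

-- Step (1): phase-specific type synonyms and pattern synonyms.
type PsExpr = HsExpr GhcPs
type RnExpr = HsExpr GhcRn

-- Implicitly bidirectional, so each name works as a pattern and as a builder.
pattern PsMultiIf :: [PsExpr] -> PsExpr
pattern PsMultiIf alts = HsMultiIf PlaceHolder alts

pattern RnMultiIf :: [RnExpr] -> RnExpr
pattern RnMultiIf alts = HsMultiIf PlaceHolder alts

-- Step (2): phase-specific code uses the interface; no placeholder in sight.
rnExpr :: PsExpr -> RnExpr
rnExpr (HsLit n)        = HsLit n
rnExpr (PsMultiIf alts) = RnMultiIf (map rnExpr alts)

-- Step (3): generic code stays on the base constructors, for any phase p.
countAlts :: HsExpr p -> Int
countAlts (HsLit _)          = 0
countAlts (HsMultiIf _ alts) = length alts + sum (map countAlts alts)
```

Loaded into GHCi, `countAlts (rnExpr (PsMultiIf [HsLit 1, HsLit 2]))` evaluates to 2. Without a `COMPLETE` pragma the pattern-match checker may report `rnExpr` as non-exhaustive under `-Wincomplete-patterns`, which is one of the costs of going through synonyms.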

Some comments:

- It can be done gradually and smoothly: we add three separate files in HsSyn (one per phase) containing the phase-specific interfaces, then gradually import them and make the changes module by module.
- Using the interfaces is optional: code using the current method (e.g., using `HsMultiIf`) should work just fine.
- It introduces a layer of indirection and three more files to maintain.
- It makes code working on HsSyn ASTs, such as the renamer, appear cleaner, as placeholders and similar machinery are abstracted away by the interfaces (e.g., no need to import bits and pieces of `HsExtension`).
- In theory, there should be zero impact on GHC's runtime performance.

I am myself undecided about its benefit-cost ratio, but willing to at least implement the phase-specific interfaces.
For me, abstracting away all the `PostRn` stuff, `Out`-prefixed constructors, and dummy placeholders from the front-end code is the most valuable part.

Yours,
  Shayan