Operating on HsSyn

Shayan Najd sh.najd at gmail.com
Fri Jul 28 11:19:33 UTC 2017


Before all this, we may need to discuss a bit about the intended semantics
of
`Outputable`: does it need to print `PostRn`, or `PostTc` fields; or `Out`
suffixed constructors?  If not, then we only need to write a set of
instances
for the base growable AST, once and for all.  Such instances will be
polymorphic
on the extension descriptor `p`, and do not need to mention the constraints
like
`(PostRn p (IdP p)`, since these are just extensions and not part of the
base
growable AST.  Or, am I missing something about the intended semantics of
`Outputable`?

You write

So today we never print these annotations, to avoid bloating the instance
> contexts, which can be painful.
>

I have been under the impression that we don't even want to print those.

Of course, there are scenarios (like `Show` instances) where we do want to
write
compositional / generic functions that take into account the extensions.
Here is my abstract overview of the scenario, that may help the discussion.
Consider data types `A`, `B`, and `C` (say, one AST datatype per compiler
phase) that
 are defined as extensions to a base datatype `T`:

> A = T XA
> B = T XB
> C = T XC

where `X*`s are extension descriptors.
Now, say we want to a define functions `f_A`, `f_B`, and `f_C` over `A`, `B`,
and `C`.
We have two main alternatives:
(a) either we write these  (manually or using the deriving mechanism)
separately
(b) or we write a generic / parametric function `g` over `T`, and reuse
that to define `f_*`s

Of course, (b) is preferable in theory , but not always possible or
preferable in practice.
In which case, we can always resort to (a).
The more varying are the definitions of `f_A`, `f_B`, and `f_C` the more
parametric should
`g` get, as this is the case for any generic function.

With a correct design, I believe, these are all independent of Trees that
Grow story itself:
we are now not only trying to reuse data types, and functions agnostic
towards extensions
(pretty printers in my view of their semantics), but also reuse functions
with parametric /
varying behaviour with respect to extensions.

/Shayan



On Fri, Jul 28, 2017 at 10:18 AM, Simon Peyton Jones <simonpj at microsoft.com>
wrote:

> Devs,
>
>
>
> Shayan is working away on “Trees that grow”… do keep it on your radar:
>
>
>
> *To:* ghc-devs
> *Sent:* 25 May 2017 23:49
>
> Do take a look at this:
>
>
>
> ·         We propose to re-engineer HsSyn itself.  This will touch a *lot*
> of code.
>
> ·         But it’s very neat, and will bring big long-term advantages
>
> ·         And we can do it a bit at a time
>
>
>
> The wiki page https://ghc.haskell.org/trac/ghc/wiki/
> ImplementingTreesThatGrow has the details.  It’s entirely an internal
> change, not a change to GHC’s specification, so it’s independent of the GHC
> proposals process.  But I’d value the opinion of other GHC devs
>
>
>
> Meanwhile I have a question. When pretty-printing HsSyn we often have a
> situation like this:
>
>
>
>   data Pass = Parsed | Renamed | Typechecked
>
>
>
>   data HsExpr (p :: Pass) = HsVar (IdP p) | ....
>
>
>
>   type famliy IdP p where
>
>      IdP Parsed      = RdrName
>
>      IdP Renamed     = Name
>
>      IdP Typechecked = Id
>
>
>
>   instance (Outputable (IdP p)) => Outputable (HsExpr p) where
>
>      ppr (HsVar v) = ppr v
>
>
>
> The (ppr v) requires (Outputable (IdP p)), hence the context.
>
>
>
> Moreover, and more seriously, there are things we just can't pretty-print
>
> right now.  For example, HsExpr has this data constructor:
>
>
>
>   data HsExpr p = ...
>
>     | OpApp       (LHsExpr p)
>
>                   (LHsExpr p)
>
>                   (PostRn p Fixity)
>
>                   (LHsExpr p)
>
>
>
> To pretty-print the third argument, we'd need to add
>
>
>
>   instance (Outputable (IdP p),
>
>             Outputable (PostRn p Fixity))   -- New
>
>         => Outputable (HsExpr p) where
>
>      ppr (HsVar v) = ppr v
>
>
>
> and that gets onerous.  *So today we never print these annotations*, to
> avoid bloating the instance contexts, which can be painful.  It bit me
> yesterday.
>
>
>
> We have bitten that bullet for the Data class: look at HsExtension.DataId,
> which abbreviates the long list of dictionaries:
>
>
>
>   type DataId p =
>
>     ( Data p
>
>     , ForallX Data p
>
>     , Data (NameOrRdrName (IdP p))
>
>     , Data (IdP p)
>
>     , Data (PostRn p (IdP p))
>
>     , Data (PostRn p (Located Name))
>
>     , Data (PostRn p Bool)
>
>     , Data (PostRn p Fixity)
>
>      ,..and nine more... )
>
>
>
> Let me note in passing that [wiki:QuantifiedContexts
> <https://ghc.haskell.org/trac/ghc/wiki/QuantifiedContexts>] would make
> this somewhat shorter
>
>
>
>   type DataId p =
>
>     ( Data p
>
>     , ForallX Data p
>
>     , Data (NameOrRdrName (IdP p))
>
>     , Data (IdP p)
>
>     , forall t. Data t => Data (PostRn p t))
>
>
>
> But we still need one item in this list for each type function,
>
> and I am worried about how this scales to the
>
> [wiki:ImplementingTreesThatGrow
> <https://ghc.haskell.org/trac/ghc/wiki/ImplementingTreesThatGrow>] story,
> when we have a type
>
> function for each data constructor -- and there are a *lot* of data
>
> constructors.
>
>
>
> *So I have four questions*
>
>
>
>    1. I think we should probably use a superclass instead of a type
>    synonym
>
>
>
> class (Data p, ForallX Data p, ....) => DataId p where {}
>
>
>
> Why?  Only one argument to pass, and to pass on to successive calls.  I
> see no downside.
>
>
>
>    1. Shall we treat Outputable like Data?  (I.e. make an abbreviation
>    for a long list of Outputable instances and use it everywhere)
>
>    2. I thought of another way to do it (pass a token); see below
>
>
>
>    1. Are there any other ways?
>
>
>
>
>
> *Token passing idea.*
>
>
>
> Perhaps instead of passing lots of functions, we pass a singleton token
>
> that encodes the pass, like this:
>
>
>
>   instance (PassC p) => Outputable (HsExpr p) where
>
>      ppr (HsVar v) = case getPass :: IsPass p of
>
>                        IsParsed      -> ppr v
>
>                        IsRenamed     -> ppr v
>
>                        IsTypechecked -> ppr v
>
>
>
> The three ppr's are at different types, of course; that's the point.
>
> The infrastructure is something like
>
>
>
>   class PassC p where
>
>     getPass :: IsPass p
>
>
>
>   data IsPass p where
>
>     IsParsed      :: IsPass Parsed
>
>     IsRenamed     :: IsParsed Renamed
>
>     IsTypechecked :: IsParsed Typechecked
>
>
>
>   instance PassC Parsed where getPass = IsParsed
>
>     ...etc...
>
>
>
> Now we could sweep away all those OutputableX classes,
>
> replacing them with dynamic tests on the singletons IsParsed etc.
>
> This would have advantages:
>
>
>
> - Probably faster: there's a dynamic test, but many fewer dictionary
>
>   arguments and higher-order function dispatch
>
>
>
> - Only one dictionary to pass; programming is easier.
>
>
>
> The big downside is that it's not extensible: it works only because
>
> we know the three cases.  But the "Trees that Grow" story really doesn't
>
> scale well to pretty-printing: so maybe we should just give up on that?
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-devs/attachments/20170728/53d1dea6/attachment-0001.html>


More information about the ghc-devs mailing list