Operating on HsSyn

Simon Peyton Jones simonpj at microsoft.com
Fri Jul 28 12:36:03 UTC 2017

I have been under the impression that we don't even want to print those.
On the contrary, inside GHC I /do/ want to print them. Otherwise how can I see what the renamer has done?


From: Shayan Najd [mailto:sh.najd at gmail.com]
Sent: 28 July 2017 12:20
To: Simon Peyton Jones <simonpj at microsoft.com>
Cc: ghc-devs at haskell.org; Alan & Kim Zimmerman <alan.zimm at gmail.com>
Subject: Re: Operating on HsSyn

Before all this, we may need to discuss a bit about the intended semantics of
`Outputable`: does it need to print `PostRn`, or `PostTc` fields; or `Out`
suffixed constructors?  If not, then we only need to write a set of instances
for the base growable AST, once and for all.  Such instances will be polymorphic
on the extension descriptor `p`, and do not need to mention the constraints like
`(PostRn p (IdP p)`, since these are just extensions and not part of the base
growable AST.  Or, am I missing something about the intended semantics of

You write

So today we never print these annotations, to avoid bloating the instance contexts, which can be painful.

I have been under the impression that we don't even want to print those.

Of course, there are scenarios (like `Show` instances) where we do want to write
compositional / generic functions that take into account the extensions.
Here is my abstract overview of the scenario, that may help the discussion.
Consider data types `A`, `B`, and `C` (say, one AST datatype per compiler phase) that
 are defined as extensions to a base datatype `T`:

> A = T XA
> B = T XB
> C = T XC

where `X*`s are extension descriptors.
Now, say we want to a define functions `f_A`, `f_B`, and `f_C` over `A`, `B`, and `C`.
We have two main alternatives:
(a) either we write these  (manually or using the deriving mechanism) separately
(b) or we write a generic / parametric function `g` over `T`, and reuse that to define `f_*`s

Of course, (b) is preferable in theory , but not always possible or preferable in practice.
In which case, we can always resort to (a).
The more varying are the definitions of `f_A`, `f_B`, and `f_C` the more parametric should
`g` get, as this is the case for any generic function.

With a correct design, I believe, these are all independent of Trees that Grow story itself:
we are now not only trying to reuse data types, and functions agnostic towards extensions
(pretty printers in my view of their semantics), but also reuse functions with parametric /
varying behaviour with respect to extensions.


On Fri, Jul 28, 2017 at 10:18 AM, Simon Peyton Jones <simonpj at microsoft.com<mailto:simonpj at microsoft.com>> wrote:

Shayan is working away on “Trees that grow”… do keep it on your radar:

To: ghc-devs
Sent: 25 May 2017 23:49
Do take a look at this:

•         We propose to re-engineer HsSyn itself.  This will touch a lot of code.

•         But it’s very neat, and will bring big long-term advantages

•         And we can do it a bit at a time

The wiki page https://ghc.haskell.org/trac/ghc/wiki/ImplementingTreesThatGrow has the details.  It’s entirely an internal change, not a change to GHC’s specification, so it’s independent of the GHC proposals process.  But I’d value the opinion of other GHC devs

Meanwhile I have a question. When pretty-printing HsSyn we often have a situation like this:

  data Pass = Parsed | Renamed | Typechecked

  data HsExpr (p :: Pass) = HsVar (IdP p) | ....

  type famliy IdP p where

     IdP Parsed      = RdrName

     IdP Renamed     = Name

     IdP Typechecked = Id

  instance (Outputable (IdP p)) => Outputable (HsExpr p) where

     ppr (HsVar v) = ppr v

The (ppr v) requires (Outputable (IdP p)), hence the context.

Moreover, and more seriously, there are things we just can't pretty-print
right now.  For example, HsExpr has this data constructor:

  data HsExpr p = ...

    | OpApp       (LHsExpr p)

                  (LHsExpr p)

                  (PostRn p Fixity)

                  (LHsExpr p)

To pretty-print the third argument, we'd need to add

  instance (Outputable (IdP p),

            Outputable (PostRn p Fixity))   -- New

        => Outputable (HsExpr p) where

     ppr (HsVar v) = ppr v

and that gets onerous.  So today we never print these annotations, to avoid bloating the instance contexts, which can be painful.  It bit me yesterday.

We have bitten that bullet for the Data class: look at HsExtension.DataId, which abbreviates the long list of dictionaries:

  type DataId p =

    ( Data p

    , ForallX Data p

    , Data (NameOrRdrName (IdP p))

    , Data (IdP p)

    , Data (PostRn p (IdP p))

    , Data (PostRn p (Located Name))

    , Data (PostRn p Bool)

    , Data (PostRn p Fixity)

     ,..and nine more... )

Let me note in passing that [wiki:QuantifiedContexts<https://ghc.haskell.org/trac/ghc/wiki/QuantifiedContexts>] would make this somewhat shorter

  type DataId p =

    ( Data p

    , ForallX Data p

    , Data (NameOrRdrName (IdP p))

    , Data (IdP p)

    , forall t. Data t => Data (PostRn p t))

But we still need one item in this list for each type function,
and I am worried about how this scales to the
[wiki:ImplementingTreesThatGrow<https://ghc.haskell.org/trac/ghc/wiki/ImplementingTreesThatGrow>] story, when we have a type
function for each data constructor -- and there are a lot of data

So I have four questions

  1.  I think we should probably use a superclass instead of a type synonym

class (Data p, ForallX Data p, ....) => DataId p where {}

Why?  Only one argument to pass, and to pass on to successive calls.  I see no downside.

  1.  Shall we treat Outputable like Data?  (I.e. make an abbreviation for a long list of Outputable instances and use it everywhere)
  2.  I thought of another way to do it (pass a token); see below

  1.  Are there any other ways?

Token passing idea.

Perhaps instead of passing lots of functions, we pass a singleton token
that encodes the pass, like this:

  instance (PassC p) => Outputable (HsExpr p) where

     ppr (HsVar v) = case getPass :: IsPass p of

                       IsParsed      -> ppr v

                       IsRenamed     -> ppr v

                       IsTypechecked -> ppr v

The three ppr's are at different types, of course; that's the point.
The infrastructure is something like

  class PassC p where

    getPass :: IsPass p

  data IsPass p where

    IsParsed      :: IsPass Parsed

    IsRenamed     :: IsParsed Renamed

    IsTypechecked :: IsParsed Typechecked

  instance PassC Parsed where getPass = IsParsed


Now we could sweep away all those OutputableX classes,
replacing them with dynamic tests on the singletons IsParsed etc.
This would have advantages:

- Probably faster: there's a dynamic test, but many fewer dictionary
  arguments and higher-order function dispatch

- Only one dictionary to pass; programming is easier.

The big downside is that it's not extensible: it works only because
we know the three cases.  But the "Trees that Grow" story really doesn't
scale well to pretty-printing: so maybe we should just give up on that?

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-devs/attachments/20170728/9e654df8/attachment-0001.html>

More information about the ghc-devs mailing list