Operating on HsSyn

Shayan Najd sh.najd at gmail.com
Fri Jul 28 14:32:13 UTC 2017


>
> On the contrary, inside GHC I /do/ want to print them. Otherwise how can I
> see what the renamer has done?


Right. So if I understand correctly, with this semantics, `Outputable` is
somewhere between pretty printing as often used in program manipulation
libraries (like Haskell-Src-Exts (HSE)) which is closer to syntax, and
`Show` which is closer to Haskell representation.
(There are also "exact printers" (as in HSE) that are even closer to syntax
in some sense.)
Often, pretty printers generate only grammatically valid terms, not the
ones polluted with extra annotations (hence grammatically invalid), e.g.,
what is the grammatically valid form of `OpApp` printed via `Outputable`
that includes the fixity annotation.

If I recall correctly, we have briefly studied these in the past summer, we
came up with some roundtrip correctness criteria, like the following (bar
error handling; assume valid input):

 (parser . prettyPrint . parser) = id

[paging in Jacques]

The reason I am trying to flesh out the semantics is the /big/ gains on
code reuse later on in the process: one does not need to define a separate
pretty printing library for Haskell syntax, and can reuse the well-tested
and well-maintained one in GHC.

Reformulating part of your concern, based on my understanding (if I may),
the questions is: what is the proper design of an "outputer"
(debug-printer?) where /annotated/ terms can be pretty-printed including
any printable (pretty?showable?) annotations.
In particular, we may want to take advantage of extensibility of data types
for this.
Am I far off?

Note: with proper design, an extensible debug-printer can still subsume
corresponding pretty-printers.


On Fri, Jul 28, 2017 at 2:36 PM, Simon Peyton Jones <simonpj at microsoft.com>
wrote:

> I have been under the impression that we don't even want to print those.
>
> On the contrary, inside GHC I /do/ want to print them. Otherwise how can I
> see what the renamer has done?
>
>
>
> Simon
>
>
>
> *From:* Shayan Najd [mailto:sh.najd at gmail.com]
> *Sent:* 28 July 2017 12:20
> *To:* Simon Peyton Jones <simonpj at microsoft.com>
> *Cc:* ghc-devs at haskell.org; Alan & Kim Zimmerman <alan.zimm at gmail.com>
> *Subject:* Re: Operating on HsSyn
>
>
>
> Before all this, we may need to discuss a bit about the intended semantics
> of
>
> `Outputable`: does it need to print `PostRn`, or `PostTc` fields; or `Out`
>
> suffixed constructors?  If not, then we only need to write a set of
> instances
>
> for the base growable AST, once and for all.  Such instances will be
> polymorphic
>
> on the extension descriptor `p`, and do not need to mention the
> constraints like
>
> `(PostRn p (IdP p)`, since these are just extensions and not part of the
> base
>
> growable AST.  Or, am I missing something about the intended semantics of
>
> `Outputable`?
>
>
>
> You write
>
>
>
> So today we never print these annotations, to avoid bloating the instance
> contexts, which can be painful.
>
>
>
> I have been under the impression that we don't even want to print those.
>
>
>
> Of course, there are scenarios (like `Show` instances) where we do want to
> write
>
> compositional / generic functions that take into account the extensions.
>
> Here is my abstract overview of the scenario, that may help the discussion.
>
> Consider data types `A`, `B`, and `C` (say, one AST datatype per compiler
> phase) that
>
>  are defined as extensions to a base datatype `T`:
>
>
>
> > A = T XA
>
> > B = T XB
>
> > C = T XC
>
>
>
> where `X*`s are extension descriptors.
>
> Now, say we want to a define functions `f_A`, `f_B`, and `f_C` over `A`, `
> B`, and `C`.
>
> We have two main alternatives:
>
> (a) either we write these  (manually or using the deriving mechanism)
> separately
>
> (b) or we write a generic / parametric function `g` over `T`, and reuse
> that to define `f_*`s
>
>
>
> Of course, (b) is preferable in theory , but not always possible or
> preferable in practice.
>
> In which case, we can always resort to (a).
>
> The more varying are the definitions of `f_A`, `f_B`, and `f_C` the more
> parametric should
>
> `g` get, as this is the case for any generic function.
>
>
>
> With a correct design, I believe, these are all independent of Trees that
> Grow story itself:
>
> we are now not only trying to reuse data types, and functions agnostic
> towards extensions
>
> (pretty printers in my view of their semantics), but also reuse functions
> with parametric /
>
> varying behaviour with respect to extensions.
>
>
>
> /Shayan
>
>
>
>
>
>
>
> On Fri, Jul 28, 2017 at 10:18 AM, Simon Peyton Jones <
> simonpj at microsoft.com> wrote:
>
> Devs,
>
>
>
> Shayan is working away on “Trees that grow”… do keep it on your radar:
>
>
>
> *To:* ghc-devs
> *Sent:* 25 May 2017 23:49
>
> Do take a look at this:
>
>
>
> ·         We propose to re-engineer HsSyn itself.  This will touch a *lot*
> of code.
>
> ·         But it’s very neat, and will bring big long-term advantages
>
> ·         And we can do it a bit at a time
>
>
>
> The wiki page https://ghc.haskell.org/trac/ghc/wiki/
> ImplementingTreesThatGrow has the details.  It’s entirely an internal
> change, not a change to GHC’s specification, so it’s independent of the GHC
> proposals process.  But I’d value the opinion of other GHC devs
>
>
>
> Meanwhile I have a question. When pretty-printing HsSyn we often have a
> situation like this:
>
>
>
>   data Pass = Parsed | Renamed | Typechecked
>
>
>
>   data HsExpr (p :: Pass) = HsVar (IdP p) | ....
>
>
>
>   type famliy IdP p where
>
>      IdP Parsed      = RdrName
>
>      IdP Renamed     = Name
>
>      IdP Typechecked = Id
>
>
>
>   instance (Outputable (IdP p)) => Outputable (HsExpr p) where
>
>      ppr (HsVar v) = ppr v
>
>
>
> The (ppr v) requires (Outputable (IdP p)), hence the context.
>
>
>
> Moreover, and more seriously, there are things we just can't pretty-print
>
> right now.  For example, HsExpr has this data constructor:
>
>
>
>   data HsExpr p = ...
>
>     | OpApp       (LHsExpr p)
>
>                   (LHsExpr p)
>
>                   (PostRn p Fixity)
>
>                   (LHsExpr p)
>
>
>
> To pretty-print the third argument, we'd need to add
>
>
>
>   instance (Outputable (IdP p),
>
>             Outputable (PostRn p Fixity))   -- New
>
>         => Outputable (HsExpr p) where
>
>      ppr (HsVar v) = ppr v
>
>
>
> and that gets onerous.  *So today we never print these annotations*, to
> avoid bloating the instance contexts, which can be painful.  It bit me
> yesterday.
>
>
>
> We have bitten that bullet for the Data class: look at HsExtension.DataId,
> which abbreviates the long list of dictionaries:
>
>
>
>   type DataId p =
>
>     ( Data p
>
>     , ForallX Data p
>
>     , Data (NameOrRdrName (IdP p))
>
>     , Data (IdP p)
>
>     , Data (PostRn p (IdP p))
>
>     , Data (PostRn p (Located Name))
>
>     , Data (PostRn p Bool)
>
>     , Data (PostRn p Fixity)
>
>      ,..and nine more... )
>
>
>
> Let me note in passing that [wiki:QuantifiedContexts
> <https://ghc.haskell.org/trac/ghc/wiki/QuantifiedContexts>] would make
> this somewhat shorter
>
>
>
>   type DataId p =
>
>     ( Data p
>
>     , ForallX Data p
>
>     , Data (NameOrRdrName (IdP p))
>
>     , Data (IdP p)
>
>     , forall t. Data t => Data (PostRn p t))
>
>
>
> But we still need one item in this list for each type function,
>
> and I am worried about how this scales to the
>
> [wiki:ImplementingTreesThatGrow
> <https://ghc.haskell.org/trac/ghc/wiki/ImplementingTreesThatGrow>] story,
> when we have a type
>
> function for each data constructor -- and there are a *lot* of data
>
> constructors.
>
>
>
> *So I have four questions*
>
>
>
>    1. I think we should probably use a superclass instead of a type
>    synonym
>
>
>
> class (Data p, ForallX Data p, ....) => DataId p where {}
>
>
>
> Why?  Only one argument to pass, and to pass on to successive calls.  I
> see no downside.
>
>
>
>    1. Shall we treat Outputable like Data?  (I.e. make an abbreviation
>    for a long list of Outputable instances and use it everywhere)
>    2. I thought of another way to do it (pass a token); see below
>
>
>
>    1. Are there any other ways?
>
>
>
>
>
> *Token passing idea.*
>
>
>
> Perhaps instead of passing lots of functions, we pass a singleton token
>
> that encodes the pass, like this:
>
>
>
>   instance (PassC p) => Outputable (HsExpr p) where
>
>      ppr (HsVar v) = case getPass :: IsPass p of
>
>                        IsParsed      -> ppr v
>
>                        IsRenamed     -> ppr v
>
>                        IsTypechecked -> ppr v
>
>
>
> The three ppr's are at different types, of course; that's the point.
>
> The infrastructure is something like
>
>
>
>   class PassC p where
>
>     getPass :: IsPass p
>
>
>
>   data IsPass p where
>
>     IsParsed      :: IsPass Parsed
>
>     IsRenamed     :: IsParsed Renamed
>
>     IsTypechecked :: IsParsed Typechecked
>
>
>
>   instance PassC Parsed where getPass = IsParsed
>
>     ...etc...
>
>
>
> Now we could sweep away all those OutputableX classes,
>
> replacing them with dynamic tests on the singletons IsParsed etc.
>
> This would have advantages:
>
>
>
> - Probably faster: there's a dynamic test, but many fewer dictionary
>
>   arguments and higher-order function dispatch
>
>
>
> - Only one dictionary to pass; programming is easier.
>
>
>
> The big downside is that it's not extensible: it works only because
>
> we know the three cases.  But the "Trees that Grow" story really doesn't
>
> scale well to pretty-printing: so maybe we should just give up on that?
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-devs/attachments/20170728/1af9b693/attachment-0001.html>


More information about the ghc-devs mailing list