Operating on HsSyn

Shayan Najd sh.najd at gmail.com
Fri Jul 28 17:38:11 UTC 2017


by

 (parser . prettyPrint . parser) = id


I meant

 (prettyPrint . parser . prettyPrint) = id

for a valid input.


On Fri, Jul 28, 2017 at 4:32 PM, Shayan Najd <sh.najd at gmail.com> wrote:

> On the contrary, inside GHC I /do/ want to print them. Otherwise how can I
>> see what the renamer has done?
>
>
> Right. So if I understand correctly, with this semantics, `Outputable` is
> somewhere between pretty printing as often used in program manipulation
> libraries (like Haskell-Src-Exts (HSE)) which is closer to syntax, and
> `Show` which is closer to Haskell representation.
> (There are also "exact printers" (as in HSE) that are even closer to
> syntax in some sense.)
> Often, pretty printers generate only grammatically valid terms, not the
> ones polluted with extra annotations (hence grammatically invalid), e.g.,
> what is the grammatically valid form of `OpApp` printed via `Outputable`
> that includes the fixity annotation.
>
> If I recall correctly, we have briefly studied these in the past summer,
> we came up with some roundtrip correctness criteria, like the following
> (bar error handling; assume valid input):
>
>  (parser . prettyPrint . parser) = id
>
> [paging in Jacques]
>
> The reason I am trying to flesh out the semantics is the /big/ gains on
> code reuse later on in the process: one does not need to define a separate
> pretty printing library for Haskell syntax, and can reuse the well-tested
> and well-maintained one in GHC.
>
> Reformulating part of your concern, based on my understanding (if I may),
> the questions is: what is the proper design of an "outputer"
> (debug-printer?) where /annotated/ terms can be pretty-printed including
> any printable (pretty?showable?) annotations.
> In particular, we may want to take advantage of extensibility of data
> types for this.
> Am I far off?
>
> Note: with proper design, an extensible debug-printer can still subsume
> corresponding pretty-printers.
>
>
> On Fri, Jul 28, 2017 at 2:36 PM, Simon Peyton Jones <simonpj at microsoft.com
> > wrote:
>
>> I have been under the impression that we don't even want to print those.
>>
>> On the contrary, inside GHC I /do/ want to print them. Otherwise how can
>> I see what the renamer has done?
>>
>>
>>
>> Simon
>>
>>
>>
>> *From:* Shayan Najd [mailto:sh.najd at gmail.com]
>> *Sent:* 28 July 2017 12:20
>> *To:* Simon Peyton Jones <simonpj at microsoft.com>
>> *Cc:* ghc-devs at haskell.org; Alan & Kim Zimmerman <alan.zimm at gmail.com>
>> *Subject:* Re: Operating on HsSyn
>>
>>
>>
>> Before all this, we may need to discuss a bit about the intended
>> semantics of
>>
>> `Outputable`: does it need to print `PostRn`, or `PostTc` fields; or `Out`
>>
>> suffixed constructors?  If not, then we only need to write a set of
>> instances
>>
>> for the base growable AST, once and for all.  Such instances will be
>> polymorphic
>>
>> on the extension descriptor `p`, and do not need to mention the
>> constraints like
>>
>> `(PostRn p (IdP p)`, since these are just extensions and not part of the
>> base
>>
>> growable AST.  Or, am I missing something about the intended semantics of
>>
>> `Outputable`?
>>
>>
>>
>> You write
>>
>>
>>
>> So today we never print these annotations, to avoid bloating the instance
>> contexts, which can be painful.
>>
>>
>>
>> I have been under the impression that we don't even want to print those.
>>
>>
>>
>> Of course, there are scenarios (like `Show` instances) where we do want
>> to write
>>
>> compositional / generic functions that take into account the extensions.
>>
>> Here is my abstract overview of the scenario, that may help the
>> discussion.
>>
>> Consider data types `A`, `B`, and `C` (say, one AST datatype per compiler
>> phase) that
>>
>>  are defined as extensions to a base datatype `T`:
>>
>>
>>
>> > A = T XA
>>
>> > B = T XB
>>
>> > C = T XC
>>
>>
>>
>> where `X*`s are extension descriptors.
>>
>> Now, say we want to a define functions `f_A`, `f_B`, and `f_C` over `A`,
>> `B`, and `C`.
>>
>> We have two main alternatives:
>>
>> (a) either we write these  (manually or using the deriving mechanism)
>> separately
>>
>> (b) or we write a generic / parametric function `g` over `T`, and reuse
>> that to define `f_*`s
>>
>>
>>
>> Of course, (b) is preferable in theory , but not always possible or
>> preferable in practice.
>>
>> In which case, we can always resort to (a).
>>
>> The more varying are the definitions of `f_A`, `f_B`, and `f_C` the more
>> parametric should
>>
>> `g` get, as this is the case for any generic function.
>>
>>
>>
>> With a correct design, I believe, these are all independent of Trees that
>> Grow story itself:
>>
>> we are now not only trying to reuse data types, and functions agnostic
>> towards extensions
>>
>> (pretty printers in my view of their semantics), but also reuse functions
>> with parametric /
>>
>> varying behaviour with respect to extensions.
>>
>>
>>
>> /Shayan
>>
>>
>>
>>
>>
>>
>>
>> On Fri, Jul 28, 2017 at 10:18 AM, Simon Peyton Jones <
>> simonpj at microsoft.com> wrote:
>>
>> Devs,
>>
>>
>>
>> Shayan is working away on “Trees that grow”… do keep it on your radar:
>>
>>
>>
>> *To:* ghc-devs
>> *Sent:* 25 May 2017 23:49
>>
>> Do take a look at this:
>>
>>
>>
>> ·         We propose to re-engineer HsSyn itself.  This will touch a
>> *lot* of code.
>>
>> ·         But it’s very neat, and will bring big long-term advantages
>>
>> ·         And we can do it a bit at a time
>>
>>
>>
>> The wiki page https://ghc.haskell.org/trac/g
>> hc/wiki/ImplementingTreesThatGrow has the details.  It’s entirely an
>> internal change, not a change to GHC’s specification, so it’s independent
>> of the GHC proposals process.  But I’d value the opinion of other GHC devs
>>
>>
>>
>> Meanwhile I have a question. When pretty-printing HsSyn we often have a
>> situation like this:
>>
>>
>>
>>   data Pass = Parsed | Renamed | Typechecked
>>
>>
>>
>>   data HsExpr (p :: Pass) = HsVar (IdP p) | ....
>>
>>
>>
>>   type famliy IdP p where
>>
>>      IdP Parsed      = RdrName
>>
>>      IdP Renamed     = Name
>>
>>      IdP Typechecked = Id
>>
>>
>>
>>   instance (Outputable (IdP p)) => Outputable (HsExpr p) where
>>
>>      ppr (HsVar v) = ppr v
>>
>>
>>
>> The (ppr v) requires (Outputable (IdP p)), hence the context.
>>
>>
>>
>> Moreover, and more seriously, there are things we just can't pretty-print
>>
>> right now.  For example, HsExpr has this data constructor:
>>
>>
>>
>>   data HsExpr p = ...
>>
>>     | OpApp       (LHsExpr p)
>>
>>                   (LHsExpr p)
>>
>>                   (PostRn p Fixity)
>>
>>                   (LHsExpr p)
>>
>>
>>
>> To pretty-print the third argument, we'd need to add
>>
>>
>>
>>   instance (Outputable (IdP p),
>>
>>             Outputable (PostRn p Fixity))   -- New
>>
>>         => Outputable (HsExpr p) where
>>
>>      ppr (HsVar v) = ppr v
>>
>>
>>
>> and that gets onerous.  *So today we never print these annotations*, to
>> avoid bloating the instance contexts, which can be painful.  It bit me
>> yesterday.
>>
>>
>>
>> We have bitten that bullet for the Data class: look at
>> HsExtension.DataId, which abbreviates the long list of dictionaries:
>>
>>
>>
>>   type DataId p =
>>
>>     ( Data p
>>
>>     , ForallX Data p
>>
>>     , Data (NameOrRdrName (IdP p))
>>
>>     , Data (IdP p)
>>
>>     , Data (PostRn p (IdP p))
>>
>>     , Data (PostRn p (Located Name))
>>
>>     , Data (PostRn p Bool)
>>
>>     , Data (PostRn p Fixity)
>>
>>      ,..and nine more... )
>>
>>
>>
>> Let me note in passing that [wiki:QuantifiedContexts
>> <https://ghc.haskell.org/trac/ghc/wiki/QuantifiedContexts>] would make
>> this somewhat shorter
>>
>>
>>
>>   type DataId p =
>>
>>     ( Data p
>>
>>     , ForallX Data p
>>
>>     , Data (NameOrRdrName (IdP p))
>>
>>     , Data (IdP p)
>>
>>     , forall t. Data t => Data (PostRn p t))
>>
>>
>>
>> But we still need one item in this list for each type function,
>>
>> and I am worried about how this scales to the
>>
>> [wiki:ImplementingTreesThatGrow
>> <https://ghc.haskell.org/trac/ghc/wiki/ImplementingTreesThatGrow>]
>> story, when we have a type
>>
>> function for each data constructor -- and there are a *lot* of data
>>
>> constructors.
>>
>>
>>
>> *So I have four questions*
>>
>>
>>
>>    1. I think we should probably use a superclass instead of a type
>>    synonym
>>
>>
>>
>> class (Data p, ForallX Data p, ....) => DataId p where {}
>>
>>
>>
>> Why?  Only one argument to pass, and to pass on to successive calls.  I
>> see no downside.
>>
>>
>>
>>    1. Shall we treat Outputable like Data?  (I.e. make an abbreviation
>>    for a long list of Outputable instances and use it everywhere)
>>    2. I thought of another way to do it (pass a token); see below
>>
>>
>>
>>    1. Are there any other ways?
>>
>>
>>
>>
>>
>> *Token passing idea.*
>>
>>
>>
>> Perhaps instead of passing lots of functions, we pass a singleton token
>>
>> that encodes the pass, like this:
>>
>>
>>
>>   instance (PassC p) => Outputable (HsExpr p) where
>>
>>      ppr (HsVar v) = case getPass :: IsPass p of
>>
>>                        IsParsed      -> ppr v
>>
>>                        IsRenamed     -> ppr v
>>
>>                        IsTypechecked -> ppr v
>>
>>
>>
>> The three ppr's are at different types, of course; that's the point.
>>
>> The infrastructure is something like
>>
>>
>>
>>   class PassC p where
>>
>>     getPass :: IsPass p
>>
>>
>>
>>   data IsPass p where
>>
>>     IsParsed      :: IsPass Parsed
>>
>>     IsRenamed     :: IsParsed Renamed
>>
>>     IsTypechecked :: IsParsed Typechecked
>>
>>
>>
>>   instance PassC Parsed where getPass = IsParsed
>>
>>     ...etc...
>>
>>
>>
>> Now we could sweep away all those OutputableX classes,
>>
>> replacing them with dynamic tests on the singletons IsParsed etc.
>>
>> This would have advantages:
>>
>>
>>
>> - Probably faster: there's a dynamic test, but many fewer dictionary
>>
>>   arguments and higher-order function dispatch
>>
>>
>>
>> - Only one dictionary to pass; programming is easier.
>>
>>
>>
>> The big downside is that it's not extensible: it works only because
>>
>> we know the three cases.  But the "Trees that Grow" story really doesn't
>>
>> scale well to pretty-printing: so maybe we should just give up on that?
>>
>>
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-devs/attachments/20170728/5d9d35a7/attachment-0001.html>


More information about the ghc-devs mailing list