GHC AST Annotations

Alan & Kim Zimmerman alan.zimm at gmail.com
Sun Sep 7 08:30:07 UTC 2014


If this is done right, it can enable this sort of thing:
http://www.davidchristiansen.dk/2014/09/06/pretty-printing-idris/


On Fri, Sep 5, 2014 at 5:11 PM, Alan & Kim Zimmerman <alan.zimm at gmail.com>
wrote:

> Hi Neil
>
> Thanks for the feedback.
>
> I am going to start putting together a proof of concept, aiming to
> identify what annotations are needed to roundtrip source.
>
> The first version will make use of the index-into-a-separate-structure
> scheme, so that it can be used with existing GHC ASTs. Hopefully the
> information gained will help in understanding what is needed for the
> changes to the future AST.
>
> The concept I will be working with is a pretty-printer, where relative
> spacing for the particular elements is derived from the initial SrcSpan
> information. Any new elements added or changed in the AST can then have
> only relative information, and the final render should honour the layout
> from the original.
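
A minimal sketch of deriving relative spacing from absolute positions, as described above. The types Pos and DeltaPos and the functions toDelta and fromDelta are hypothetical stand-ins for illustration, not GHC's SrcSpan API. The convention used here (a column gap on the same line, an absolute column after a line break) is only one possible choice.

```haskell
-- Toy absolute position; a stand-in for the start or end of a SrcSpan.
data Pos = Pos { posLine :: Int, posCol :: Int }
  deriving (Eq, Show)

-- Offset of an element relative to the end of the previous element.
data DeltaPos = DeltaPos { deltaLines :: Int, deltaCols :: Int }
  deriving (Eq, Show)

-- Derive the relative offset from two absolute positions.
-- On the same line the column delta is the gap between the two elements;
-- after a line break it is the column on the new line.
toDelta :: Pos -> Pos -> DeltaPos
toDelta (Pos l0 c0) (Pos l1 c1)
  | l1 == l0  = DeltaPos 0 (c1 - c0)
  | otherwise = DeltaPos (l1 - l0) c1

-- Re-apply a relative offset to a (possibly moved) previous end position.
fromDelta :: Pos -> DeltaPos -> Pos
fromDelta (Pos l c) (DeltaPos 0 dc)  = Pos l (c + dc)
fromDelta (Pos l _) (DeltaPos dl dc) = Pos (l + dl) dc
```

With this shape, an element inserted by a tool only needs a DeltaPos, and the renderer can recompute absolute positions from whatever precedes it.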
>
> It may be possible to harmonise this with Chris Done's hindent package,
> which is a code-specific pretty printer for haskell-src-exts.
>
> Alan
>
>
> On Sat, Aug 30, 2014 at 11:18 PM, Neil Mitchell <ndmitchell at gmail.com>
> wrote:
>
>> Since Alan is trying to do something for HaRe that I want for HLint on
>> top of haskell-src-exts, he asked me for my opinions on the proposal.
>> There seem to be two approaches to take:
>>
>> * Add SrcSpans throughout. The HSE approach of having a list of inner
>> source spans is nasty - the details of which source span goes where
>> are entirely undocumented and hard to discover. Even worse, for things
>> like instance, which may or may not have a where after, the number of
>> inner SrcSpans changes. Simon's idea of hsdo_do_loc is much cleaner,
>> and easily extends to Maybe SrcSpan if the keyword is optional.
>>
>> * Having the annotation be a type parameter gives much greater
>> flexibility. In particular, it would let you mark certain nodes as
>> being added/deleted. However, since SrcSpan has an Int in it, you can
>> always pass around a separate IntMap and make the SrcSpan really be an
>> index into more detailed information. It's nasty, but only the people
>> who use it pay for it.
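
A rough sketch of the "index into a separate IntMap" idea from the second point, on a toy AST. Everything here (NodeInfo, Expr, Annotations, exprKey, lookupInfo, markAdded) is hypothetical and only illustrates the shape; it assumes the containers package is available.

```haskell
import qualified Data.IntMap.Strict as IntMap

-- Hypothetical extra detail a tool might want to keep per node.
data NodeInfo = NodeInfo
  { keywordCols :: [Int]  -- columns of keywords such as "do" or "where"
  , isAdded     :: Bool   -- node was introduced by a tool, not parsed
  } deriving (Eq, Show)

-- Toy AST whose location slot holds only an Int key.
data Expr
  = Var Int String
  | App Int Expr Expr
  deriving (Eq, Show)

-- The side table that holds the detailed information.
type Annotations = IntMap.IntMap NodeInfo

-- The key stored in the node itself.
exprKey :: Expr -> Int
exprKey (Var k _)   = k
exprKey (App k _ _) = k

-- Look up the detailed information for a node, if any was recorded.
lookupInfo :: Annotations -> Expr -> Maybe NodeInfo
lookupInfo anns e = IntMap.lookup (exprKey e) anns

-- Mark a node as added without touching the AST itself.
markAdded :: Expr -> Annotations -> Annotations
markAdded e = IntMap.adjust (\i -> i { isAdded = True }) (exprKey e)
```

Code that ignores the side table sees an unchanged AST, which is the "only the people who use it pay for it" property.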
>>
>> Both approaches have disadvantages. You could always combine both
>> ideas, and have a SrcSpan and entirely separately an annotation (which
>> defaults to (), rather than SrcSpanInfo), but maybe that's too much
>> extra baggage on the AST.
>>
>> Thanks, Neil
>>
>>
>> On Sat, Aug 30, 2014 at 3:32 PM, Alan & Kim Zimmerman
>> <alan.zimm at gmail.com> wrote:
>> > A further use case would be to be able to convert all the locations to
>> > be relative, or include a relative portion, so that as tools manipulate
>> > the AST by adding or removing parts the layout can be preserved.
>> >
>> > I think I may need to make a wip branch for this and experiment; it is
>> > always easier to comment on concrete things.
>> >
>> > Alan
>> >
>> >
>> > On Thu, Aug 28, 2014 at 10:38 PM, Simon Peyton Jones
>> > <simonpj at microsoft.com> wrote:
>> >>
>> >> I think the key question is whether it is acceptable to sprinkle this
>> >> kind of information throughout the AST. For someone interested in
>> >> source-to-source conversions (like me) this is great, others may find
>> >> it intrusive.
>> >>
>> >> It’s probably not too bad if you use record syntax; thus
>> >>
>> >>   | HsDo  { hsdo_do_loc :: SrcSpan              -- of the word "do"
>> >>           , hsdo_blocks :: BlockSrcSpans
>> >>           , hsdo_ctxt   :: HsStmtContext Name
>> >>           , hsdo_stmts  :: [ExprLStmt id]
>> >>           , hsdo_type   :: PostTcType }
>> >>
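
A minimal sketch of the record-syntax idea on a toy type, to show why named fields can make the extra locations less intrusive. Span, Decl, and all the inst_* field names are hypothetical and not part of GHC; the Maybe field illustrates one way a keyword that may be absent could be represented.

```haskell
-- Toy span type standing in for GHC's SrcSpan.
data Span = Span { spanStart :: (Int, Int), spanEnd :: (Int, Int) }
  deriving (Eq, Show)

-- Toy declaration: the field names document which keyword each span is for.
data Decl
  = InstDecl
      { inst_instance_loc :: Span        -- of the word "instance"
      , inst_where_loc    :: Maybe Span  -- of the word "where", if present
      , inst_head         :: String
      , inst_methods      :: [String]
      }
  deriving (Eq, Show)

-- Consumers that do not care about keyword locations can ignore them
-- by matching only on the fields they need.
methodNames :: Decl -> [String]
methodNames InstDecl { inst_methods = ms } = ms

-- Tools that do care get a named accessor instead of a list position.
hasWhere :: Decl -> Bool
hasWhere d = case inst_where_loc d of
  Just _  -> True
  Nothing -> False
```

Adding another location field later would not break methodNames, since the record pattern names only the fields it uses.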
>> >>
>> >>
>> >> Simon
>> >>
>> >>
>> >>
>> >> From: Alan & Kim Zimmerman [mailto:alan.zimm at gmail.com]
>> >> Sent: 28 August 2014 19:35
>> >> To: Richard Eisenberg
>> >> Cc: Simon Peyton Jones; ghc-devs at haskell.org
>> >> Subject: Re: GHC AST Annotations
>> >>
>> >>
>> >>
>> >> This does have the advantage of being explicit. I modelled the initial
>> >> proposal on HSE as a proven solution, and I think that they were trying
>> >> to keep it non-invasive, to allow both an annotated and non-annotated AST.
>> >>
>> >> I think the key question is whether it is acceptable to sprinkle this
>> >> kind of information throughout the AST. For someone interested in
>> >> source-to-source conversions (like me) this is great, others may find
>> >> it intrusive.
>> >>
>> >> The other question, which is probably orthogonal to this, is whether we
>> >> want the annotation to be a parameter to the AST, which allows it to be
>> >> overridden by various tools for various purposes, or fixed as in
>> >> Richard's suggestion.
>> >>
>> >> A parameterised annotation allows the annotations to be manipulated via
>> >> something like this, from HSE:
>> >>
>> >> -- | AST nodes are annotated, and this class allows manipulation of the
>> >> -- annotations.
>> >> class Functor ast => Annotated ast where
>> >>
>> >>   -- | Retrieve the annotation of an AST node.
>> >>   ann :: ast l -> l
>> >>
>> >>   -- | Change the annotation of an AST node. Note that only the annotation
>> >>   -- of the node itself is affected, and not the annotations of any child
>> >>   -- nodes. If all nodes in the AST tree are to be affected, use fmap.
>> >>   amap :: (l -> l) -> ast l -> ast l
>> >>
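
A small sketch of how a class of this shape behaves on a toy AST. The class is restated so the example stands alone; the Expr type and its instance are hypothetical and far simpler than the HSE or GHC syntax trees.

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- Class of the same shape as the HSE one quoted above.
class Functor ast => Annotated ast where
  ann  :: ast l -> l
  amap :: (l -> l) -> ast l -> ast l

-- Toy expression type with the annotation as a type parameter.
data Expr l
  = Var l String
  | App l (Expr l) (Expr l)
  deriving (Eq, Show, Functor)

instance Annotated Expr where
  ann (Var l _)   = l
  ann (App l _ _) = l
  -- Only the top node's annotation changes; children are left alone.
  amap f (Var l n)   = Var (f l) n
  amap f (App l a b) = App (f l) a b

-- Because the annotation is a parameter, fmap can replace it wholesale,
-- for example to drop all annotations.
stripAnn :: Expr l -> Expr ()
stripAnn = fmap (const ())
```

Note that amap keeps the annotation type fixed, while fmap may change it, which is what would let a tool swap in its own annotation type.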
>> >>
>> >>
>> >> Alan
>> >>
>> >>
>> >>
>> >> On Thu, Aug 28, 2014 at 7:11 PM, Richard Eisenberg <eir at cis.upenn.edu>
>> >> wrote:
>> >>
>> >> For what it's worth, my thought is not to use SrcSpanInfo (which, to me,
>> >> is the wrong way to slice the abstraction) but instead to add SrcSpan
>> >> fields to the relevant nodes. For example:
>> >>
>> >>   | HsDo        SrcSpan              -- of the word "do"
>> >>                 BlockSrcSpans
>> >>                 (HsStmtContext Name) -- The parameterisation is unimportant
>> >>                                      -- because in this context we never use
>> >>                                      -- the PatGuard or ParStmt variant
>> >>                 [ExprLStmt id]       -- "do":one or more stmts
>> >>                 PostTcType           -- Type of the whole expression
>> >>
>> >> ...
>> >>
>> >> data BlockSrcSpans = LayoutBlock Int  -- the parameter is the indentation level
>> >>                                  ...  -- stuff to track the appearance of any semicolons
>> >>                    | BracesBlock ...  -- stuff to track the braces and semicolons
>> >>
>> >>
>> >> The way I understand it, the SrcSpanInfo proposal means that we would
>> >> have lots of empty SrcSpanInfos, no? Most interior nodes don't need one,
>> >> I think.
>> >>
>> >> Popping up a level, I do support the idea of including this info in the
>> >> AST.
>> >>
>> >> Richard
>> >>
>> >>
>> >> On Aug 28, 2014, at 11:54 AM, Simon Peyton Jones
>> >> <simonpj at microsoft.com> wrote:
>> >>
>> >> > In general I’m fine with this direction of travel. Some specifics:
>> >> >
>> >> > * You’d have to be careful to document, for every data constructor
>> >> >   in HsSyn, what the association is between the [SrcSpan] in the
>> >> >   SrcSpanInfo and the “sub-entities”.
>> >> > * Many of the sub-entities will have their own SrcSpanInfo wrapped
>> >> >   around them, so there’s some unhelpful duplication. Maybe you only
>> >> >   want the SrcSpanInfo to list the [SrcSpan]s for the sub-entities
>> >> >   (like the syntactic keywords) that do not show up as children in the
>> >> >   syntax tree?
>> >> >
>> >> > Anyway do by all means create a GHC Trac wiki page to describe your
>> >> > proposed design, concretely.
>> >> >
>> >> > Simon
>> >> >
>> >> > From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of
>> >> > Alan & Kim Zimmerman
>> >> > Sent: 28 August 2014 15:00
>> >> > To: ghc-devs at haskell.org
>> >> > Subject: GHC AST Annotations
>> >> >
>> >> > Now that the landmines have hopefully been cleared from the AST via [1]
>> >> > I would like to propose changing the location information in the AST.
>> >> >
>> >> > Right now the locations of syntactic markers such as do/let/where/in/of
>> >> > in the source are discarded from the AST, although they are retained in
>> >> > the rich token stream.
>> >> >
>> >> > The haskell-src-exts package deals with this by means of the
>> >> > SrcSpanInfo data type [2], which contains the SrcSpan as per the current
>> >> > GHC Located type but also has a list of SrcSpans for the syntactic
>> >> > markers, depending on the particular AST fragment being annotated.
>> >> >
>> >> > In addition, the annotation type is provided as a parameter to the AST,
>> >> > so that it can be changed as required, see [3].
>> >> >
>> >> > The motivation for this change is then
>> >> >
>> >> > 1. Simplify the roundtripping and modification of source by explicitly
>> >> > capturing the missing location information for the syntactic markers.
>> >> >
>> >> > 2. Allow the annotation to be a parameter so that it can be replaced
>> >> > with a different one in tools, for example HaRe would include the tokens
>> >> > for the AST fragment leaves.
>> >> >
>> >> > 3. Aim for some level of compatibility with haskell-src-exts so that
>> >> > tools developed for it could be easily ported to GHC, for example
>> >> > exactprint [4].
>> >> >
>> >> >
>> >> >
>> >> > I would like feedback as to whether this would be acceptable, or if the
>> >> > same goals should be achieved a different way.
>> >> >
>> >> >
>> >> >
>> >> > Regards
>> >> >
>> >> >   Alan
>> >> >
>> >> >
>> >> >
>> >> >
>> >> > [1] https://phabricator.haskell.org/D157
>> >> >
>> >> > [2]
>> >> > http://hackage.haskell.org/package/haskell-src-exts-1.15.0.1/docs/Language-Haskell-Exts-SrcLoc.html#t:SrcSpanInfo
>> >> >
>> >> > [3]
>> >> > http://hackage.haskell.org/package/haskell-src-exts-1.15.0.1/docs/Language-Haskell-Exts-Annotated-Syntax.html#t:Annotated
>> >> >
>> >> > [4]
>> >> > http://hackage.haskell.org/package/haskell-src-exts-1.15.0.1/docs/Language-Haskell-Exts-Annotated-ExactPrint.html#v:exactPrint
>> >> >
>> >>
>> >> > _______________________________________________
>> >> > ghc-devs mailing list
>> >> > ghc-devs at haskell.org
>> >> > http://www.haskell.org/mailman/listinfo/ghc-devs
>> >>
>> >>
>> >
>> >
>> >
>> >
>>
>
>