GHC AST Annotations
Simon Peyton Jones
simonpj at microsoft.com
Thu Aug 28 20:38:45 UTC 2014
I think the key question is whether it is acceptable to sprinkle this kind of information throughout the AST. For someone interested in source-to-source conversions (like me) this is great, others may find it intrusive.
It’s probably not too bad if you use record syntax; thus
  | HsDo { hsdo_do_loc :: SrcSpan             -- of the word "do"
         , hsdo_blocks :: BlockSrcSpans
         , hsdo_ctxt   :: HsStmtContext Name
         , hsdo_stmts  :: [ExprLStmt id]
         , hsdo_type   :: PostTcType }
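
As a rough illustration of why record syntax helps here, the sketch below uses a cut-down toy type, not the real HsSyn. SrcSpan, Stmt, Expr, stmtCount and doKeywordSpan are all hypothetical stand-ins; only the field names hsdo_do_loc and hsdo_stmts are taken from the declaration above. A consumer that does not care about locations matches only on the field it needs, so adding further location fields later would not touch it.

```haskell
-- Hypothetical stand-in for GHC's SrcSpan: one line, start and end column.
data SrcSpan = SrcSpan { spanLine :: Int, spanStartCol :: Int, spanEndCol :: Int }
  deriving (Show, Eq)

-- Hypothetical stand-in for a statement.
newtype Stmt = Stmt String
  deriving (Show, Eq)

-- Toy expression type; only two of the fields from the declaration above.
data Expr
  = HsVar String
  | HsDo { hsdo_do_loc :: SrcSpan   -- of the word "do"
         , hsdo_stmts  :: [Stmt] }
  deriving (Show, Eq)

-- A consumer that ignores locations names only the field it uses.
stmtCount :: Expr -> Int
stmtCount (HsDo { hsdo_stmts = ss }) = length ss
stmtCount _                          = 0

-- A source-to-source tool picks out the keyword location the same way.
doKeywordSpan :: Expr -> Maybe SrcSpan
doKeywordSpan (HsDo { hsdo_do_loc = l }) = Just l
doKeywordSpan _                          = Nothing
```

With positional constructor arguments, both functions would have to be edited every time a field is added to HsDo; with record patterns they would not.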
Simon
From: Alan & Kim Zimmerman [mailto:alan.zimm at gmail.com]
Sent: 28 August 2014 19:35
To: Richard Eisenberg
Cc: Simon Peyton Jones; ghc-devs at haskell.org
Subject: Re: GHC AST Annotations
This does have the advantage of being explicit. I modelled the initial proposal on HSE as a proven solution, and I think that they were trying to keep it non-invasive, to allow both an annotated and non-annotated AST.
I think the key question is whether it is acceptable to sprinkle this kind of information throughout the AST. For someone interested in source-to-source conversions (like me) this is great, others may find it intrusive.
The other question, which is probably orthogonal to this, is whether we want the annotation to be a parameter to the AST, which allows it to be overridden by various tools for various purposes, or fixed as in Richard's suggestion.
A parameterised annotation allows the annotations to be manipulated via something like the following class from HSE:
-- |AST nodes are annotated, and this class allows manipulation of the annotations.
class Functor ast => Annotated ast where
    -- |Retrieve the annotation of an AST node.
    ann :: ast l -> l
    -- |Change the annotation of an AST node. Note that only the annotation of the node itself is affected, and not
    --  the annotations of any child nodes. if all nodes in the AST tree are to be affected, use fmap.
    amap :: (l -> l) -> ast l -> ast l
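
To make the difference between amap and fmap concrete, here is a minimal sketch on a toy AST. The Expr type and the example value are hypothetical and are neither GHC's nor HSE's types; the class is repeated so the fragment stands alone, and Int stands in for a real annotation type.

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- Hypothetical toy AST, parameterised over its annotation type l.
data Expr l
  = Var l String
  | App l (Expr l) (Expr l)
  deriving (Show, Eq, Functor)

-- Same shape as the class quoted above.
class Functor ast => Annotated ast where
  ann  :: ast l -> l
  amap :: (l -> l) -> ast l -> ast l

instance Annotated Expr where
  ann (Var l _)   = l
  ann (App l _ _) = l
  -- amap touches only the annotation of the top node.
  amap f (Var l n)   = Var (f l) n
  amap f (App l a b) = App (f l) a b

-- "f x", with an Int standing in for a real annotation.
example :: Expr Int
example = App 0 (Var 1 "f") (Var 2 "x")
```

Here amap (+10) example changes only the annotation on the App node, while fmap (+10) example changes the annotations on App and on both Var children.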
Alan
On Thu, Aug 28, 2014 at 7:11 PM, Richard Eisenberg <eir at cis.upenn.edu> wrote:
For what it's worth, my thought is not to use SrcSpanInfo (which, to me, is the wrong way to slice the abstraction) but instead to add SrcSpan fields to the relevant nodes. For example:
  | HsDo  SrcSpan               -- of the word "do"
          BlockSrcSpans
          (HsStmtContext Name)  -- The parameterisation is unimportant
                                -- because in this context we never use
                                -- the PatGuard or ParStmt variant
          [ExprLStmt id]        -- "do":one or more stmts
          PostTcType            -- Type of the whole expression
...
data BlockSrcSpans = LayoutBlock Int  -- the parameter is the indentation level
                                 ...  -- stuff to track the appearance of any semicolons
                   | BracesBlock ...  -- stuff to track the braces and semicolons
The way I understand it, the SrcSpanInfo proposal means that we would have lots of empty SrcSpanInfos, no? Most interior nodes don't need one, I think.
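
The concern about empty annotations can be sketched on a toy tree. Everything below is a simplified stand-in loosely modelled on the HSE idea of a node span plus a list of keyword spans; SrcSpan, SrcSpanInfo, Expr, emptyPoints and sample are hypothetical and are not the real haskell-src-exts or GHC definitions.

```haskell
-- Hypothetical stand-in: inclusive start and end column on a single line.
data SrcSpan = SrcSpan { startCol :: Int, endCol :: Int }
  deriving (Show, Eq)

-- Span of the whole node plus spans of syntactic markers such as "do".
data SrcSpanInfo = SrcSpanInfo
  { infoSpan   :: SrcSpan
  , infoPoints :: [SrcSpan]
  } deriving (Show, Eq)

-- Toy AST in which every node carries a SrcSpanInfo.
data Expr
  = Var SrcSpanInfo String
  | App SrcSpanInfo Expr Expr
  | Do  SrcSpanInfo [Expr]
  deriving (Show, Eq)

info :: Expr -> SrcSpanInfo
info (Var i _)   = i
info (App i _ _) = i
info (Do i _)    = i

children :: Expr -> [Expr]
children (Var _ _)   = []
children (App _ f x) = [f, x]
children (Do _ ss)   = ss

-- Count the nodes whose list of keyword spans is empty.
emptyPoints :: Expr -> Int
emptyPoints e = here + sum (map emptyPoints (children e))
  where here = if null (infoPoints (info e)) then 1 else 0

-- The one-line source text "do f x".
sample :: Expr
sample =
  Do (SrcSpanInfo (SrcSpan 1 6) [SrcSpan 1 2])
     [ App (SrcSpanInfo (SrcSpan 4 6) [])
           (Var (SrcSpanInfo (SrcSpan 4 4) []) "f")
           (Var (SrcSpanInfo (SrcSpan 6 6) []) "x") ]
```

In this sample only the Do node has a keyword to record; the App node and both Var nodes carry an empty list, which is the overhead that explicit SrcSpan fields on the relevant constructors would avoid.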
Popping up a level, I do support the idea of including this info in the AST.
Richard
On Aug 28, 2014, at 11:54 AM, Simon Peyton Jones <simonpj at microsoft.com> wrote:
> In general I’m fine with this direction of travel. Some specifics:
>
> · You’d have to be careful to document, for every data constructor in HsSyn, what the association is between the [SrcSpan] in the SrcSpanInfo and the “sub-entities”.
> · Many of the sub-entities will have their own SrcSpanInfo wrapped around them, so there’s some unhelpful duplication. Maybe you only want the SrcSpanInfo to list the [SrcSpan]s for the sub-entities (like the syntactic keywords) that do not show up as children in the syntax tree?
> Anyway do by all means create a GHC Trac wiki page to describe your proposed design, concretely.
>
> Simon
>
> From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of Alan & Kim Zimmerman
> Sent: 28 August 2014 15:00
> To: ghc-devs at haskell.org
> Subject: GHC AST Annotations
>
> Now that the landmines have hopefully been cleared from the AST via [1] I would like to propose changing the location information in the AST.
>
> Right now the locations of syntactic markers such as do/let/where/in/of in the source are discarded from the AST, although they are retained in the rich token stream.
>
> The haskell-src-exts package deals with this by using the SrcSpanInfo data type [2], which contains the SrcSpan as per the current GHC Located type but also has a list of SrcSpans for the syntactic markers, depending on the particular AST fragment being annotated.
>
> In addition, the annotation type is provided as a parameter to the AST, so that it can be changed as required, see [3].
>
> The motivation for this change is then
>
> 1. Simplify the roundtripping and modification of source by explicitly capturing the missing location information for the syntactic markers.
>
> 2. Allow the annotation to be a parameter so that it can be replaced with a different one in tools, for example HaRe would include the tokens for the AST fragment leaves.
>
> 3. Aim for some level of compatibility with haskell-src-exts so that tools developed for it could be easily ported to GHC, for example exactprint [4].
>
>
>
> I would like feedback as to whether this would be acceptable, or if the same goals should be achieved a different way.
>
>
>
> Regards
>
> Alan
>
>
>
>
> [1] https://phabricator.haskell.org/D157
>
> [2] http://hackage.haskell.org/package/haskell-src-exts-1.15.0.1/docs/Language-Haskell-Exts-SrcLoc.html#t:SrcSpanInfo
>
> [3] http://hackage.haskell.org/package/haskell-src-exts-1.15.0.1/docs/Language-Haskell-Exts-Annotated-Syntax.html#t:Annotated
>
> [4] http://hackage.haskell.org/package/haskell-src-exts-1.15.0.1/docs/Language-Haskell-Exts-Annotated-ExactPrint.html#v:exactPrint
>
> _______________________________________________
> ghc-devs mailing list
> ghc-devs at haskell.org
> http://www.haskell.org/mailman/listinfo/ghc-devs