<div dir="ltr"><div>Hello all,</div><div><br></div><div>I've been working on the HIE File (<a href="https://ghc.haskell.org/trac/ghc/wiki/HIEFiles">https://ghc.haskell.org/trac/ghc/wiki/HIEFiles</a>) GSoC project.</div><div><br></div><div>The design of the data structure, as well as the traversal of GHC's ASTs to collect all the relevant info, is mostly complete.</div><div><br></div><div>We traverse the Renamed and Typechecked ASTs to collect the following info about each SrcSpan:</div><div><br></div><div>1) Its type, if it corresponds to a binding, pattern or expression</div><div>2) Details about any tokens in the original source corresponding to this span (keywords, symbols, etc.)</div><div>3) The set of Constructor/Type pairs that correspond to this span in the GHC AST</div><div>4) Details about all the identifiers that occur at this SrcSpan</div><div><br></div><div>For each occurrence of an identifier (Name or ModuleName), we store its type (if it has one) and classify it as one of the following, based on how it occurs:</div><div><br></div><div>1) Use</div><div>2) Import/Export</div><div>3) Pattern Binding, along with the scope of the binding and, if it occurs as part of a top-level declaration, do binding or let/where binding, the span of the entire binding location (including the RHS)</div><div>4) Value Binding, along with whether it is an instance binding or not, its scope, and the span of its entire binding site, including the RHS</div><div>5) Type Declaration (class or regular), i.e. a signature of the form foo :: ...</div><div>6) Declaration (class, type, instance, data, type family, etc.)</div><div>7) Type variable binding, along with its scope (which takes ScopedTypeVariables into account)</div><div><br></div><div>I have updated the wiki page with more details about the Scopes associated with bindings:
<a href="https://ghc.haskell.org/trac/ghc/wiki/HIEFiles#Scopeinformationaboutsymbols">https://ghc.haskell.org/trac/ghc/wiki/HIEFiles#Scopeinformationaboutsymbols</a><br></div><div><br></div><div>These annotated SrcSpans are then arranged into an interval/rose tree to aid lookups.<br></div><div><br></div><div>We assume that SrcSpans never partially overlap: any two SrcSpans that occur in the Renamed/Typechecked ASTs are either equal or disjoint, or one strictly contains the other. This assumption has mostly held up so far while testing on the entire ghc:HEAD tree, apart from one case where the typechecker strips out parentheses in the original source, which has been patched (see <a href="https://ghc.haskell.org/trac/ghc/ticket/15242">https://ghc.haskell.org/trac/ghc/ticket/15242</a>).</div><div><br></div><div>I have also written functions that look up the binding site (including the RHS) and the scope of an identifier from the tree. When tested on the ghc:HEAD tree, these functions succeed in looking up scopes for almost all symbol occurrences in all source files. I've also checked that the calculated scope contains all the occurrences of the symbol. The few cases where this check fails are those where the SrcSpans have been mangled by CPP (see <a href="https://ghc.haskell.org/trac/ghc/ticket/15279">https://ghc.haskell.org/trac/ghc/ticket/15279</a>).</div><div><br></div><div>The code for this currently lives here: <a href="https://github.com/haskell/haddock/compare/ghc-head...wz1000:hiefile-2">https://github.com/haskell/haddock/compare/ghc-head...wz1000:hiefile-2</a></div><div><br></div><div>Moving forward, the plan for the rest of the summer is:</div><div><br></div><div>1) Move this into the GHC tree and add a flag that controls whether this info is generated</div><div>2) Write serializers and deserializers for this info</div><div>3) Teach the GHC PackageDb about .hie files</div><div>4) Rewrite Haddock's --hyperlinked-source to use .hie files.<br></div></div>
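<div><br></div><div>To make the interval/rose tree and the no-partial-overlap assumption described above a bit more concrete, here is a minimal sketch. It is not the code from the branch linked above: Span is a simplified stand-in for GHC's SrcSpan, and Node, insertSpan, smallestContaining and example are hypothetical names chosen for illustration. Insertion fails (returns Nothing) as soon as a span partially overlaps an existing one, and annotations for equal spans are merged.</div>

```haskell
import Control.Applicative ((<|>))
import Control.Monad (foldM)

-- Hypothetical stand-in for GHC's SrcSpan: a half-open range
-- [start, end) of (line, column) positions in a single file.
data Span = Span { spanStart :: (Int, Int), spanEnd :: (Int, Int) }
  deriving (Eq, Show)

-- Rose tree node. Invariant: children are pairwise disjoint, sorted by
-- start position, and each one is contained in the parent's span.
data Node a = Node
  { nodeSpan     :: Span
  , nodeInfo     :: a
  , nodeChildren :: [Node a]
  } deriving (Eq, Show)

-- True if the first span contains the second (equality included).
contains :: Span -> Span -> Bool
contains outer inner =
  spanStart outer <= spanStart inner && spanEnd inner <= spanEnd outer

-- Insert an annotated span into a forest that satisfies the invariant.
-- Returns Nothing if the new span partially overlaps an existing one.
insertSpan :: Semigroup a => Span -> a -> [Node a] -> Maybe [Node a]
insertSpan sp info forest =
  case touching of
    -- Disjoint from every sibling: add a new leaf in sorted position.
    [] -> Just (before ++ [Node sp info []] ++ after)
    [n]
      -- Equal spans: merge the annotations.
      | nodeSpan n == sp ->
          Just (before ++ [n { nodeInfo = nodeInfo n <> info }] ++ after)
      -- Strictly inside an existing node: recurse into its children.
      | nodeSpan n `contains` sp -> do
          kids <- insertSpan sp info (nodeChildren n)
          Just (before ++ [n { nodeChildren = kids }] ++ after)
    _
      -- The new span encloses every sibling it touches: adopt them.
      | all ((sp `contains`) . nodeSpan) touching ->
          Just (before ++ [Node sp info touching] ++ after)
      -- Anything else is a partial overlap.
      | otherwise -> Nothing
  where
    (before, rest)    = span (\n -> spanEnd (nodeSpan n) <= spanStart sp) forest
    (touching, after) = span (\n -> spanStart (nodeSpan n) < spanEnd sp) rest

-- Find the innermost node whose span contains the query span.
smallestContaining :: Span -> [Node a] -> Maybe (Node a)
smallestContaining sp forest =
  case filter ((`contains` sp) . nodeSpan) forest of
    (n : _) -> smallestContaining sp (nodeChildren n) <|> Just n
    []      -> Nothing

-- Example forest built from four made-up spans, inserted out of order.
example :: Maybe [Node [String]]
example = foldM (\f (s, i) -> insertSpan s i f) []
  [ (Span (1, 0) (1, 5),  ["HsVar"])
  , (Span (1, 7) (1, 19), ["HsLit"])
  , (Span (1, 0) (1, 20), ["HsApp"])
  , (Span (1, 6) (1, 20), ["HsPar"])
  ]
```

<div>With this shape, looking up the node for a source position amounts to walking down from the root and picking the one child that contains the position, which is where the assumption that spans never partially overlap is relied upon.</div>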