TTG: Handling Source Locations

Shayan Najd sh.najd at gmail.com
Tue Feb 12 10:19:33 UTC 2019


Hi Richard,

> [Richard:]
> It seems to me, though, that this move makes us *less typed*.
> [and]
> However, worse, we might forget to *add* a location when downstream functions expect one.

We had a more sophisticated version of TTG that could support the
ping-pong style (and hence typed tagging of locations), but it came at
the price of more complicated encoding [0].
We have decided to abandon the more typed variant since tracking
whether a node is located or not is inherently a dynamic/run-time
process, not a static/compile-time process:
there are some nodes that are generated in the process by the compiler
by an arbitrary logic (hard to encode by types), hence have no
location (in the source code).

The types `LHsExpr`, `LPat`, and the like will be deleted! It will be
all `HsExpr`, `Pat` and the like.
Baking-in, e.g. `LHsExpr` into `HsExpr`, was a mistake in the first
place: we were cheating using `Maybe` type anyway, when for example an
`LHsExpr` was forcibly required but we had only `HsExpr` and used
`noLoc`.

> Sorry for reopening something that has already been debated, but (unless I'm missing something) the current state of affairs seems like a potential wellspring of subtle bugs.

We were really careful about the refactoring. The new code aside, I
don't see how we can introduce any bugs by the refactoring of the old
code explained in the wiki.
About the new code, the convention is straightforward: anytime you
destruct an AST node, assume a wrapper node inside (add an extra
case), or use the smart constructors/pattern synonyms.

I'd be happy to rediscuss the design space here. It would be great to
have everyone fully on board as it is not a trivial change.

/Shayan

[0] https://github.com/shayan-najd/HsAST/blob/master/Paper.pdf

On Sat, 9 Feb 2019 at 17:19, Richard Eisenberg <rae at cs.brynmawr.edu> wrote:
>
> Hi devs,
>
> I just came across [TTG: Handling Source Locations], as I was poking around in RdrHsSyn and found wondrous things like (dL->L wiz waz) all over the place.
>
> General outline: https://ghc.haskell.org/trac/ghc/wiki/ImplementingTreesThatGrow/HandlingSourceLocations
> Phab diff: https://phabricator.haskell.org/D5036
> Trac ticket: https://ghc.haskell.org/trac/ghc/ticket/15495
> Commit: https://gitlab.haskell.org/ghc/ghc/commit/509d5be69c7507ba5d0a5f39ffd1613a59e73eea
>
> I see why this change is wanted and how the new version works.
>
> It seems to me, though, that this move makes us *less typed*. That is, it would be very easy (and disastrous) to forget to match on a location node. For example, I can now do this:
>
> > foo :: LPat p -> ...
> > foo (VarPat ...) = ...
>
> Note that I have declared that foo takes a located pat, but then I forgot to extract the location with dL. This would type-check, but it would fail. Previously, the type checker would ensure that I didn't forget to match on the L constructor. This error would get caught after some poking about, because foo just wouldn't work.
>
> However, worse, we might forget to *add* a location when downstream functions expect one. This would be harder to detect, for two reasons:
> 1. The problem is caught at deconstruction, and figuring out where an object was constructed can be quite hard.
> 2. The problem might silently cause trouble, because dL won't actually fail on a node missing a location -- it just gives noSrcSpan. So the problem would manifest as a subtle degradation in the quality of an error message, perhaps not caught until several patches (or years!) later.
>
> So I'm uncomfortable with this direction of travel.
>
> Has this aspect of this design been brought up before? I have to say I don't have a great solution to suggest. Perhaps the best I can think of is to make Located a type family. It would branch on the type index to HsSyn types, introducing a Located node for GhcPass but not for other types. This Isn't really all that extensible (I think) and it gives special status to GHC's usage of the AST. But it seems to solve the immediate problems without the downside above.
>
> Sorry for reopening something that has already been debated, but (unless I'm missing something) the current state of affairs seems like a potential wellspring of subtle bugs.
>
> Thanks,
> Richard
> _______________________________________________
> ghc-devs mailing list
> ghc-devs at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


More information about the ghc-devs mailing list