Interested to help with error messages
Alan & Kim Zimmerman
alan.zimm at gmail.com
Sat Jun 3 15:56:58 UTC 2017
Thanks Ben, a great summary. Is there a Wiki page for this? It feels like
it should be on one, so we can easily comment/update the individual points.
In terms of the pretty-printer and its string type, perhaps we could
backpackify it to use http://next.hackage.haskell.org:8080/package/str-sig,
and then specialise the GHC version to FastString etc.
On 3 June 2017 at 17:50, Ben Gamari <ben at smart-cactus.org> wrote:
> * Alfredo Di Napoli for his on-going work in this area
> * Shivansh Rai for his interest in contributing
> * David Luposchainsky for his recent pretty-printer library
> * Richard Eisenberg due to his participation in #8809
> * Bartosz for his participation in #10735
> * Alan Zimmerman for his interest in Haskell tooling
> My apologies for the tome that follows. I have been thinking about this
> problem recently and think an overview of where we stand would be helpful.
> Siddharth Bhat <siddu.druid at gmail.com> writes:
> > Hello all,
> > Thanks for all the work that's put into GHC :)
> Thanks for your interest in helping!
> > I've tried to get into GHC development before, but I was unsuccessful,
> > mostly because I didn't dedicate enough time to understanding the problem
> > at hand & exploring the codebase.
> > I'd like to give it another shot. This time, I think I have a clear idea
> > of what I want to help with: have Haskell's error messages be easier to
> > read and understand.
> > 1. Colors and layout to highlight important parts of the error messages
> As I say below, I think #8809 will provide a good foundation for
> improvements here. More on this below.
> > 2. Clear formatting & naming of errors, so they're easily googleable,
> > stack-overflow able, etc.
> Indeed this is a great goal. Do you have a list of error messages that
> you think are particularly egregious in this respect? Are you advocating
> that we give error classes unique identifiers (e.g. as rustc does IIRC)
> or are you merely suggesting that we improve the wording of the existing
> messages?
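To make the "unique identifiers" option concrete, here is a minimal sketch of what a searchable error code could look like. It is loosely modelled on rustc-style codes; `ErrCode`, `renderCode`, `withCode`, and the `E0042` numbering are all hypothetical and not anything GHC provides.

```haskell
import Text.Printf (printf)

-- Hypothetical: a stable numeric identifier for one class of error.
newtype ErrCode = ErrCode Int

-- Render the code in a fixed-width form that is easy to search for.
renderCode :: ErrCode -> String
renderCode (ErrCode n) = printf "E%04d" n

-- Prefix a message with its code so the first line identifies the error class.
withCode :: ErrCode -> String -> String
withCode code msg = "error[" ++ renderCode code ++ "]: " ++ msg
```

The point of the fixed-width code is that it stays the same even if the wording of the message is later improved, so old search results and answers remain findable.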
> > 3. better hints with error messages, and perhaps integrated lints(?).
> This sounds like a noble goal, but it's a bit unclear how you get there.
> We currently do try to give hints where possible, but of course we could
> always offer more. It would be helpful to have a set of concrete
> examples to discuss.
> > 4. I don't know if this is already possible, but allowing GHC errors to
> > be shipped off as JSON or something to interested tooling.
> Indeed, this would be great. Thanks to Matthew Pickering we already
> offer some limited form of this in 8.2, but I think having more
> structured error documents as suggested in #8809 would make this even
>  https://downloads.haskell.org/~ghc/master/users-guide//
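As an illustration of shipping errors to tooling, the following is a small sketch of serialising a diagnostic to JSON by hand. The `Diagnostic` record and its field names are assumptions made for this example; it does not reproduce the format that GHC 8.2 emits, and the escaping is deliberately incomplete.

```haskell
import Data.List (intercalate)

data Severity = SevWarning | SevError

-- Hypothetical diagnostic record; the field names are illustrative only.
data Diagnostic = Diagnostic
  { diagFile     :: FilePath
  , diagLine     :: Int
  , diagColumn   :: Int
  , diagSeverity :: Severity
  , diagMessage  :: String
  }

-- Escape only the characters this sketch handles; a real encoder must
-- also cover the remaining control characters.
escape :: String -> String
escape = concatMap esc
  where
    esc '"'  = "\\\""
    esc '\\' = "\\\\"
    esc '\n' = "\\n"
    esc c    = [c]

-- Wrap an escaped string in double quotes.
quote :: String -> String
quote s = "\"" ++ escape s ++ "\""

severityName :: Severity -> String
severityName SevWarning = "warning"
severityName SevError   = "error"

-- Encode one diagnostic as a single-line JSON object.
toJSON :: Diagnostic -> String
toJSON d = "{" ++ intercalate "," fields ++ "}"
  where
    field k v = quote k ++ ":" ++ v
    fields =
      [ field "file"     (quote (diagFile d))
      , field "line"     (show (diagLine d))
      , field "column"   (show (diagColumn d))
      , field "severity" (quote (severityName (diagSeverity d)))
      , field "message"  (quote (diagMessage d))
      ]
```

Note that the message is still one opaque string here; the structured documents discussed in #8809 would let a consumer see the types and spans inside it as well.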
> # The State of #8809
> > I saw this ticket on trac: https://ghc.haskell.org/trac/ghc/ticket/8809
> > I would like to take this up, but I'd like help / pointers and stuff. I
> > have GHC setup, I know how to use phabricator, but.. where do I start? :)
> This ticket has recently seen quite a bit of activity and I've been
> meaning to write down some thoughts on it. Here it goes:
> Currently Alfredo Di Napoli is working on the `pretty` library to
> both improve performance and allow us to drop GHC's fork (see #10735),
> perhaps to use annotated pretty-printer documents. Meanwhile, David
> Luposchainsky has recently released his `prettyprinter` library,
> which may serve as a drop-in replacement for `pretty` and handles all of
> the cases that Alfredo is working on. Moreover, Shivansh Rai has also
> recently expressed interest in helping out with this effort.
> All of this is great news: I have been hoping we'd get Idris-style
> errors for quite some time. However, given how many hands we have in
> this area, we should be careful not to step on each other's toes. Below I'll
> describe the various facets of the task as I see them.
>  https://github.com/haskell/pretty/pull/43
>  https://www.reddit.com/r/haskell/comments/6e62i5/ann_
> # Choice of pretty printer
> It seems like we first need to resolve the question of whether switching
> from (our fork of) `pretty` to the `prettyprinter` library is
> worthwhile. The argument for moving to `prettyprinter` is its support
> for optimized infinite-band-width layout, which is one of the things
> holding us back from moving back to `pretty`.
> However, there are two impediments to switching,
> * `prettyprinter` depends upon the `text` package while GHC does not.
>   Making GHC dependent on `text` is an option, but we should be
>   careful. Adding a dependency has a non-trivial cost (GHC build times
>   rise, GHC API users are stuck using whatever dependency versions GHC
>   uses, release engineering is a bit more complicated).
>   Currently GHC has its own abstractions for working with text
>   efficiently, LitString and FastString. FastString is used throughout
>   the compiler, including the pretty-printer, and represents a
>   dense UTF-8 buffer (and a hash for quick comparison). It's not clear
>   that we would want to move it to `text` as this would introduce
>   UTF-8/UTF-16
> * `prettyprinter` doesn't support rendering to `String`. This
>   essentially means that we either use `Text` or fork the package.
>   However, if we have already decided on depending on `text`, then
>   perhaps the former isn't so bad.
> It's unclear to me exactly how difficult switching would be compared to
> finishing up the work Alfredo has started on `pretty`. Alfredo, what is
> your opinion?
> If we decide against moving to `prettyprinter`, then we will need to
> finish up something like Alfredo's `pretty` patches to rid GHC of its
> fork.
> # Representing rich error messages in GHC
> In my opinion we should avoid baking more stylistic decisions (e.g.
> types in red, terms in blue) into the modules like TcErrors which produce
> error messages. This is why I propose that we use annotated
> pretty-printer documents in #8809 (see comment 3). This would allow us
> to represent the typical things seen in GHC error messages (e.g. types,
> terms, spans, binders, etc.) in structured form, allowing the error
> message consumer (e.g. GHC itself, a GHC API user, or some JSON error
> sink) to make decisions about how to present these elements to the user.
> I think this approach gives us a much better story for dealing with the
> problems currently solved by flags like -fprint-runtime-reps,
> -fprint-explicit-kinds, etc., especially for users using an IDE.
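To illustrate the idea of leaving presentation to the consumer, here is a toy annotated document type with two renderers. This is only a sketch: `Doc`, `Ann`, `renderPlain`, `renderWith`, and `tagged` are invented for the example and are much simpler than the document types in `pretty` or `prettyprinter`.

```haskell
-- A toy annotated document: text, concatenation, and an annotation
-- wrapped around a sub-document.
data Doc ann
  = Text String
  | Cat (Doc ann) (Doc ann)
  | Annot ann (Doc ann)

-- Hypothetical annotation vocabulary.
data Ann = AnnType | AnnTerm

-- A consumer that ignores annotations entirely (e.g. a dumb terminal).
renderPlain :: Doc ann -> String
renderPlain (Text s)    = s
renderPlain (Cat a b)   = renderPlain a ++ renderPlain b
renderPlain (Annot _ d) = renderPlain d

-- A consumer that decides for itself how each annotation is presented.
renderWith :: (ann -> String -> String) -> Doc ann -> String
renderWith _ (Text s)    = s
renderWith f (Cat a b)   = renderWith f a ++ renderWith f b
renderWith f (Annot a d) = f a (renderWith f d)

-- One possible presentation: wrap annotated regions in tags.
tagged :: Ann -> String -> String
tagged AnnType s = "<type>" ++ s ++ "</type>"
tagged AnnTerm s = "<term>" ++ s ++ "</term>"

example :: Doc Ann
example = Cat (Text "Couldn't match ") (Annot AnnType (Text "Int"))
```

The module that builds `example` never mentions colours or markup; `renderPlain example` and `renderWith tagged example` differ only in the consumer's choice.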
> As far as I can recall, there was still a bit of disagreement
> surrounding whether the values carried by the error message should be
> statically or dynamically typed. In particular, Richard Eisenberg
> advocated that error message documents look like,
>     -- A dynamically typed value embedded in an error message
>     data ErrItem = forall a. (Outputable a, Typeable a) => ErrItem a
>     type ErrDoc = Doc ErrItem
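For comparison, this is roughly what consuming a dynamically typed item could look like. It is a sketch under assumptions: the `Outputable` class here is a simplified stand-in for GHC's, and `TypeName` and `VarName` are hypothetical payload types.

```haskell
{-# LANGUAGE ExistentialQuantification #-}
import Data.Typeable (Typeable, cast)

-- Simplified stand-in for GHC's Outputable class (assumption).
class Outputable a where
  ppr :: a -> String

-- Hypothetical payload types.
newtype TypeName = TypeName String
newtype VarName  = VarName String

instance Outputable TypeName where
  ppr (TypeName s) = s

instance Outputable VarName where
  ppr (VarName s) = s

-- A dynamically typed value embedded in an error message.
data ErrItem = forall a. (Outputable a, Typeable a) => ErrItem a

-- The consumer has to guess which types may be inside and try a cast
-- for each one; anything it does not recognise falls back to ppr.
renderItem :: ErrItem -> String
renderItem (ErrItem x)
  | Just (TypeName t) <- cast x = "<type>" ++ t ++ "</type>"
  | otherwise                   = ppr x
```

The set of payload types is open, so new items can be added without touching `ErrItem`, but nothing tells the consumer which casts are worth trying.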
> Whereas I argue that this would quickly become unmaintainable,
> especially when one considers GHC API users. Rather, I say that we
> should encode the "vocabulary" of things that may appear in an error
> message explicitly,
>     data ErrItem = ErrType Type
>                  | ErrSpan Span
>                  | ErrTerm HsExpr
>                  | ErrInstance ClsInst
>                  | ErrVar Var
>                  | ...
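A consumer of the explicit vocabulary might look like the following sketch. The `Type`, `Var`, and `Span` definitions are simplified stand-ins rather than GHC's real types, and only three of the constructors are modelled.

```haskell
-- Simplified stand-ins for the GHC types named above (assumptions).
newtype Type = Type String
newtype Var  = Var String
data Span    = Span Int Int   -- line, column

-- A closed vocabulary of things that may appear in an error message.
data ErrItem
  = ErrType Type
  | ErrSpan Span
  | ErrVar Var

-- With a closed sum type the compiler can warn (-Wincomplete-patterns)
-- when a consumer forgets a case.
describe :: ErrItem -> String
describe (ErrType (Type t))   = "type " ++ t
describe (ErrSpan (Span l c)) = "at " ++ show l ++ ":" ++ show c
describe (ErrVar (Var v))     = "variable " ++ v
```

The trade-off is the mirror image of the dynamic encoding: adding a constructor forces every consumer to be updated, but the full vocabulary is visible in one place.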
> There are good arguments for both options, although I think that
> on balance an explicit approach will be better for consumers. Anyways,
> this is a question that will need to be answered.
> Once there is consensus I think it shouldn't be too difficult to move
> things forward. The change can be made incrementally and for the most
> part should only touch a few modules (with the bulk in TcErrors).
> ## What do we represent?
> There is also the question of what the vocabulary of embeddable items
> should consist of. I think the above are pretty non-controversial but I
> can think of a variety of items which would more precisely capture
> some common patterns,
>     data ErrItem = ...
>                  | ErrExpectedActual Type Type
>                    -- ^ e.g. "Expected type: ty1, Actual type: ty2"
>                  | ErrContext Type
>                    -- ^ Like ErrType but specifically captures a context
>                  | ErrPotentialInstances [ClsInst]
>                    -- ^ A list of potentially matching instances
>                  | ...
> Exactly how far we want to go is something that would need to be
> decided. I think we would want to start with the minimal set initially
> proposed and then introduce additional items as we gain experience with
> the scheme.
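As a sketch of why a more specific item can pay off, here is how a consumer could lay out an expected/actual pair when it arrives as one structured item rather than as pre-formatted text. `Type` is again a simplified stand-in, and `renderItem` is a hypothetical consumer function.

```haskell
-- Simplified stand-in for GHC's Type (assumption).
newtype Type = Type String

data ErrItem
  = ErrType Type
  | ErrExpectedActual Type Type

-- Because both types arrive together, the consumer controls the layout:
-- here it emits two lines with the labels right-aligned.
renderItem :: ErrItem -> [String]
renderItem (ErrType (Type t)) = [t]
renderItem (ErrExpectedActual (Type expected) (Type actual)) =
  [ "Expected type: " ++ expected
  , "  Actual type: " ++ actual
  ]
```

An IDE could equally take the same item and show a side-by-side diff of the two types, which is not possible once they have been flattened into a string.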
> # Using rich error messages
> Once we have GHC producing rich error documents we can teach GHC's
> command line driver to prettify them. We can also teach haskell-mode,
> ghc-mod, and friends to preserve their structure to give the user an
> Idris-like experience.
> Exactly how many stylistic decisions we want GHC to make is a tricky
> question; this is prime territory for bike-shedding and people tend to
> have rather strong aesthetic beliefs; keeping things simple while
> satisfying all tastes may be a challenge.
> # Summary
> Above I discussed several tasks and a few questions,
> * We need to decide on whether David's `prettyprinter` library is right
> for GHC; having a prototype patch introducing it to the tree would
> help in evaluating this. Alfredo, what is your opinion here?
> * If not we need to drop our fork of `pretty` in favor of upstream
> * We need consensus on whether Idris-style annotated pretty-printer
> documents are the right approach for GHC (I think we are close to
> * If we want annotated documents, should the items be statically or
> dynamically typed?
> * Once these questions are resolved we can start introducing
> annotations into GHC's error documents (this shouldn't be hard)
> * Then we can teach GHC and associated tooling to pretty-print these
> rich messages prettily
> There is certainly a fair bit of work here although it's not
> obvious how to parallelize it across all of the interested
> parties. Regardless, I would be happy to advise on any bit of this.
> - Ben