Interested to help with error messages

Sat Jun 3 15:56:58 UTC 2017

Thanks Ben, a great summary.  Is there a Wiki page for this? It feels like
it should be on one, so we can easily comment/update the individual points.

As for the pretty-printer and its string type, perhaps we could
backpackify it to use http://next.hackage.haskell.org:8080/package/str-sig,
and then specialise the GHC version to FastString etc.

Alan

On 3 June 2017 at 17:50, Ben Gamari <ben at smart-cactus.org> wrote:

>
> CCing,
>    * Alfredo Di Napoli for his on-going work in this area
>    * Shivansh Rai for his interest in contributing
>    * David Luposchainsky for his recent pretty-printer library
>    * Richard Eisenberg due to his participation in #8809
>    * Bartosz for his participation in #10735
>    * Alan Zimmerman for his interest in Haskell tooling
>
> My apologies for the tome that follows. I have been thinking about this
> problem recently and think an overview of where we stand would be helpful.
>
>
> Siddharth Bhat <siddu.druid at gmail.com> writes:
>
> > Hello all,
> >
> > Thanks for all the work that's put into GHC :)
> >
> Thanks for your interest in helping!
>
> > I've tried to get into GHC development before, but I was unsuccessful,
> > mostly because I didn't dedicate enough time to understanding the problem
> > at hand & exploring the codebase.
> >
> > I'd like to give it another shot. This time, I think I have a clear
> > vision of what I want to help with: Have Haskell's error messages be
> > easier to read and understand.
> >
> > 1. Colors and layout to highlight important parts of the error messages
>
> As I say below, I think #8809 will provide a good foundation for
> improvements here. More on this below.
>
> > 2. Clear formatting & naming of errors, so they're easily googleable,
> > stack-overflow able, etc.
>
> Indeed this is a great goal. Do you have a list of error messages that
> you think are particularly egregious in this respect? Are you advocating
> that we give error classes unique identifiers (e.g. as rustc does IIRC)
> or are you merely suggesting that we improve the wording of the existing
> messages?
>
> > 3. better hints with error messages, and perhaps integrated lints(?).
>
> This sounds like a noble goal, but it's a bit unclear how you get there.
> We currently do try to give hints where possible, but of course we could
> always offer more. It would be helpful to have a set of concrete
> examples to discuss.
>
> > 4. I don't know if this is already possible, but allowing GHC errors to
> > be shipped off as JSON or something to interested tooling.
> >
> Indeed, this would be great. Thanks to Matthew Pickering we already
> offer some limited form of this in 8.2 [1], but I think having more
> structured error documents as suggested in #8809 would make this even
> nicer.
>
> [1] https://downloads.haskell.org/~ghc/master/users-guide//debugging.html?highlight=json#ghc-flag--ddump-json
>
>
>
> The State of #8809
> ==================
>
> > I saw this ticket on trac: https://ghc.haskell.org/trac/ghc/ticket/8809
> > I would like to take this up, but I'd like help / pointers and stuff. I
> > have GHC setup, I know how to use phabricator, but.. where do I start? :)
> >
> This ticket has recently seen quite a bit of activity and I've been
> meaning to write down some thoughts on it. Here it goes:
>
> Currently Alfredo Di Napoli is working [2] on the `pretty` library to
> both improve performance and allow us to drop GHC's fork (see #10735),
> perhaps to use annotated pretty-printer documents. Meanwhile, David
> Luposchainsky has recently released [3] his `prettyprinter` library
> which may serve as a drop-in replacement for `pretty` and handles all of
> the cases that Alfredo is working on. Moreover, Shivansh Rai has also
> recently expressed interest in helping out with this effort.
>
> All of this is great news: I have been hoping we'd get Idris-style
> errors for quite some time. However, given how many hands we have in
> this area, we should be careful not to step on each other's toes. Below I'll
> describe the various facets of the task as I see them.
>
> [2] https://github.com/haskell/pretty/pull/43
> [3] https://www.reddit.com/r/haskell/comments/6e62i5/ann_prettyprinter_10_ending_the_wadlerleijen_zoo/
>
>
> # Choice of pretty printer
>
> It seems like we first need to resolve the question of whether switching
> from (our fork of) `pretty` to the `prettyprinter` library is
> worthwhile. The argument for moving to `prettyprinter` is its support
> for optimized infinite-band-width layout, which is one of the things
> holding us back from moving back to `pretty`.
>
> However, there are two impediments to switching,
>
>  * `prettyprinter` depends upon the `text` package while GHC does not.
>    Making GHC dependent on `text` is an option, but we should be
>    careful. Adding a dependency has a non-trivial cost (GHC build times
>    rise, GHC API users are stuck using whatever dependency versions GHC
>    uses, release engineering is a bit more complicated).
>
>    Currently GHC has its own abstractions for working with text
>    efficiently, LitString and FastString. FastString is used throughout
>    the compiler, including the pretty-printer, and represents a
>    dense UTF-8 buffer (and a hash for quick comparison). It's not clear
>    that we would want to move it to `text` as this would introduce
>    UTF-8/UTF-16 conversion.
>
>  * `prettyprinter` doesn't support rendering to `String`. This
>    essentially means that we either use `Text` or fork the package.
>    However, if we have already decided on depending on `text`, then
>    perhaps the former isn't so bad.
>
> It's unclear to me exactly how difficult switching would be compared to
> finishing up the work Alfredo has started on `pretty`. Alfredo, what is
> your opinion?
>
> If we decide against moving to `prettyprinter`, then we will need to
> finish up something like Alfredo's `pretty` patches to rid GHC of its
> fork.
>
>
> # Representing rich error messages in GHC
>
> In my opinion we should avoid baking more stylistic decisions (e.g.
> printing types in red, terms in blue) into the modules like TcErrors
> which produce error messages. This is why I propose that we use annotated
> pretty-printer documents in #8809 (see comment 3). This would allow us
> to represent the typical things seen in GHC error messages (e.g. types,
> terms, spans, binders, etc.) in structured form, allowing the error
> message consumer (e.g. GHC itself, a GHC API user, or some JSON error
> sink) to make decisions about how to present these elements to the user.
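
To make the annotated-document idea above concrete, here is a minimal
sketch using only `base`. The `Doc` type, the `ErrAnn` annotations and
both renderers are hypothetical stand-ins invented for illustration; they
are not the API of `pretty`, `prettyprinter`, or GHC. The point is only
that the producer tags sub-documents and each consumer decides how to
present them.

```haskell
-- A toy annotated document type (hypothetical; not the real library API).
data Doc ann
  = Text String              -- literal text
  | Doc ann :<>: Doc ann     -- horizontal concatenation
  | Annotated ann (Doc ann)  -- a sub-document tagged with an annotation

infixr 6 :<>:

-- Hypothetical annotation vocabulary; a real one would carry GHC values.
data ErrAnn = AnnType | AnnTerm
  deriving (Eq, Show)

-- Consumer 1: a plain sink that drops every annotation.
renderPlain :: Doc ann -> String
renderPlain (Text s)        = s
renderPlain (a :<>: b)      = renderPlain a ++ renderPlain b
renderPlain (Annotated _ d) = renderPlain d

-- Consumer 2: a terminal that picks its own colours via ANSI escapes.
-- Nested annotations are not handled properly (the reset code ends all
-- styling); this is acceptable for a sketch.
renderAnsi :: Doc ErrAnn -> String
renderAnsi (Text s)        = s
renderAnsi (a :<>: b)      = renderAnsi a ++ renderAnsi b
renderAnsi (Annotated a d) = colour a ++ renderAnsi d ++ "\ESC[0m"
  where
    colour AnnType = "\ESC[31m"  -- red
    colour AnnTerm = "\ESC[34m"  -- blue

-- The error producer only tags; it makes no stylistic decision.
sample :: Doc ErrAnn
sample = Text "Couldn't match type " :<>: Annotated AnnType (Text "Int")
    :<>: Text " in " :<>: Annotated AnnTerm (Text "x + 1")
```

In GHCi, `putStrLn (renderAnsi sample)` shows the coloured form, while
`renderPlain sample` gives the same message with no styling at all.
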
>
> I think this approach gives us a much better story for dealing with the
> problems currently solved by flags like -fprint-runtime-reps,
> -fprint-explicit-kinds, etc., especially for users using an IDE.
>
> As far as I can recall, there was still a bit of disagreement
> surrounding whether the values carried by the error message should be
> statically or dynamically typed. In particular, Richard Eisenberg
> advocated that error message documents look like,
>
>     -- A dynamically typed value embedded in an error message
>     data ErrItem = forall a. (Outputable a, Typeable a) => ErrItem a
>
>     type ErrDoc = Doc ErrItem
>
> Whereas I argue that this would quickly become unmaintainable,
> especially when one considers GHC API users. Rather, I say that we
> should encode the "vocabulary" of things that may appear in an error
> message explicitly,
>
>     data ErrItem = ErrType Type
>                  | ErrSpan Span
>                  | ErrTerm HsExpr
>                  | ErrInstance ClsInst
>                  | ErrVar  Var
>                  | ...
>
> While there are good arguments for both options, I think that on
> balance an explicit approach will be better for consumers. Anyways,
> this is a question that will need to be answered.
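
The trade-off between the two encodings can be seen from the consumer's
side in a small sketch. Everything here is an assumption made for
illustration: `Type` and `Var` are string-wrapping stand-ins for GHC's
real types, `Show` stands in for `Outputable`, and `DynItem` is a renamed
copy of the dynamically typed `ErrItem` so that both variants fit in one
module.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

import Data.Typeable (Typeable, cast)

-- Hypothetical stand-ins for GHC's Type and Var.
newtype Type = Type String deriving (Eq, Show)
newtype Var  = Var  String deriving (Eq, Show)

-- Dynamically typed: any showable, typeable value may be embedded.
data DynItem = forall a. (Show a, Typeable a) => DynItem a

-- Statically typed: a closed, explicit vocabulary.
data ErrItem
  = ErrType Type
  | ErrVar  Var
  deriving (Eq, Show)

-- A consumer of dynamic items has to guess which types might occur and
-- needs a fallback for everything it did not anticipate.
describeDyn :: DynItem -> String
describeDyn (DynItem x)
  | Just (Type t) <- cast x = "type " ++ t
  | Just (Var v)  <- cast x = "variable " ++ v
  | otherwise               = "unknown: " ++ show x

-- A consumer of static items is total by construction; with -Wall the
-- compiler warns when a newly added constructor is left unhandled.
describeStatic :: ErrItem -> String
describeStatic (ErrType (Type t)) = "type " ++ t
describeStatic (ErrVar  (Var v))  = "variable " ++ v
```

The dynamic form lets producers embed new kinds of value without touching
a central type, at the price of consumers never knowing the full set; the
static form reverses that trade.
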
>
> Once there is consensus I think it shouldn't be too difficult to move
> things forward. The change can be made incrementally and for the most
> part should only touch a few modules (with the bulk in TcErrors).
>
>
> ## What do we represent?
>
> There is also the question of what the vocabulary of embeddable items
> should consist of. I think the above are pretty non-controversial but I
> can think of a variety of items which would more precisely capture
> some common patterns,
>
>     data ErrItem = ...
>                  | ErrExpectedActual Type Type
>                    -- ^ e.g. "Expected type: ty1, Actual type: ty2"
>                  | ErrContext Type
>                    -- ^ Like ErrType but specifically captures a context
>                  | ErrPotentialInstances [ClsInst]
>                    -- ^ A list of potentially matching instances
>                  | ...
>
> Exactly how far we want to go is something that would need to be
> decided. I think we would want to start with the minimal set initially
> proposed and then introduce additional items as we gain experience with
> the scheme.
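
As a rough illustration of why a richer item such as `ErrExpectedActual`
might pay off, the sketch below feeds the same item to a text renderer
and to a hand-rolled JSON sink. The `Type` stand-in, the function names
and the JSON field names are all hypothetical; in particular this is not
the output format of `-ddump-json`, and the string escaping covers only
quotes, backslashes and newlines.

```haskell
import Data.List (intercalate)

-- Hypothetical stand-in for GHC's Type.
newtype Type = Type String

-- A cut-down vocabulary with one plain and one richer item.
data ErrItem
  = ErrType Type
  | ErrExpectedActual Type Type

-- Minimal JSON string escaping (a real encoder must cover all control
-- characters as well).
jsonString :: String -> String
jsonString s = "\"" ++ concatMap esc s ++ "\""
  where
    esc '"'  = "\\\""
    esc '\\' = "\\\\"
    esc '\n' = "\\n"
    esc c    = [c]

-- Build a JSON object from keys and already-encoded values.
jsonObject :: [(String, String)] -> String
jsonObject fields =
  "{" ++ intercalate "," [jsonString k ++ ":" ++ v | (k, v) <- fields] ++ "}"

-- Tooling-facing consumer: keeps the expected/actual structure intact.
itemToJson :: ErrItem -> String
itemToJson (ErrType (Type t)) =
  jsonObject [("kind", jsonString "type"), ("type", jsonString t)]
itemToJson (ErrExpectedActual (Type e) (Type a)) =
  jsonObject [ ("kind", jsonString "expected-actual")
             , ("expected", jsonString e)
             , ("actual", jsonString a) ]

-- Human-facing consumer: flattens the same item to text.
itemToText :: ErrItem -> String
itemToText (ErrType (Type t)) = t
itemToText (ErrExpectedActual (Type e) (Type a)) =
  "Expected type: " ++ e ++ "\n  Actual type: " ++ a
```

With only `ErrType` available, an IDE would have to recover the
expected/actual pairing by parsing the surrounding text; with the richer
item it arrives already structured.
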
>
>
> # Using rich error messages
>
> Once we have GHC producing rich error documents we can teach GHC's
> command line driver to prettify them. We can also teach haskell-mode,
> ghc-mod, and friends to preserve their structure to give the user an
> Idris-like experience.
>
> Exactly how many stylistic decisions we want GHC to make is a tricky
> question; this is prime territory for bike-shedding and people tend to
> have rather strong aesthetic beliefs; keeping things simple while
> satisfying all tastes may be a challenge.
>
>
> # Summary
>
> Above I discussed several tasks and a few questions,
>
>  * We need to decide on whether David's `prettyprinter` library is right
>    for GHC; having a prototype patch introducing it to the tree would
>    help in evaluating this. Alfredo, what is your opinion here?
>
>  * If not, we need to drop our fork of `pretty` in favor of upstream
>
>  * We need consensus on whether Idris-style annotated pretty-printer
>    documents are the right approach for GHC (I think we are close to
>    this)
>
>  * If we want annotated documents, should the items be statically or
>    dynamically typed?
>
>  * Once these questions are resolved we can start introducing
>    annotations into GHC's error documents (this shouldn't be hard)
>
>  * Then we can teach GHC and associated tooling to render these
>    rich messages prettily
>
> There is certainly a fair bit of work here although it's not
> obvious how to parallelize it across all of the interested
> parties. Regardless, I would be happy to advise on any bit of this.
>
> Cheers,
>
> - Ben
>
>
>