Separating typechecking and type error reporting in two passes?
Joachim Breitner
mail at joachim-breitner.de
Tue Nov 29 16:52:31 UTC 2016
Hi,
I guess the claim is still true: Think about just the amount of code
you compile when you install your dependencies.
But you are right that when the programmer sits there and waits for a
result, that’s when snappyness is important. I don’t expect this change
to be human-noticable in terms of performance, and the prospect of
getting more useful error messages (which are easier to grasp and fix)
will certainly outweigh that.
Anyways, performance aspects are just a side-effect of this.
Greetings,
Joachim
Am Dienstag, den 29.11.2016, 16:48 +0000 schrieb Alex Rozenshteyn:
> One concern I have is with the claim that "most compiled programs are
> type correct". This has emphatically not been my experience while
> developing Haskell. Often, I edit and recompile to find the next type
> error to fix, or the new type of the hole; this is especially easy
> (and automatic) in emacs with flycheck.
>
> It's also the case that compiling correct programs is (or at least
> appears to me to be) much slower than compiling incorrect ones (with
> GHC 8.0.1, at least). I worry that this proposal might slow down type
> errors significantly. Of course, someone with more (basically any)
> knowledge of GHC internals can correct me and allay my fear.
>
> On Tue, Nov 29, 2016 at 8:20 AM Joachim Breitner <mail at joachim-breitn
> er.de> wrote:
> > Hi,
> >
> > the following is a rough idea that I came up with while pondering
> > #12864, which is about better (and fewer!) error messages when the
> > user
> > forgot an argument to a function, or swapped their order.
> >
> >
> > The current scheme of type checking and error reporting is the
> > following (please correct me if I am wrong, I have actually done
> > little
> > in this area so far):
> >
> > The typechecker traverses the syntax tree from top to bottom.
> > Along
> > the way, it keeps track of the current context in terms of SDoc
> > fragments of the form “In the second argument of…”. It
> > generates
> > constraints (e.g. the argument type of a function is the type
> > of the
> > actual argument), which it either solves, or propagate outwards
> > as
> > “wanted constraints”, which refer to their context, and refer
> > to a
> > “hole” in the program where, if this constraint gets solved
> > later,
> > some form of evidence gets filled in.
> > At the end, if there are any wanted constraints left, they are
> > reported as type errors, together with the context they are
> > mentioned.
> > But with -fdefer-type-errors, they are not reported, but the
> > evidence holes are filled in with essentially calls to `error`
> > that
> > print the error message at runtime.
> >
> > The problem I see with this approach is that the type checker has
> > to
> > remember interesting bits about context that might possibly
> > eventually
> > be relevant when reporting the error. In particular, it makes it
> > harder
> > to report multiple related error messages at once.
> >
> >
> > So I am wondering if this approach would be better:
> >
> > The type checker does *not* bother keeping track of context: It
> > traverses the syntax tree, creates constraints, and unsolvable
> > constraints get filled with “error evidence”. Basically as if
> > -fdefer-type-errors is one.
> >
> > Then a second pass is done over the syntax tree. This pass does
> > keep
> > track of the context. Whenever it finds some error evidence, it
> > reports it.
> >
> >
> > I see the following advantages:
> >
> > * The type checker code gets a bit simpler and tidier.
> > * As most compiled programs are type correct [citation needed],
> > the
> > type checker, but not the error reporting, is compiler-
> > performance-
> > critical. This might make type checking a (possibly
> > insignificant)
> > tad faster.
> > * The error reporting pass can “look around” more. For example,
> > before
> > recursing into a pair `(e1,e2)`, it could check for the special
> > case
> > of `(e1 ▷ TyError Int Bool, e1 ▷ TyError Bool Int)` and report
> > this
> > as one error “Tuple with swapped arguments” instead of two
> > errors.
> > The #12864 is a more elaborate application of this.
> > * There could even be an intermediate pass over the syntax tree
> > between typechecking and error reporting that “transform” or
> > “optimizes” the AST by moving type errors around, by coalescing
> > error or other things, all with the aim of giving easier to
> > understand error messages.
> > * (This is getting into the area of wild dreams:)
> > Library authors could somehow express custom error messages
> > for certain patterns of the form
> > If the AST now contains an application of `Library.foo` where
> > the first argument is a type error `TyError Int Bool`, then
> > give this particular domain-specific error message.
> >
> > I don’t see any disadvantages now, but there probably are some, and
> > I
> > would be grateful about feedback from those who actually have
> > worked
> > with this part of GHC before.
> >
> > Thanks,
> > Joachim
> >
> > --
> > Joachim “nomeata” Breitner
> > mail at joachim-breitner.de • https://www.joachim-breitner.de/
> > XMPP: nomeata at joachim-breitner.de • OpenPGP-Key: 0xF0FBF51F
> > Debian Developer: nomeata at debian.org_____________________________
> > __________________
> > ghc-devs mailing list
> > ghc-devs at haskell.org
> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
> >
>
> _______________________________________________
> ghc-devs mailing list
> ghc-devs at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
--
Joachim Breitner
mail at joachim-breitner.de
http://www.joachim-breitner.de/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: This is a digitally signed message part
URL: <http://mail.haskell.org/pipermail/ghc-devs/attachments/20161129/8a904646/attachment.sig>
More information about the ghc-devs
mailing list