Separating typechecking and type error reporting in two passes?

Tue Nov 29 16:52:31 UTC 2016

Hi,

I guess the claim is still true: Think about just the amount of code
you compile when you install your dependencies.

But you are right that when the programmer sits there and waits for a
result, that’s when snappyness is important. I don’t expect this change
to be human-noticable in terms of performance, and the prospect of
getting more useful error messages (which are easier to grasp and fix)
will certainly outweigh that.

Anyways, performance aspects are just a side-effect of this.

Greetings,
Joachim

Am Dienstag, den 29.11.2016, 16:48 +0000 schrieb Alex Rozenshteyn:
> One concern I have is with the claim that "most compiled programs are
> type correct". This has emphatically not been my experience while
> developing Haskell. Often, I edit and recompile to find the next type
> error to fix, or the new type of the hole; this is especially easy
> (and automatic) in emacs with flycheck.
> 
> It's also the case that compiling correct programs is (or at least
> appears to me to be) much slower than compiling incorrect ones (with
> GHC 8.0.1, at least). I worry that this proposal might slow down type
> errors significantly. Of course, someone with more (basically any)
> knowledge of GHC internals can correct me and allay my fear.
> 
> On Tue, Nov 29, 2016 at 8:20 AM Joachim Breitner <mail at joachim-breitn
> er.de> wrote:
> > Hi,
> > 
> > the following is a rough idea that I came up with while pondering
> > #12864, which is about better (and fewer!) error messages when the
> > user
> > forgot an argument to a function, or swapped their order.
> > 
> > 
> > The current scheme of type checking and error reporting is the
> > following (please correct me if I am wrong, I have actually done
> > little
> > in this area so far):
> > 
> >     The typechecker traverses the syntax tree from top to bottom.
> > Along
> >     the way, it keeps track of the current context in terms of SDoc
> >     fragments of the form “In the second argument of…”. It
> > generates
> >     constraints (e.g. the argument type of a function is the type
> > of the
> >     actual argument), which it either solves, or propagate outwards
> > as
> >     “wanted constraints”, which refer to their context, and refer
> > to a
> >     “hole” in the program where, if this constraint gets solved
> > later,
> >     some form of evidence gets filled in.
> >     At the end, if there are any wanted constraints left, they are
> >     reported as type errors, together with the context they are
> >     mentioned.
> >     But with -fdefer-type-errors, they are not reported, but the
> >     evidence holes are filled in with essentially calls to `error`
> > that
> >     print the error message at runtime.
> > 
> > The problem I see with this approach is that the type checker has
> > to
> > remember interesting bits about context that might possibly
> > eventually
> > be relevant when reporting the error. In particular, it makes it
> > harder
> >  to report multiple related error messages at once.
> > 
> > 
> > So I am wondering if this approach would be better:
> > 
> >     The type checker does *not* bother keeping track of context: It
> >     traverses the syntax tree, creates constraints, and unsolvable
> >     constraints get filled with “error evidence”. Basically as if
> >     -fdefer-type-errors is one.
> > 
> >     Then a second pass is done over the syntax tree. This pass does
> > keep
> >     track of the context. Whenever it finds some error evidence, it
> >     reports it.
> > 
> > 
> > I see the following advantages:
> > 
> >  * The type checker code gets a bit simpler and tidier.
> >  * As most compiled programs are type correct [citation needed],
> > the 
> >    type checker, but not the error reporting, is compiler-
> > performance-
> >    critical. This might make type checking a (possibly
> > insignificant)
> >    tad faster.
> >  * The error reporting pass can “look around” more. For example,
> > before
> >    recursing into a pair `(e1,e2)`, it could check for the special
> > case
> >    of `(e1 ▷ TyError Int Bool, e1 ▷ TyError Bool Int)` and report
> > this
> >    as one error “Tuple with swapped arguments” instead of two
> > errors.
> >    The #12864 is a more elaborate application of this.
> >  * There could even be an intermediate pass over the syntax tree
> >    between typechecking and error reporting that “transform” or
> >    “optimizes” the AST by moving type errors around, by coalescing
> >    error or other things, all with the aim of giving easier to
> >    understand error messages.
> >  * (This is getting into the area of wild dreams:)
> >    Library authors could somehow express custom error messages
> >    for certain patterns of the form
> >    If the AST now contains an application of `Library.foo` where
> >    the first argument is a type error `TyError Int Bool`, then
> >    give this particular domain-specific error message.
> > 
> > I don’t see any disadvantages now, but there probably are some, and
> > I
> > would be grateful about feedback from those who actually have
> > worked
> > with this part of GHC before.
> > 
> > Thanks,
> > Joachim
> > 
> > --
> > Joachim “nomeata” Breitner
> >   mail at joachim-breitner.de • https://www.joachim-breitner.de/
> >   XMPP: nomeata at joachim-breitner.de • OpenPGP-Key: 0xF0FBF51F
> >   Debian Developer: nomeata at debian.org_____________________________
> > __________________
> > ghc-devs mailing list
> > ghc-devs at haskell.org
> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
> > 
> 
> _______________________________________________
> ghc-devs mailing list
> ghc-devs at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
-- 
Joachim Breitner
  mail at joachim-breitner.de
  http://www.joachim-breitner.de/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: This is a digitally signed message part
URL: <http://mail.haskell.org/pipermail/ghc-devs/attachments/20161129/8a904646/attachment.sig>