<div dir="ltr">One concern I have is with the claim that "<span style="color:rgb(33,33,33)">most compiled programs are type correct". This has emphatically not been my experience while developing Haskell. Often, I edit and recompile to find the next type error to fix, or the new type of the hole; this is especially easy (and automatic) in emacs with flycheck.</span><div><span style="color:rgb(33,33,33)"><br></span></div><div><span style="color:rgb(33,33,33)">It's also the case that compiling correct programs is (or at least appears to me to be) much slower than compiling incorrect ones (with GHC 8.0.1, at least). I worry that this proposal might slow down type errors significantly. Of course, someone with more (basically any) knowledge of GHC internals can correct me and allay my fear.</span></div></div><br><div class="gmail_quote"><div dir="ltr">On Tue, Nov 29, 2016 at 8:20 AM Joachim Breitner <<a href="mailto:mail@joachim-breitner.de">mail@joachim-breitner.de</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi,<br class="gmail_msg">

<br class="gmail_msg">

the following is a rough idea that I came up with while pondering<br class="gmail_msg">

#12864, which is about better (and fewer!) error messages when the user<br class="gmail_msg">

forgot an argument to a function, or swapped their order.<br class="gmail_msg">

<br class="gmail_msg">

<br class="gmail_msg">

The current scheme of type checking and error reporting is the<br class="gmail_msg">

following (please correct me if I am wrong, I have actually done little<br class="gmail_msg">

in this area so far):<br class="gmail_msg">

<br class="gmail_msg">

    The typechecker traverses the syntax tree from top to bottom. Along<br class="gmail_msg">

    the way, it keeps track of the current context in terms of SDoc<br class="gmail_msg">

    fragments of the form “In the second argument of…”. It generates<br class="gmail_msg">

    constraints (e.g. the argument type of a function is the type of the<br class="gmail_msg">

    actual argument), which it either solves, or propagate outwards as<br class="gmail_msg">

    “wanted constraints”, which refer to their context, and refer to a<br class="gmail_msg">

    “hole” in the program where, if this constraint gets solved later,<br class="gmail_msg">

    some form of evidence gets filled in.<br class="gmail_msg">

    At the end, if there are any wanted constraints left, they are<br class="gmail_msg">

    reported as type errors, together with the context they are<br class="gmail_msg">

    mentioned.<br class="gmail_msg">

    But with -fdefer-type-errors, they are not reported, but the<br class="gmail_msg">

    evidence holes are filled in with essentially calls to `error` that<br class="gmail_msg">

    print the error message at runtime.<br class="gmail_msg">

<br class="gmail_msg">

The problem I see with this approach is that the type checker has to<br class="gmail_msg">

remember interesting bits about context that might possibly eventually<br class="gmail_msg">

be relevant when reporting the error. In particular, it makes it harder<br class="gmail_msg">

 to report multiple related error messages at once.<br class="gmail_msg">

<br class="gmail_msg">

<br class="gmail_msg">

So I am wondering if this approach would be better:<br class="gmail_msg">

<br class="gmail_msg">

    The type checker does *not* bother keeping track of context: It<br class="gmail_msg">

    traverses the syntax tree, creates constraints, and unsolvable<br class="gmail_msg">

    constraints get filled with “error evidence”. Basically as if<br class="gmail_msg">

    -fdefer-type-errors is one.<br class="gmail_msg">

<br class="gmail_msg">

    Then a second pass is done over the syntax tree. This pass does keep<br class="gmail_msg">

    track of the context. Whenever it finds some error evidence, it<br class="gmail_msg">

    reports it.<br class="gmail_msg">

<br class="gmail_msg">

<br class="gmail_msg">

I see the following advantages:<br class="gmail_msg">

<br class="gmail_msg">

 * The type checker code gets a bit simpler and tidier.<br class="gmail_msg">

 * As most compiled programs are type correct [citation needed], the <br class="gmail_msg">

   type checker, but not the error reporting, is compiler-performance-<br class="gmail_msg">

   critical. This might make type checking a (possibly insignificant)<br class="gmail_msg">

   tad faster.<br class="gmail_msg">

 * The error reporting pass can “look around” more. For example, before<br class="gmail_msg">

   recursing into a pair `(e1,e2)`, it could check for the special case<br class="gmail_msg">

   of `(e1 ▷ TyError Int Bool, e1 ▷ TyError Bool Int)` and report this<br class="gmail_msg">

   as one error “Tuple with swapped arguments” instead of two errors.<br class="gmail_msg">

   The #12864 is a more elaborate application of this.<br class="gmail_msg">

 * There could even be an intermediate pass over the syntax tree<br class="gmail_msg">

   between typechecking and error reporting that “transform” or<br class="gmail_msg">

   “optimizes” the AST by moving type errors around, by coalescing<br class="gmail_msg">

   error or other things, all with the aim of giving easier to<br class="gmail_msg">

   understand error messages.<br class="gmail_msg">

 * (This is getting into the area of wild dreams:)<br class="gmail_msg">

   Library authors could somehow express custom error messages<br class="gmail_msg">

   for certain patterns of the form<br class="gmail_msg">

   If the AST now contains an application of `Library.foo` where<br class="gmail_msg">

   the first argument is a type error `TyError Int Bool`, then<br class="gmail_msg">

   give this particular domain-specific error message.<br class="gmail_msg">

<br class="gmail_msg">

I don’t see any disadvantages now, but there probably are some, and I<br class="gmail_msg">

would be grateful about feedback from those who actually have worked<br class="gmail_msg">

with this part of GHC before.<br class="gmail_msg">

<br class="gmail_msg">

Thanks,<br class="gmail_msg">

Joachim<br class="gmail_msg">

<br class="gmail_msg">

--<br class="gmail_msg">

Joachim “nomeata” Breitner<br class="gmail_msg">

  <a href="mailto:mail@joachim-breitner.de" class="gmail_msg" target="_blank">mail@joachim-breitner.de</a> • <a href="https://www.joachim-breitner.de/" rel="noreferrer" class="gmail_msg" target="_blank">https://www.joachim-breitner.de/</a><br class="gmail_msg">

  XMPP: <a href="mailto:nomeata@joachim-breitner.de" class="gmail_msg" target="_blank">nomeata@joachim-breitner.de</a> • OpenPGP-Key: 0xF0FBF51F<br class="gmail_msg">

  Debian Developer: <a href="mailto:nomeata@debian.org" class="gmail_msg" target="_blank">nomeata@debian.org</a>_______________________________________________<br class="gmail_msg">

ghc-devs mailing list<br class="gmail_msg">

<a href="mailto:ghc-devs@haskell.org" class="gmail_msg" target="_blank">ghc-devs@haskell.org</a><br class="gmail_msg">

<a href="http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs" rel="noreferrer" class="gmail_msg" target="_blank">http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs</a><br class="gmail_msg">

</blockquote></div>