<div dir="ltr">One concern I have is with the claim that "<span style="color:rgb(33,33,33)">most compiled programs are type correct". This has emphatically not been my experience while developing Haskell. Often, I edit and recompile to find the next type error to fix, or the new type of the hole; this is especially easy (and automatic) in emacs with flycheck.</span><div><span style="color:rgb(33,33,33)"><br></span></div><div><span style="color:rgb(33,33,33)">It's also the case that compiling correct programs is (or at least appears to me to be) much slower than compiling incorrect ones (with GHC 8.0.1, at least). I worry that this proposal might slow down type errors significantly. Of course, someone with more (basically any) knowledge of GHC internals can correct me and allay my fear.</span></div></div><br><div class="gmail_quote"><div dir="ltr">On Tue, Nov 29, 2016 at 8:20 AM Joachim Breitner <<a href="mailto:mail@joachim-breitner.de">mail@joachim-breitner.de</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi,<br class="gmail_msg">
<br class="gmail_msg">
the following is a rough idea that I came up with while pondering<br class="gmail_msg">
#12864, which is about better (and fewer!) error messages when the user<br class="gmail_msg">
forgot an argument to a function, or swapped their order.<br class="gmail_msg">
<br class="gmail_msg">
<br class="gmail_msg">
The current scheme of type checking and error reporting is the<br class="gmail_msg">
following (please correct me if I am wrong, I have actually done little<br class="gmail_msg">
in this area so far):<br class="gmail_msg">
<br class="gmail_msg">
The typechecker traverses the syntax tree from top to bottom. Along<br class="gmail_msg">
the way, it keeps track of the current context in terms of SDoc<br class="gmail_msg">
fragments of the form “In the second argument of…”. It generates<br class="gmail_msg">
constraints (e.g. the argument type of a function is the type of the<br class="gmail_msg">
actual argument), which it either solves, or propagate outwards as<br class="gmail_msg">
“wanted constraints”, which refer to their context, and refer to a<br class="gmail_msg">
“hole” in the program where, if this constraint gets solved later,<br class="gmail_msg">
some form of evidence gets filled in.<br class="gmail_msg">
At the end, if there are any wanted constraints left, they are<br class="gmail_msg">
reported as type errors, together with the context they are<br class="gmail_msg">
mentioned.<br class="gmail_msg">
But with -fdefer-type-errors, they are not reported, but the<br class="gmail_msg">
evidence holes are filled in with essentially calls to `error` that<br class="gmail_msg">
print the error message at runtime.<br class="gmail_msg">
<br class="gmail_msg">
The problem I see with this approach is that the type checker has to<br class="gmail_msg">
remember interesting bits about context that might possibly eventually<br class="gmail_msg">
be relevant when reporting the error. In particular, it makes it harder<br class="gmail_msg">
to report multiple related error messages at once.<br class="gmail_msg">
<br class="gmail_msg">
<br class="gmail_msg">
So I am wondering if this approach would be better:<br class="gmail_msg">
<br class="gmail_msg">
The type checker does *not* bother keeping track of context: It<br class="gmail_msg">
traverses the syntax tree, creates constraints, and unsolvable<br class="gmail_msg">
constraints get filled with “error evidence”. Basically as if<br class="gmail_msg">
-fdefer-type-errors is one.<br class="gmail_msg">
<br class="gmail_msg">
Then a second pass is done over the syntax tree. This pass does keep<br class="gmail_msg">
track of the context. Whenever it finds some error evidence, it<br class="gmail_msg">
reports it.<br class="gmail_msg">
<br class="gmail_msg">
<br class="gmail_msg">
I see the following advantages:<br class="gmail_msg">
<br class="gmail_msg">
* The type checker code gets a bit simpler and tidier.<br class="gmail_msg">
* As most compiled programs are type correct [citation needed], the <br class="gmail_msg">
type checker, but not the error reporting, is compiler-performance-<br class="gmail_msg">
critical. This might make type checking a (possibly insignificant)<br class="gmail_msg">
tad faster.<br class="gmail_msg">
* The error reporting pass can “look around” more. For example, before<br class="gmail_msg">
recursing into a pair `(e1,e2)`, it could check for the special case<br class="gmail_msg">
of `(e1 ▷ TyError Int Bool, e1 ▷ TyError Bool Int)` and report this<br class="gmail_msg">
as one error “Tuple with swapped arguments” instead of two errors.<br class="gmail_msg">
The #12864 is a more elaborate application of this.<br class="gmail_msg">
* There could even be an intermediate pass over the syntax tree<br class="gmail_msg">
between typechecking and error reporting that “transform” or<br class="gmail_msg">
“optimizes” the AST by moving type errors around, by coalescing<br class="gmail_msg">
error or other things, all with the aim of giving easier to<br class="gmail_msg">
understand error messages.<br class="gmail_msg">
* (This is getting into the area of wild dreams:)<br class="gmail_msg">
Library authors could somehow express custom error messages<br class="gmail_msg">
for certain patterns of the form<br class="gmail_msg">
If the AST now contains an application of `Library.foo` where<br class="gmail_msg">
the first argument is a type error `TyError Int Bool`, then<br class="gmail_msg">
give this particular domain-specific error message.<br class="gmail_msg">
<br class="gmail_msg">
I don’t see any disadvantages now, but there probably are some, and I<br class="gmail_msg">
would be grateful about feedback from those who actually have worked<br class="gmail_msg">
with this part of GHC before.<br class="gmail_msg">
<br class="gmail_msg">
Thanks,<br class="gmail_msg">
Joachim<br class="gmail_msg">
<br class="gmail_msg">
--<br class="gmail_msg">
Joachim “nomeata” Breitner<br class="gmail_msg">
<a href="mailto:mail@joachim-breitner.de" class="gmail_msg" target="_blank">mail@joachim-breitner.de</a> • <a href="https://www.joachim-breitner.de/" rel="noreferrer" class="gmail_msg" target="_blank">https://www.joachim-breitner.de/</a><br class="gmail_msg">
XMPP: <a href="mailto:nomeata@joachim-breitner.de" class="gmail_msg" target="_blank">nomeata@joachim-breitner.de</a> • OpenPGP-Key: 0xF0FBF51F<br class="gmail_msg">
Debian Developer: <a href="mailto:nomeata@debian.org" class="gmail_msg" target="_blank">nomeata@debian.org</a>_______________________________________________<br class="gmail_msg">
ghc-devs mailing list<br class="gmail_msg">
<a href="mailto:ghc-devs@haskell.org" class="gmail_msg" target="_blank">ghc-devs@haskell.org</a><br class="gmail_msg">
<a href="http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs" rel="noreferrer" class="gmail_msg" target="_blank">http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs</a><br class="gmail_msg">
</blockquote></div>