value of documenting error messages?

Wed Jun 2 19:03:25 UTC 2021

> On Jun 2, 2021, at 2:48 PM, Tom Ellis <tom-lists-haskell-cafe-2017 at jaguarpaw.co.uk> wrote:
> 
> On Wed, Jun 02, 2021 at 06:13:17PM +0000, Richard Eisenberg wrote:
>> I'm in favor of short, undescriptive, quite-possibly numeric error
>> codes.
> 
> These responses are so completely opposite to what I expected that I
> can't help thinking I've made a fundamental error in my understanding
> of what we're trying to achieve!  Since no one has suggested any
> support for the idea of descriptive error codes I'm pressing on mostly
> in the hope that someone will be able to see from where my
> misunderstanding arises and set me straight.

I see no place where our understandings have diverged, just our opinions. But I may be missing something, too, of course! (For the record, I don't see your suggestion as unreasonable; I just think it's inferior to terse non-descriptive identifiers.)

> 
> Before I continue, I'd like to suggest that this is very much a
> user-facing issue and I would be strongly in favour of actually asking
> users about what they prefer (and allowing them to discuss for a
> while) rather than taking a straw poll amongst GHC developers.
> 
> To that end, would it be inappropriate of me to link this discussion
> to Haskell Reddit and/or Haskell Discourse?

Not this discussion, but I think a discussion there is a good idea. This thread started as a question about documenting constructors in the GHC source code, and it has (rightly!) moved to be about documenting error messages more generally. I (myopically) had not connected these two, and I'm glad for the direction this has taken. But I think the user-facing discussion should have a different starting point than this thread.

I don't think I currently have the bandwidth for that discussion, but if no one else starts it, I will before too much longer.

> 
> I don't understand at all why it's valuable to sequentialize.  Is the
> relative ordering of error codes meaningful in some way?

No. Sequentialization is good because it allows for the production of a new, unique member of the class, with a minimum of storage requirements (that is, you just store the greatest such member, and you know the next one up is unique).
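
To make that concrete, here is a minimal sketch of the idea, not anything GHC actually implements. The names ErrorCode, Allocator, freshCode and render, and the "E" prefix, are all made up for illustration. The only state kept is the greatest code issued so far.

```haskell
-- Hypothetical types for illustration only; not part of GHC.

-- | A terse, non-descriptive error code: just a number.
newtype ErrorCode = ErrorCode Int
  deriving (Eq, Ord, Show)

-- | The only state the allocator stores: the greatest code issued so far.
newtype Allocator = Allocator Int
  deriving (Eq, Show)

-- | Issue the next code. It is unique because it is strictly greater
-- than every code issued before it.
freshCode :: Allocator -> (ErrorCode, Allocator)
freshCode (Allocator greatest) =
  let next = greatest + 1
  in (ErrorCode next, Allocator next)

-- | Render a code for display, e.g. "E160" (the "E" prefix is an assumption).
render :: ErrorCode -> String
render (ErrorCode n) = "E" ++ show n
```

Nothing here depends on the ordering being meaningful; the ordering is only a cheap way to get uniqueness without keeping a list of every code ever issued.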

> 
> I don't see why deprecating an error code and reintroducing it is a
> problem any more than doing the same to a function or GHC extension.

It is definitely worse than a function, because functions are associated with a particular version. If GHC 9 and GHC 13 have a function of the same name but different meanings, I don't see how that causes trouble.

Extensions and error codes, on the other hand, are more troublesome, because they get documented and discussed widely online, and web pages live forever. And it is currently sometimes problematic when extensions change meaning over time, leading to conversations I've seen about adding version numbers to extensions. I don't think we've had an extension disappear and then reappear, because removing an extension is very, very hard. Error messages, on the other hand, will be much more fluid.

> 
> 
>> Easy to make compositional. We can choose to have all GHC error
>> codes begin with G (for GHC). Then Cabal could use C, Haddock could
>> use H, and Stack could use S. This makes it easy for users to tell
>> (once they've learned the scheme) where an error has come from.
> 
> Surely the same holds for descriptive error codes.  One could have
> G_conflicting_trait_implementations, H_malformatted_section_header,
> ...

Yes, I thought you might say that. But now these are mixed, with an opaque component and a more descriptive one. Better would be ghc_conflicting_trait_implementations, but that's even longer!
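
For comparison, the terse prefixed scheme from the quoted text could be sketched roughly as below. This is only an illustration under assumptions: the Tool and Code types and the functions toolPrefix, renderCode and parseCode are hypothetical names, and only the G/C/H/S letters come from the proposal above.

```haskell
import Data.Char (isDigit)

-- Hypothetical types for illustration only; not an existing GHC or Cabal API.

-- | The tools named in the proposal above.
data Tool = GHC | Cabal | Haddock | Stack
  deriving (Eq, Show, Enum, Bounded)

-- | One-letter prefix per tool, as suggested in the quoted text.
toolPrefix :: Tool -> Char
toolPrefix GHC     = 'G'
toolPrefix Cabal   = 'C'
toolPrefix Haddock = 'H'
toolPrefix Stack   = 'S'

-- | A code is a tool plus a number; the whole identifier is uniformly opaque.
data Code = Code Tool Int
  deriving (Eq, Show)

-- | Render a code, e.g. Code GHC 159 becomes "G159".
renderCode :: Code -> String
renderCode (Code tool n) = toolPrefix tool : show n

-- | Parse a code: one known prefix letter followed by one or more digits.
parseCode :: String -> Maybe Code
parseCode (p : digits)
  | not (null digits), all isDigit digits = do
      tool <- lookup p [(toolPrefix t, t) | t <- [minBound .. maxBound]]
      pure (Code tool (read digits))
parseCode _ = Nothing
```

With this shape, the originating tool can be read off the first character, and nothing in the identifier is half-descriptive.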

> 
>> Short.
> 
> Again I must be misunderstanding.  Why is brevity valuable?  Aren't we
> expecting users to read these things and look them up?  Copy/paste is
> free.

Short things are easier to format? Yes, I agree that brevity is harder to motivate here. Yet I also think that, say, pasting the entire error message text would be wrong, too. So why do we want abbreviations at all? I think: it's to be sure we're looking up what we intend to look up, something served nicely by guaranteed uniqueness.

> 
>> No chance for misspellings during transcription. When I'm copying a
>> terse identifier, I know I have to get every glyph correct. If I
>> remember that the error code is "bad_generalizing", I might not know
>> how "generalizing" is spelled. I might also forget whether it's
>> "generalizing" or "generalization". And I could very easily see
>> myself making either or both of these mistakes as I'm switching from
>> one window to another, in under a second.
> 
> Surely it's just as easy to mistype E159 as E195 as it is to misspell
> "generalise".  As above, copy/paste is free and if we *really* want to
> be helpful then instead of naked error codes we should give URLs which
> directly link to sections in the GHC users guide (or other appropriate
> resource).

I'd be very happy with URLs.

> 
>> Disadvantages:
> 
>> The code does not impart semantic meaning. But I argue this is not
>> so bad, as even a more descriptive code does not impart a precise
>> enough semantic meaning to be helpful.
> 
> I challenge you to name your next GHC extension X25!

I *am* waiting for the day when I can figure out what -XKCD does.

>
> Possibly ... possibly not.
> 
> "Hey Anna, what should I do about E159?"
> 
> "Hey Anna, what should I do about conflicting_trait_implementations?"
> 
> Which would I prefer to shout to my colleague across the room?

It depends on the colleague. There's a chance she knows about E159, and then the first works fine. There's a chance she doesn't know that conflicting_trait_implementations is an error code, and then she goes on a long lecture about conflicting trait implementations (but not about your error); then the second one fails.

> 
> 
> To me this seems like a rare opportunity to do something where people
> will say "Hey look, that formidable Haskell compiler is doing
> something that's friendlier than the equivalent in any other
> compiler!".  For such an important user-facing feature I don't
> understand why we're not asking users what they prefer.

I agree completely here! Let's ask! (Remember that this thread, posted to ghc-devs, was originally about documenting the GHC source code, something that would not affect users.)

Richard