Error messages: diagnostic is preferable to prescriptive

The new LINQ features are going to create new failure modes for the compiler, so we're going to need to create some new error messages. The compiler development team got together the other day to discuss what makes an error message good or bad. I thought I'd share with you guys what we came up with. We believe that good error messages are:

  • Polite: making the user feel like an idiot is very, very bad.
  • Readable: poor grammar and tortured sentence structure are bad.
  • Accurate: error messages must accurately describe the problem.
  • Precise: "Something is wrong" is an accurate error message but not a very precise one!
  • Diagnostic but not prescriptive: describe the problem, not the solution.

The first four are obviously goodness. That last one is a little more controversial. Surely a good error message not only tells you what is wrong but helps you fix it, no?

The issue is that deducing what is wrong with bad code is hard enough. Trying to read the user's mind and figure out what they were thinking when they wrote the bad code, and then telling them how to correctly implement that thought is not something that we feel we can do with sufficiently high accuracy in most situations.

Look at it this way: suppose we pull off a miracle and manage to produce error messages which 90% of the time tell the user the correct way to fix their code so that it does what they want it to do. That means that 10% of the time we are telling people how to write a syntactically correct program that does something different than they intended! Pushing people towards writing buggy programs that still compile is very bad, and we do not want to go there.

It's instructive to look at a few places where we violated these guidelines in earlier versions of the compiler:

Static member 'Baz' cannot be marked as override, virtual or abstract

If the user wrote static virtual, then we don't know what the heck they meant to do. Assuming that they meant to say static and that the virtual is wrong is a little presumptuous. Maybe the static is the wrong part! Also, if the user said static virtual, then why is the error message mentioning override and abstract? That's accurate but not precise. A better error message in this case would be something like

Member 'Baz' cannot be both static and virtual
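One way to picture the difference is to build the message from the modifiers the user actually wrote, rather than from a fixed list of everything that conflicts with static. The following is only a minimal sketch in Python, not how the C# compiler is implemented; `conflict_messages` and `CONFLICTS_WITH_STATIC` are hypothetical names invented for the illustration.

```python
# Hypothetical sketch: name only the modifiers that actually appear in the
# declaration, and do not assume which of them is the mistake.
CONFLICTS_WITH_STATIC = ("override", "virtual", "abstract")


def conflict_messages(member, modifiers):
    """Return one diagnostic per modifier that conflicts with 'static'."""
    if "static" not in modifiers:
        return []
    return [
        f"Member '{member}' cannot be both static and {modifier}"
        for modifier in CONFLICTS_WITH_STATIC
        if modifier in modifiers
    ]


for message in conflict_messages("Baz", ["public", "static", "virtual"]):
    print(message)  # prints: Member 'Baz' cannot be both static and virtual
```

The message names both halves of the conflict and leaves it to the user to decide which one to remove.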

Here's another place where we get it wrong, but this one is more subtle:

A params parameter must be the last parameter in a formal parameter list

This is an example of an English sentence that can be interpreted different ways depending on the context. If I said "a punctuation mark must be the last symbol of a sentence" then I mean that every sentence must end in a punctuation mark, but I do not mean that punctuation marks are only legal at the end of a sentence. If I said "a period must be the last symbol of a statement" then I mean that every statement must end in a period, and furthermore that periods are forbidden anywhere else in the statement.

You and I know that what the error message is trying to say is that if there is a params then it must go at the end. But based solely on this error message, a user would be entirely logically justified in thinking that (int i) is an illegal parameter list because it doesn't end with a params parameter. Or, under another interpretation, they'd also be logically justified in concluding that (params int[] foo, params int[] bar) is legal, because it does end with a params parameter.

The portion of the specification which the error message is attempting to draw attention to is of course "If a formal parameter list includes a parameter array then it must be the last parameter in the list. There can only be one parameter array for a given method" which is nicely unambiguous. Why not simply use this quote from the specification for the error message? That's a reasonable idea, but it sounds a little stiff and doesn't call out where the problem is. I'd prefer:

Method 'Foo' has a parameter array parameter which is not the last parameter in the formal parameter list.

This tells you what is wrong without telling you how to fix it. Since we don't know how to fix it – whether the user should be removing the params modifier, or moving it to the end, or rewriting their method from scratch – we should just report the spec violation and let them sort it out.
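The two halves of the quoted rule (at most one parameter array, and it must come last) can be sketched as a check that reports the violation and stops there. This is an illustrative Python sketch under stated assumptions, not the compiler's actual logic: `Parameter` and `check_parameter_arrays` are hypothetical names, and the wording of the "more than one" message is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass
class Parameter:
    name: str
    is_params: bool = False  # True for a parameter array (hypothetical flag)


def check_parameter_arrays(method, parameters):
    """Report violations of the parameter array rule without suggesting a fix."""
    errors = []
    positions = [i for i, p in enumerate(parameters) if p.is_params]
    if len(positions) > 1:
        # Wording assumed for this sketch.
        errors.append(
            f"Method '{method}' has more than one parameter array parameter."
        )
    if any(i != len(parameters) - 1 for i in positions):
        errors.append(
            f"Method '{method}' has a parameter array parameter which is not "
            "the last parameter in the formal parameter list."
        )
    return errors


# (int i) is fine: the rule only applies when a parameter array is present.
print(check_parameter_arrays("Foo", [Parameter("i")]))  # prints: []
```

Under this check, (params int[] foo, params int[] bar) produces both messages, which is exactly the case the original wording appeared to allow.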

There are times when we do want to tell the user what to do, but only when we are highly likely to be correct. For example:

User-defined operator 'Blah' must be declared static and public.

Here we are both diagnosing the problem and prescribing a solution. If they're trying to make a user-defined operator, this is what they absolutely must do to be successful. It is very unlikely that they wanted to make a private instance function and made a private instance operator by mistake!

This illustrates another principle of good error messages that I didn't call out before: good error messages use precise terminology from the standard rather than making up new jargon. Yes "formal parameter list" and "user-defined operator" are a little bit stiff, but they are also clearly defined in the standard.

Sometimes we get the error right but the wording could be improved:

Foo: static classes cannot be used as constraints

Why are they trying to use a static class as a constraint? Who knows? How should they fix it? Beats me! The best we can do is to tell them that it hurts when they try to do that. But the wording! Good heavens! Would you ever say "Pizza: delicious foods should be eaten while they're fresh!" ??? Clearly

Static class 'Foo' cannot be used as a constraint

is much better.

Anyone have additional suggestions for what makes a good error message? Or other examples of places where we got it wrong?

  • Oh no, I can see it now (with Politeness turned on):
    "file.cs(80) : I'm sorry <username> at line n, column i, you're missing a semicolon.  We regret any convenience this may have caused.".

    You know what would be outstandingly useful is the ability to write an extension to process each error. That way tools can be written to automatically correct errors, or provide value-added messages.  Kind of like an error wiki...
  • I used to hate it when the Pascal compiler would complain about a missing semicolon and love it when the Turbo Pascal compiler would put it in for me when I forgot.

    The other thing is to indicate exactly where the error is occurring.
  • I can see your point for compiler messages which are in a very large context: the developer may literally be doing anything.
    In ASP.NET, on the other hand, there are cases where we have a very precise idea of what's going on. Example: the application is trying to get the user profile and we find no profile provider. In this context, it's perfectly OK to prescribe a solution (define the profile provider in web.config). And that gets back to the politeness directive where you suggest what the solution might be, you don't say "you idiot forgot to configure the freaking profile provider". :)
  • The ever-interesting Eric Lippert has a thoughtful post on how to create good error messages. His particular focus is on the error messages generated by a compiler, but there's very little in there, save perhaps the actual examples, that wouldn't apply
  • Q: "Anyone have additional suggestions for what makes a good error message?"

    A: Humor, but not at the price of the other qualities. For instance, Apple's MPW C had messages such as:

       ...And the lord said, 'lo, there shall only be case or default labels inside a switch statement
       This struct already has a perfectly good definition
       type in (cast) must be scalar; ANSI 3.3.4; page 39, lines 10-11 (I know you don't care, I'm just trying to annoy you)
       Too many errors on one line (make fewer)
       Symbol table full - fatal heap error; please go buy a RAM upgrade from your local Apple dealer
  • Unfortunately, if the user does not share the sense of humour of the developer, it can come across as arrogant, flippant or unprofessional, so we try to not go there.

    Another characteristic of good error messages that I forgot to mention above was "easily translated into foreign languages".  We provide many localized versions of the compiler and humour is hard to localize.

  • Static class 'Foo' cannot be used as a constraint

    is bad. Programmers tend to gloss over adjectives so they will read it as ".... Foo cannot be used as a constraint" with a predictable WTF? reaction.
    A better option is

    class 'Foo' cannot be used as a constraint because it is static

  • Miles: Turbo Pascal put the semicolon in for you?  I've used Borland's Pascal products since TP 3.0, and that's never happened for me.  Maybe you're referring to the fact that semicolons are separators in Pascal, not terminators, which means that "begin foo; bar; baz end;" is legal in Pascal, whereas in C/C++/C# it would be "{ foo(); bar(); baz(); }".

    I strongly recommend putting the maximum effort into detecting when errors are cascading.  For example, if I forget to close a brace, say, then it should cause one error, the first time something appears that shouldn't be there, like an else or a  "...} while(foo);" that should have matched a "do {".  It shouldn't go on to cause hiccups because I appear to be defining methods inside other methods, or my variable scopes are now out of whack, or whatever.

    And yes, I realise the intelligence required for a compiler to figure this out is immense.  It may well start building sentient androids and sending them into the past to assassinate BillG as a boy.  But that's the price you have to pay sometimes...
  • >Start building sentient androids and sending them into the past to assassinate BillG as a boy

    What does Bill Gates have to do with writing error messages?
  • I am strongly against localizing error messages.

    Here’s why. When one sees an error message one doesn’t understand, one can copy-and-paste it into Google or a web discussion forum. If the message is in international English, one is very likely to find a solution or at least an explanation. If it is localized, then the search is only limited to that language. Asking for an explanation of a localized error message in an international forum is just plain wrong.
  • I actually prefer:

    Foo: static classes cannot be used as constraints

    over

    Static class 'Foo' cannot be used as a constraint

    because, for beginning programmers, you're mentioning the rule and not the instance. It may not be clear to some why the static class 'Foo' cannot be used as a constraint. Perhaps it's a combination of things, but in either case the second one isn't unambiguous. It could be either the case where the adjective is the cause or the adjective is simply language used to further differentiate it from other classes. Language is funny like that.
  • Re: localizing error messages:  That's why every error message has a unique error number which can be looked up on MSDN for more information.  

    You have to look at the most common usage case.  The common case is that a non-English-speaking programmer gets an error: if the error is in their language then they stay productive, and if it isn't, then they're stuck.  Googling it in English is hardly going to help them!

    Your remark displays a considerable bias towards English -- I would say that if the forum is international then one should expect discussions in any language.  
  • I think prescriptive messages can be useful. There have certainly been times when I've gotten an error message and had to think really hard in order to fix my problem. There have also been times when I've gotten an error message that said something like "error blah...probably due to a whatever". Sure maybe the suggested solution is wrong but it at least gives you an initial way of attacking the issue.
  • With the connected nature of the IDE, I would find it helpful if you added an ability to visit an MSDN site that would not only provide error details, but also show what people were trying to do... kinda of like a for .NET errors --- people could return to their error tag when they get it fixed and update the resolution.  Of course the same "error" could have different resolutions, and that way the community at large would have opportunity to learn about techniques of other developers.

    Newsgroups are certainly convenient, but they are rather hit or miss, MVP support is a big help, but still not the answer.

    Page example:
    Error Msg: <foo>

    Description: <description>

    People were trying this:
    - <try1>
    - <try2>

    I am trying to do this: <myTry> <ADD BUTTON>
  • The MSDN wiki is an attempt to leverage the power of the community to annotate the documentation. Is that the kind of thing you're talking about?