Danger, Will Robinson!

Danger, Will Robinson!

Rate This
  • Comments 47

As long-time readers of this blog know, I am often asked why a particular hunk of bad-smelling code does not produce a compiler warning.

"Why not?" questions are inherently hard to answer because they turn causation on its head; normally we ask what caused a particular thing to happen, not what caused a particular thing to not happen. Therefore, rather than attack that question directly I like to rephrase the question into questions about the proposed feature. (A warning is of course a feature like any other.) By answering these questions we can see whether the compiler team is likely to spend its limited budget on the feature rather than prioritizing something else. Some questions I think about when considering "why did you not implement this warning?" are:

Did someone think of it?

If no one on the compiler team ever thinks of the possibility of a particular warning then obviously it won't happen. As an example of that, I was recently asked why:

using(null) statement;

does not produce a warning. That code is legal; it is equivalent to

IDisposable temp = null;
try
{
  statement;
}
finally
{
  if (temp != null) temp.Dispose();
}

Of course the code is completely pointless; all it does is introduce a try-finally block that adds nothing useful to the program. Why don't we give a warning for this useless code? Well, the primary reason is that (to my knowledge) no one on the compiler team ever thought that a customer would do such a thing, and therefore, never thought to design, implement, test and document a compiler feature to deal with the situation.

Now that we have thought of it, we can see if it meets any of our other criteria, like, is it plausible that a user would type this accidentally thinking that it does something sensible? But I'm getting ahead of myself.

Is the proposed warning even possible? Is it really expensive?

Sometimes people propose warnings that would require solving the Halting Problem (which is unsolvable in general) or solving an NP-hard problem (which is in practice unsolvable in a reasonable amount of time). Proposed warnings like "warn me if there is no possible way for this overload to be chosen" or "warn me if there is an input which guarantees that this recursive method has unbounded recursion" or "warn me if this argument can ever be null in this program" require a level of analysis that is either in principle impossible, or possible but better performed by tools custom made for that purpose, like the Code Contracts engine. Warning scenarios should be cheap for the compiler to disambiguate from non-warning scenarios.

Is the code being warned about both plausible and likely to be wrong? Is the potentially "wrong" code actually sensible in some scenario?

There are an infinite number of ways to write completely crazy code; the point of warnings is not to identify all possible crazy code, but rather to identify code that is plausible but likely to be wrong. We don't want to waste time and effort implementing features warning about ridiculous scenarios that in practice never arise.

Sometimes code seems to be both plausible and wrong, but is intentional for a reason that is not immediately obvious. The best example of this is our suppression of warnings about write-only locals; a local might be read by the developer during debugging even if it is never read in the program.

Machine-generated code is also often crazy-seeming but correct and intentional. It can be hard enough to write code generators that generate legal code; writing code generators that change their output when they determine that they're generating code that could produce warnings is burdensome. (For example, consider the machine-generated code that you get when you create a new project in Visual Studio. It had better compile without warnings!) However, we are much more concerned with catching errors in code written by humans than we are about making machines happy.

Obviously we only want to warn about code that has a high likelihood of actually being wrong. If we warn about correct code then we are encouraging users to change correct code, probably by changing it into incorrect code. A warning should be a bug-preventing feature, not a bug-encouraging feature. This leads me to my next point:

Is there a clear way to rewrite correct code that gets an unwanted warning into correct code that does not get a warning?

Warnings ideally ought to be easily turn-off-able. For example:

bool x = N();
...
if (x == null ) Q();

That gives a warning that == null on a non-nullable value type is always false. Clearly you can rewrite the code to eliminate the warning; if you intended it to always be false then get rid of the whole statement, or you can turn the variable into a nullable bool, or whatever.

A warning where there is no way to write the code so that the warning disappears is very irritating to people who actually do mean the seemingly-wrong thing. You can always turn a warning off with a #pragma of course, but that's a horribly ugly thing to force someone to do.

Will a new warning turn large amounts of existing code into errors?

Lots of people compile with "warnings as errors" turned on. Every time we add a warning that causes a lot of existing code to produce warnings, we potentially break a lot of people. Even if the warnings are good ones, breaking a lot of people is points against doing the feature. Such warnings are perhaps best handled in another tool, which brings me to:

Does the compiler team have to do this work? Or can another tool produce the warning?

Microsoft provides tools like FxCop, StyleCop and Code Contracts to do far more extensive analysis of code than the C# compiler performs. Similarly with third-party tools like ReSharper. If a warning can be implemented by one of those tools just as easily or more easily than it can be implemented in the compiler then that's less work for me, and that's awesome. Also, since FxCop examines compiled state, a warning in FxCop can catch problems in C# and VB, without having to change either compiler.

Will the user discover the error immediately upon testing anyway?

The earlier a bug is discovered, the cheaper it is to fix. The opportunities to find bugs are, in order of exponentially increasing cost of fixing the bug:

* When you type in the code
* When you compile the code
* When you test the code
* When your customer runs the code

Warnings about potential bugs that pop up while you are typing the code or compiling the code are bugs that never make it to the customer. However, a potential warning that points out a bug that would always be caught by testing is less compelling than a warning for a bug that could make it through testing to the customer.

For example, I was recently asked why the compiler does not provide a warning for this common typing error:

private decimal cost;
public decimal Cost { get { return this.Cost; } }

Whoops, that's a stack overflow exception (or, on machines with tail recursion optimizations, an infinite loop) waiting to happen right there. The compiler can in principle determine cheaply that property getter does nothing other than call itself, so why doesn't the compiler warn about it? We could do all the work of identifying code that has this pattern, but why bother warning about a bug that will be caught anyway? The instant you test this code you will immediately discover the problem and it will be clear how to fix it. This is unlike our earlier example of "if (x == null)", where the fact that there is unreachable code might go completely unnoticed during testing, particularly if the common scenario is for x to be false.

Summing up:

Since that is quite the gauntlet for a feature to run, we find that usually the right thing to do for a proposed warning is "don't add the warning". The most likely situation for lots of new warnings to be added is when the warnings are surrounding a new feature that has new ways of being misused; we added quite a few warnings in C# 4.0 for misuses of "dynamic", "no PIA", and "named and optional arguments". Warnings for new features have no existing code using them that would break.

That said, we can always improve; if you have ideas for warnings do keep them coming.

  • "Developers who are so unprofessional as to deploy code without running it also ignore compiler warnings. -- Eric"

    The warning could be very useful if the organization compiles with "warnings as errors" turned on. So even though the developer is lazy and deserves it, the organization is still spared the bug.

  • Two points:

    1. You don't have to break existing code when you introduce new warnings if you change your warning level to include some indication of the compiler version. E.G. Treat warnings as errors: 'Those introduced in VS2010 and earlier' etc

    2. The recursive property/stack blowing mistake can be expensive to find - especially if the only symptom you have is some WCF service going down. Hours. Yet it is so cheap for you to spot. I have never understood why this does not meet the cost/benefit tradeoff.

  • "(For example, consider the machine-generated code that you get when you create a new project in Visual Studio. It had better compile without warnings!)"

    Then why do the templates compile with warnings if I have "XML documentation file" turned on?

  • @pete.d

    The difference is that C++ is at a whole different level of complexity than C# is. There is also the fact that we C++ devs can use the code is compiling as an excuse to have a nice long coffee break and the boss won't know any different.

  • @Ted: "Add as much static code analysis as possible"

    100% Correct...The falacy is that the compiler is the appropriate place.

    @Yitzhak Steinmetz: "The warning could be very useful if the organization compiles with "warnings as errors" turned on. So even though the developer is lazy and deserves it, the organization is still spared the bug."

    Once again, it is not necessary to do this in the compiler. Simply use Gated-Checkins, and make sure all of the desired analysis is performed. If anything does not pass, it can notget into the Source Repository.

  • "especially if the only symptom you have is some WCF service going down."

    I don't understand this comment. I have found WCF's trace information to be very useful in tracking down all sorts of weird problems. A stack overflow exception, which would result in the exception and stack trace being emitted in the log (either the WCF trace output itself, or at the very least the event log), should be very easy to track down.

    In any case, the fact is that debugging services is difficult generally. Logging helps quite a lot, but this is yet another example of "it's not the compiler's job". We need better tools for debugging services, not ad hoc features glommed onto the compiler.

    "The difference is that C++ is at a whole different level of complexity than C# is. There is also the fact that we C++ devs can use the code is compiling as an excuse to have a nice long coffee break and the boss won't know any different."

    I can't tell if you're being facetious or not. C++ certainly is _not_ "a whole different level of complexity" from C# and obviously your second point is no way to design compilers, so it must be tongue-in-cheek. Do you _really_ believe there's some excuse for C++ code to take so much longer to compile than C# code?

  • "no really, as far as the compiler is concerned, it is."

    No, really, it's not.

    C# has to deal with generics, which the C++ compiler does not, assuming it's not compiling managed code. And templates are really just glorified preprocessing, no more complex than a macro.

    But even if templates added to the complexity in some significant way, it's not a relevant distinction if I'm not compiling code with templates in it.

    More to the point: regardless of the balance of performance cost of various features in the C++ compiler, the fact remains that compiling C++ code takes WAY longer than comparable C# code (not to mention that I have never crashed the C# compiler, but have done so on a regular basis with the C++ compiler). For warnings that do nothing more than detect something that would be detected the instant I actually run the code, it's a waste of time and code complexity in the compiler.

  • C++ templates are compiled for each type they are instantiated for. .NET generics are compiled once with a placeholder. There is a difference there. The only way to get C++ to take a real long time to compile is by using templates a lot, and if you use the standard C++ library then you are using templates a lot. Link time code generation also can add to the build time too, but again, it is worth it for the speed boost.

    But in cost VS performance, having long running codepaths complete in 290ms rather than 850ms is the kind of thing that C++ developers want. So we don't mind the extra time spent on compiling.

  • >> templates are really just glorified preprocessing, no more complex than a macro.

    This is misleading in general because of things like two-phase lookup (which, granted, VC++ in particular does not implement).

  • "This is misleading in general because of things like two-phase lookup"

    It's true, I am over-simplifying a bit. C++ templates have made significant progress since they originally appeared. But inasmuch as two-phase lookup is mainly a way to provide some validation for a template during declaration rather than instantiation, I don't feel it changes the fundamental nature of templates.

    Fact is, a) I don't find "templates, templates, templates" a valid excuse for why C++ compilation is so much slower than C# compilation (especially in scenarios where templates aren't even being used), and b) regardless of why the C++ compiler is so slow, I don't care for the _C#_ compiler to be bogged down with warnings that detect things that will be found the first time the code is executed.

    I apologize if I've offended all the C++ apologists out there. Let's just assume for the sake of argument that the C++ compiler is blindingly fast considering all the incredibly complex and unique features it has to support. Fine by me. I still don't want the C# compiler being dragged down by warnings that aren't really improving things that much.

  • C++ is slow to compile because most C++ programs just have more code to parse. Odds are, any typical C++ file has tens or hundreds of thousands of lines of code in header files to parse. See stackoverflow.com/.../318440 for many more reasons.

    Still, it's pretty ridiculous to say that too many warnings are somehow bogging down the compiler. That's like saying your car's radio is bogging down your car. If you don't like them, just turn them off! Out of all the things to be "burdened" with, that's one of the most trivial.

    Of course, just as you can put a sound system in your car that uses so much power that it requires auxilliary batteries, you could also create warnings that are difficult to compute. But the whole point of warnings is that they're low-hanging fruit, discovered in the compilation process. For example, the compiler *already* has to make sure that all code paths in a function return a value or throw an exception. Checking to see that at least one of those code paths doesn't result in a call to that function isn't much more work.

  • Doesn't  an empty try..finally block stop Thread.Abort interrupting the current thread?  

    Perhaps, someone adverse to writing try..finally blocks but wants the thread to not be interrupted could use this syntax?? :p

  • I'd like Visual Studio to warn about a simple property getter or setter that's self recursive as soon as I type it.

  • @Matt: First of all, no it doesn't. Second, why would you avoid writing try/finally blocks unless you're purposefully writing bad code?

  • @configurator I was pretty sure that as of CLR 2.0 a call from another thread to Thread.Abort would not interrupt a try finally block from executing, must go find reference document...

    The second part was was me being facetious :p

Page 3 of 4 (47 items) 1234