Should C# warn on null dereference?

Should C# warn on null dereference?

Rate This
  • Comments 35

As you probably know, the C# compiler does flow analysis on constants for the purposes of finding unreachable code. In this method the statement with the calls is known to be unreachable, and the compiler warns about it.

const object x = null;
void Foo()
{
  if (x != null)
  {
    Console.WriteLine(x.GetHashCode());
  }
}

Now suppose we removed the "if" statement and just had the call.

const object x = null;
void Foo()
{
  Console.WriteLine(x.GetHashCode());
}

The compiler does not warn you that you're dereferencing null! The question is, as usual, why not?

The answer is, as usual, that we do not have to provide a justification for not doing a feature. Features cost money, time and effort, and take away money, time and effort from features that would benefit the user better, so features have to be justified based on a cost-vs-benefits analysis. So let's think about that.

Something I find helpful when thinking about a particular feature is to ask "is this a specific case of a more general feature?" The proposed feature is essentially to detect that a particular exception is always going to be thrown. Do we want to in general be able to detect cases where an exception will always be thrown and warn about them? Well, doing so with certainty is equivalent to solving the Halting Problem, as we've already discussed. But even without that, we do not want to give a warning every time we know that an exception must be thrown; that would then make this program fragment produce a warning:

int M()
{
  throw new NotImplementedException();
}

The whole point of throwing that exception in the first place is to make it clear that this part of the program doesn't work; giving a warning saying that it doesn't work is counterproductive; warnings should tell you things you don't already know.

So let's just think about the feature of detecting when a constant null is dereferenced. How often does that happen, anyway? I occasionally make null constants; perhaps because I want to be able to say things like "if (symbol == InvalidSymbol)" where InvalidSymbol is the constant null; it makes the code read more like English. And maybe someone could accidentally say "InvalidSymbol.Name" and the compiler could warn them that they are dereferencing null.

We've been here before; in fact, I made a list. So let's go down it.

The feature is reasonable; the code seems plausible, it is clearly wrong, and we could detect that without too much expense. However, the scope of this warning is very small; the vast majority of the time I've used null constants, I've used them for comparison, not for accessing members off of them. And the problem will be detected when we test the code, every time.

Could we perhaps then generalize the feature in a different way? Perhaps instead of constants we should detect any time that the compiler can reasonably know that any dereference is probably a null dereference. Solving the problem perfectly is, again, equivalent to solving the Halting Problem, which we know is impossible, but we can use some clever heuristics to do a good job.

In fact the C# compiler already has some of those heuristics; it uses them in its nullable arithmetic optimizer. If we can statically know that a nullable expression is always null or never null then we can do a better job of codegen. (And how we do so would be a great future blog topic.) However, the existing heuristics are extremely "local"; they do not, for example, notice that a local variable was assigned null and then later used before it was reassigned. So again, the scope of this warning would be very small, probably too small.

To do a good job of the proposed more general feature, we'd want to modify the existing flow analyzer that determines if a local variable is definitely assigned before it is used, with one that also can tell you whether the value assigned was non-null before a dereference. That's a much more expensive feature; the benefits are higher, but so are the costs.

What it really comes down to me for this feature is that last item on my list. Yes, it is always better to catch a bug at compile time, but that said, null dereference bugs that the compiler could have told you about with certainty are the easiest ones to catch at runtime because they always happen the moment you test the code. So that's some points against the feature.

So basically the feature request here is to write a somewhat expensive detector that detects at compile time a somewhat unlikely condition that will always be immediately found the first time the code is run anyways. It is therefore not a great candidate for spending budget on to implement it; thus the feature has never been implemented. It's a lovely feature and I'd be happy to have it in the compiler, but it's just not big enough bang for the design, development and testing buck, and we have many higher priorities.

Now, you might note that tools like the Code Contracts feature that shipped with version 4, or Resharper, or other similar tools, all have various abilities to statically detect possible or guaranteed null dereferences. Which proves that it is possible to do a good job of this feature, and that's good to know. But that is also points against doing the warning in the compiler. As I point out on my list, if an existing analysis tool does a good job, then why replicate that work in the compiler? The compiler does not aim to be the be-all and end-all of code analysis; in fact, we are building Roslyn precisely to make it easier for third parties to develop such analysis tools. We can't possibly do every great code analysis feature, but we can make it easier for you to do so!

 

  • Oh how I wish Code Contracts was actually included with .NET 4, but alas, only the declarative method and attribute stubs are included, which by themselves are useless (or worse than useless, they'll throw exceptions, sometimes descriptive and sometimes an ExecutionEngineException). I hope that the compilers included in vNext (.NET 5.0?) will include Code Contracts IL weaving functionality, or perhaps even a general IL weaving extensibility point for both Code Contracts and PostSharp & Friends (a problem Roslyn doesn't, nor purports, to solve). Well, at least until better metaprogramming support is added to the language which would make those kind tools obsolete.

    And I wholeheartedly agree that the compiler should not strive to be the ultimate static code analyzer. It should emit warnings about things it has already figured during the code analysis it does as part of its main job - compiling! If it's not directly related to the job at hand, it shouldn't go out of its way to perform extra analysis. Sure, it is already intimately familiar with all the language's intricacies, but it's doesn't mean it has the means nor the mandate to  analyze that added complexity, make the 'fuzzy' decisions about whether it's 'correct' or not, and effectively convey it to the user (whilst taking into account his preferred warning verbosity level and coding style).

    Static analysis is much better off with a dedicated tool such as StyleCop/FxCop, ReSharper, CC, etc. If they can all share a single 'language layer' in the form of Roslyn, all the better - those who write these tools will be more productive and will have more time left  to write better static code analyzers!

    (Is this a double post? If so, it's the blog system's fault.)

  • This reminds me of an answer you gave me on Stackoverflow : stackoverflow.com/.../1796 Any change on that?

  • I think that in the future - especially with Roslyn - the C# Compiler could tie in closer with Code Analysis frameworks and offer extension points. For example, NotImplementedException shouldn't cause warnings in Debug Builds, but I'd love them to cause Warnings or even Errors in Release Builds. Obviously, having stuff like that configurable bloats the core compiler unnecessary and that specific case can be solved by existing Code Analysis tools, but I wonder what people could come up with if they build a framework around the compiler rather than having to treat it as a black box that either completely succeeds or completely fails.

  • >The answer is, as usual, that we do not have to provide a justification for not doing a feature. Features cost money, time and effort, and take away money

    I use Visual Studio daily in my job and occasionally in my hobbies, but the number of features in Eclipse a free IDE but not in an IDE that costs $1,200 per license is astounding to me.  Features cost money, but lack of features and hesitation to provide functionality has an opportunity cost as well.

  • @Colton E.

    Visual Studio and the C# Compiler are two entirely separate product.  One simply happens to use the other.

    Visual Stuido (other than the express version, which is free) does cost money, but C#, and it's associated compilers are entirely free.  You're complaining to the wrong people if you don't like some aspect of the IDE that happens to utilize a language developed (in part) by the writer of this blog.

  • As Allon Said:

    JetBrains' Resharper tool will identify every place in your code that has the possibility of a null reference error, allowing you to put in a null check.

  • I agree, that kind of warning is not going to be very useful (in the grander scheme of things). I'd much rather have a type system that can express non-null references. Or alternatively an option type (à la F#, Haskell, Scala) to force the clients of my code to handle the case where the value is missing.

    In the meantime, I use ReSharper's `Null` and and `NotNull` annotations to help catch illegal uses of APIs at compile-time.

  • @Allon Guralnek: Glad I'm not the only one whose first thought on reading this was Code Contracts. :)  Those tools are invaluable for my work, but they're oh-so-slow and limited.  A full-fledged CC (maybe built on Roslyn?) is something that I would really pay good money for, it's that important to me.

  • @SealedSun.ch: I believe I saw a MS Research paper about possibility of including non-nullable references in C#. It dates back to the C# 1.0 days (even before the generics!), and I guess the fact that it's still not implemented says something.

  • Would this make it even harder to write a recursive anonymous method? I've always found it slightly irritating that this:

       Func<int, int> fib = i => i < 2 ? 1 : fib(i - 1) + fib(i - 2);

    seems to fall foul of the definite assignment rules. I'm assuming the argument for why it fails is that fib is not assigned until this statement has finished executing, so you're not allowed to use fib inside the statement, and the code in the delegate is counted as being "inside the statement" even though in this particular case it cannot be run until after the statement completes. (I know there are scenarios where you could create and invoke a lambda in the statements, so I'm guessing that's why this example is not allowed.)

    My usual workaround looks like this:

       Func<int, int> fib = null;

       fib = i => i < 2 ? 1 : fib(i - 1) + fib(i - 2);

    But the definitely-not-null analysis you're describing seems like it would also reject this. (R# already complains about this, but at least it's not a warning.) So that seems like another point against.

    Or is there a better way to deal with this problem? Am I using the wrong workaround here?

  • What you should ideally do is change the language so that T var cannot be null, and you need to say T? var for it to be nullable.

    Unfortunately, that ship has sailed, although maybe you can introduce T! var for a non-nullable reference.

  • Resharper is a bloatware. I can't use it without a crash. Please integrate these kinds of goodies as much as possible!

  • It appears my comment got eaten.

    I use Resharper too and it is great, but there are a few places it gets this wrong and warns about a constant null expression which is in fact not always null. I hate having to supress warnings on things that aren't really wrong (or fiddle with the code for no reason other than making the message go away), so I would not want to see a warning for this.

  • @Bob: Resharper isn't bloatware at all - it's pretty quick and, in my experience, stable.

    But let's humour you, and continue with your request.

    If the compiler should warn about null dereferencing, should it also warn about 'throw ex;' statements too? What about the code namespace not matching the file location? Closure over a modified variable? Unused 'using' statements at the top of a code file? I could go on.

    If all this is added to the compiler, then compiler performance suffers drastically, and the language stagnates, as there's no time to develop new features any more.

    The compiler has one job, and one job alone: take source code and produce object code. It has no business enforcing styles or conventions - that's the purpose of tools like Resharper.

  • @IanG: If you want to write recursive anonymous methods, you should definitely check out blogs.msdn.com/.../anonymous-recursion-in-c.aspx – you might be a bit surprised the situation is not that simple. :-)

Page 1 of 3 (35 items) 123