Should C# warn on null dereference?

As you probably know, the C# compiler does flow analysis on constants for the purposes of finding unreachable code. In this method the statement with the calls is known to be unreachable, and the compiler warns about it.

const object x = null;
void Foo()
{
  if (x != null)
  {
    Console.WriteLine(x.GetHashCode());
  }
}

Now suppose we removed the "if" statement and just had the call.

const object x = null;
void Foo()
{
  Console.WriteLine(x.GetHashCode());
}

The compiler does not warn you that you're dereferencing null! The question is, as usual, why not?

The answer is, as usual, that we do not have to provide a justification for not doing a feature. Features cost money, time and effort, and take away money, time and effort from features that would benefit the user more, so features have to be justified based on a cost-vs-benefits analysis. So let's think about that.

Something I find helpful when thinking about a particular feature is to ask "is this a specific case of a more general feature?" The proposed feature is essentially to detect that a particular exception is always going to be thrown. Do we want to in general be able to detect cases where an exception will always be thrown and warn about them? Well, doing so with certainty is equivalent to solving the Halting Problem, as we've already discussed. But even without that, we do not want to give a warning every time we know that an exception must be thrown; that would then make this program fragment produce a warning:

int M()
{
  throw new NotImplementedException();
}

The whole point of throwing that exception in the first place is to make it clear that this part of the program doesn't work; giving a warning saying that it doesn't work is counterproductive; warnings should tell you things you don't already know.

So let's just think about the feature of detecting when a constant null is dereferenced. How often does that happen, anyway? I occasionally make null constants; perhaps because I want to be able to say things like "if (symbol == InvalidSymbol)" where InvalidSymbol is the constant null; it makes the code read more like English. And maybe someone could accidentally say "InvalidSymbol.Name" and the compiler could warn them that they are dereferencing null.
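
To make the scenario concrete, here is a minimal sketch of that pattern. The Symbol type, its Name field and the SymbolDemo class are hypothetical names invented for illustration; only InvalidSymbol and the comparison come from the discussion above.

```csharp
using System;

// Hypothetical type; the name and its single field are assumptions for illustration.
public sealed class Symbol
{
    public readonly string Name;
    public Symbol(string name) { Name = name; }
}

public static class SymbolDemo
{
    // A constant of a reference type other than string can only be null.
    public const Symbol InvalidSymbol = null;

    public static string Describe(Symbol symbol)
    {
        // The common, harmless use: comparing against the null constant.
        if (symbol == InvalidSymbol)
            return "invalid";
        return symbol.Name;
    }

    public static void Main()
    {
        Console.WriteLine(Describe(new Symbol("x")));
        Console.WriteLine(Describe(InvalidSymbol));

        // The accidental use a warning would target; it would throw
        // NullReferenceException every time this line runs:
        // Console.WriteLine(InvalidSymbol.Name);
    }
}
```

The commented-out line is the only place a warning would help, and it fails on the first run regardless.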

We've been here before; in fact, I made a list. So let's go down it.

The feature is reasonable; the code seems plausible, it is clearly wrong, and we could detect that without too much expense. However, the scope of this warning is very small; the vast majority of the time I've used null constants, I've used them for comparison, not for accessing members off of them. And the problem will be detected when we test the code, every time.

Could we perhaps then generalize the feature in a different way? Perhaps instead of constants we should detect any time that the compiler can reasonably know that any dereference is probably a null dereference. Solving the problem perfectly is, again, equivalent to solving the Halting Problem, which we know is impossible, but we can use some clever heuristics to do a good job.

In fact the C# compiler already has some of those heuristics; it uses them in its nullable arithmetic optimizer. If we can statically know that a nullable expression is always null or never null then we can do a better job of codegen. (And how we do so would be a great future blog topic.) However, the existing heuristics are extremely "local"; they do not, for example, notice that a local variable was assigned null and then later used before it was reassigned. So again, the scope of this warning would be very small, probably too small.
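
As a sketch of the case those local heuristics miss (the class and method names below are assumptions for illustration): a local is assigned null and then dereferenced before being reassigned. Under the analysis described here the compiler accepts this without a warning, and the failure shows up the first time the method runs. Later compilers with nullable reference analysis switched on may well report it; this sketch assumes that analysis is off.

```csharp
using System;

public static class LocalNullDemo
{
    // Returns true if dereferencing the null local throws at run time.
    public static bool DereferenceThrows()
    {
        object o = null;  // assigned null ...
        try
        {
            // ... and used before being reassigned. A purely local
            // heuristic does not connect these two statements.
            Console.WriteLine(o.GetHashCode());
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(DereferenceThrows());
    }
}
```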

To do a good job of the proposed more general feature, we'd want to extend the existing flow analyzer, which determines whether a local variable is definitely assigned before it is used, so that it can also tell whether the value assigned was non-null before a dereference. That's a much more expensive feature; the benefits are higher, but so are the costs.

What it really comes down to for me with this feature is the last item on my list. Yes, it is always better to catch a bug at compile time, but that said, null dereference bugs that the compiler could have told you about with certainty are the easiest ones to catch at runtime, because they always happen the moment you test the code. So that's some points against the feature.

So basically the feature request here is to write a somewhat expensive detector that detects at compile time a somewhat unlikely condition that will always be immediately found the first time the code is run anyway. It is therefore not a great candidate for spending budget on; thus the feature has never been implemented. It's a lovely feature and I'd be happy to have it in the compiler, but it's just not big enough bang for the design, development and testing buck, and we have many higher priorities.

Now, you might note that tools like the Code Contracts feature that shipped with version 4, or ReSharper, or other similar tools, all have various abilities to statically detect possible or guaranteed null dereferences. Which proves that it is possible to do a good job of this feature, and that's good to know. But that is also points against doing the warning in the compiler. As I point out on my list, if an existing analysis tool does a good job, then why replicate that work in the compiler? The compiler does not aim to be the be-all and end-all of code analysis; in fact, we are building Roslyn precisely to make it easier for third parties to develop such analysis tools. We can't possibly do every great code analysis feature, but we can make it easier for you to do so!

  • Okay, for all folks who want to leave a "You must put non-nullable references in C# 5.0!"-like comment, here is the article I was referring to:

    Manuel Fähndrich, K. Rustan M. Leino. Declaring and Checking Non-null Types in an Object-Oriented Language. research.microsoft.com/.../non-null.pdf

    Paragraph 7, "Implementation" states that this goal can be achieved with a CIL-level checking tool, and I guess someone must have already written such a plug-in for VS.

    Sure, NULLs are more ruinous than beneficial, but Hoare had already invented them, and most industrial-strength languages have embraced them. Well, make your own Z# language, without ugly C-like syntax, without nulls and other stupid things, or cope with what you have.

  • Joker, it is another subject, but we can always try to fix errors from the past, no?

    qconlondon.com/.../Null+References:+The+Billion+Dollar+Mistake

  • @Petr Kadlec - I probably shouldn't have said anything about recursion. It was just a short way of talking about referring to the delegate from within the delegate. The situation in which I have most often hit this is in fact when writing an event handler that unsubscribes itself. The Y combinator (which I was already familiar with, but thanks for the link) is definitely overcomplicating things.

    So, what I should have said is I'd like to be able to write this:

    EventHandler h = (s,e) =>
    {
     src.SomeEvent -= h;  // Not allowed, boo hiss!
     ... do stuff
    };
    src.SomeEvent += h;

    Well, I say 'like'. Really, I'd prefer to use something more suitable like a Task but when events are all that's offered, this is where you can end up.
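
One commonly used workaround for the self-unsubscribing handler in the comment above is to declare the delegate variable and assign it null first, so that it is definitely assigned by the time the lambda body refers to it. This is only a sketch: the Source type, its Raise method and the UnsubscribeDemo class are hypothetical stand-ins for whatever object actually exposes SomeEvent.

```csharp
using System;

// Hypothetical event source; type and member names are assumptions for illustration.
public sealed class Source
{
    public event EventHandler SomeEvent;

    public void Raise()
    {
        EventHandler handler = SomeEvent;
        if (handler != null)
            handler(this, EventArgs.Empty);
    }
}

public static class UnsubscribeDemo
{
    // Raises the event twice and returns how many times the handler ran.
    public static int Run()
    {
        Source src = new Source();
        int calls = 0;

        // Declaring h before the lambda makes it definitely assigned inside the body.
        EventHandler h = null;
        h = (s, e) =>
        {
            src.SomeEvent -= h;  // allowed now
            calls++;
        };
        src.SomeEvent += h;

        src.Raise();
        src.Raise();  // no subscribers remain, so the handler does not run again
        return calls;
    }

    public static void Main()
    {
        Console.WriteLine(Run());  // prints 1
    }
}
```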

  • I'm afraid that nulls are actually beneficial enough that were they not to exist, they would be invented. After all, we already had non-nullable types in C# just long enough for the designers to realize that they had to make nullable versions of them.

    Before nullable value types were added to C# there were ways to make values nullable, but they were all ad-hoc: databases had DbNull, XML had *Specified, and others used default(T). With all the ad-hoc methods, there was no way for the compiler to give warnings or have lifted operations and the runtime couldn't raise exceptions when you accessed a roll-your-own null.

    As for nullable references, I don't see how a language that needs to interoperate with the outside world in a natural manner can do without them. If you designed a language with non-nullable references (like F#), you'd still have to have the concept of null because you could get one from (or have to pass one to) a module written in VB.

  • @Gabe: The main reason for creating Nullable<T> was better DBMS support (yeah, LINQ). And there are well-renowned relational theory experts who believe that introducing NULL into the relational data model was a grave mistake. Besides, relational theory had been doing pretty well without NULL for 10 years (1969-1979), and the relational DBMSs built during that period didn't have any NULL concept. The main motivation for introducing NULL was solving the Missing Information Problem, and it failed to solve this problem in a satisfactory way: serious new problems appeared, and the MIP itself wasn't quite resolved.

    And since there are relational DBMSs that have no NULL support, I can connect my C# program to them without using Nullable<T> and all this "x ?? x : default(X)" stuff. Isn't it great, y/n?

    Back on topic: many programmers expect that null dereferencing is checked at run-time, not compile-time, anyway. And if you want to reason about the nullability of your variables, there are already plenty of tools for it. Still, I would like to see non-nullable references, but since the CLR is full of null-returning methods... such references won't be as useful as they could be.

  • I think the real question is whether we should make the compiler so intuitive and complex that it takes away from the other, simpler features that contribute to effective programming. At this point we may not need it, but going forward the C# compiler could check for null references to reduce the "obvious" steps in the code (similar to closing connections or a finally clause after try/catch).

  • Also on the topic of nullable references... People that have played with Spec# for instance know them. Non-nullable references, with a "grace period" for initialization, are a great thing to have. Oh, talking of Spec# and Code Contracts, people interested in those may also like the thesis recently published on that topic:

    e-collection.library.ethz.ch/.../eth-5609-01.pdf

  • @RaceProUK It is my opinion that "throw ex;" falls under the category of "accidentally doing the wrong thing". I can imagine no scenario where it is desirable to strip the originating stack trace from an exception. It is almost certainly the wrong thing to do and very easy for the uninformed to do it.

    The problem of course, is that if you make it an error or a warning, there has to be some syntax to allow you to do it, just in case there really is a valid reason out there. Maybe "throw explicit ex" (in an attempt to not create a new keyword)?

  • The compiler does not need to warn on dereferencing nulls. But a warning on all uppercase menu captions would be nice.

  • Is it possible to at least display which object in the expression is null, as part of the exception message, when a null reference exception is thrown?

  • @Douglas: 'throw ex;' shouldn't throw a warning - it'll never fail at runtime. Besides, there are situations where you do want to do this, for instance in a library where you don't want to expose the internals to a third party. This makes it a question of style, not a question of correctness.

  • I would argue that if you're hiding your internals then you do not want to throw the originating exception, but a different exception entirely.

  • Joker: So 1979 must have been the year that they finally decided 9/9/99 wasn't going to be feasible very much longer for encoding "unknown date" and that it would be necessary to implement an out-of-band way to store the fact the data was unknown.

    While nulls may cause problems, they solve far more problems than they cause. In other words, while non-nullable references might be nice to have, nullable value types are far more valuable. In particular, without nullable objects, the compiler wouldn't even have a chance to warn you about a null dereference.

  • Hi there,

    Is there a way I can get a hold of you to use some of the contents on your blog. I work for an IT recruitment company and we have a niche newsletter aimed at C# developers and find that the contents on your blog might appeal to them.

    My email address is mmotsepe@insource.co.za or you can visit our website which is www.insource.co.za

  • @@RaceProUK: Two parts to my reply.

    Part 1: Counter argument - InvalidOperationException, among others. You may want to catch and log/report this, then rethrow it without exposing your internals.

    Part 2: What's your real fake name? :)
