Null Is Not Empty

Back when I started this blog in 2003, one of the first topics I posted on was the difference between Null, Empty and Nothing in VBScript. An excerpt:

Suppose you have a database of sales reports, and you ask the database "what was the total of all sales in August?" but one of the sales staff has not reported their sales for August yet. What's the correct answer? You could design the database to ignore the fact that data is missing and give the sum of the known sales, but that would be answering a different question. The question was not "what was the total of all known sales in August, excluding any missing data?" The question was "what was the total of all sales in August?" The answer to that question is "I don't know -- there is data missing", so the database returns Null.

This principle underlies the design of nullable value types in C#. The reason that we have nullable value types at all is that there is a semantic difference between the null integer/decimal/double/whatever and the zeroes of those types. A zero means “I know that the quantity is zero”; a null means “I don’t know what the quantity is”.

This also explains why nulls propagate; if you add two nullable ints and one of them is null then the answer is null. Clearly ten plus “I don’t know” equals “I don’t know”, not ten.
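As a minimal sketch of that propagation (the class and method names here are hypothetical, chosen only for illustration), adding two nullable ints yields null as soon as either operand is null:

```csharp
using System;

public static class NullableSum
{
    // Hypothetical helper: the + operator on int? is "lifted",
    // so the result is null if either operand is null.
    public static int? Add(int? x, int? y) => x + y;

    public static void Main()
    {
        int? known = 10;
        int? unknown = null;

        Console.WriteLine(Add(known, 5).HasValue);       // True
        Console.WriteLine(Add(known, unknown).HasValue); // False
    }
}
```

The second call has no value at all, rather than the value ten.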

The concept of “null as missing information” also applies to reference types, which are of course always nullable. I am occasionally asked why C# does not simply treat null references passed to “foreach” as empty collections, or treat null strings as empty strings (*). It’s for the same reason that we don’t treat null integers as zeroes. There is a semantic difference between “the collection of results is known to be empty” and “the collection of results could not even be determined in the first place”, and we want to allow you to preserve that distinction, not blur the line between them. By treating null as empty, we would diminish the value of being able to strongly distinguish between a missing or invalid collection and a present, valid, empty collection.

Now, if for some odd reason you do wish to treat null collections the same as empty collections, that’s easy enough to do. You can simply use the null coalescing operator; that’s what it’s for:

foreach(Customer customer in customers ?? Enumerable.Empty<Customer>())

The ?? operator means “use the left-hand side, unless the left-hand side is null, in which case use the right-hand side.” Handy, that.
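Here is a small self-contained sketch of that idiom. The helper name CountOrZero and the sample data are assumptions made up for this example; only the ?? operator and Enumerable.Empty<T>() come from the text above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NullAsEmpty
{
    // Hypothetical helper: counts the items in a sequence,
    // deliberately treating a null sequence as an empty one.
    public static int CountOrZero(IEnumerable<string> names)
    {
        int count = 0;
        foreach (string name in names ?? Enumerable.Empty<string>())
        {
            count++;
        }
        return count;
    }

    public static void Main()
    {
        Console.WriteLine(CountOrZero(new[] { "Alice", "Bob" })); // 2
        Console.WriteLine(CountOrZero(new string[0]));            // 0
        Console.WriteLine(CountOrZero(null));                     // 0
    }
}
```

Note that the last two calls become indistinguishable, which is exactly the loss of information described earlier; the point is that you opt into it explicitly.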

**************

(*) C# does treat null strings as empty strings when concatenating them. See the comments for a discussion of this fact.
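A minimal sketch of that concatenation behavior follows. The class and the Combine helper are hypothetical names introduced for this example; the behavior shown is that of the + operator on strings, which compiles to a call to String.Concat.

```csharp
using System;

public static class NullConcat
{
    // Hypothetical helper: the + operator on strings becomes a
    // String.Concat call, which treats a null argument as empty.
    public static string Combine(string left, string middle, string right)
        => left + middle + right;

    public static void Main()
    {
        Console.WriteLine(Combine("foo", null, "bar"));     // foobar
        Console.WriteLine(Combine(null, null, null) == ""); // True
    }
}
```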

  • "I am occasionally asked why C# does not simply treat null references passed to “foreach” as empty collections, or treat null strings as empty strings."

    Except that C# does treat nulls as empty strings sometimes:

    Console.WriteLine("foo" + null + "bar");

    This prints foobar; it neither prints an empty string nor throws an exception.

    Excellent point, I had forgotten that one. Which is odd, since rewriting the code generator that implements those semantics was my first task when I joined this team.

    Those (in my opinion unfortunate) semantics are imposed upon us by String.Concat; the addition operator is just syntactic sugar for a call to String.Concat. The designers of String.Concat chose to treat null concatenation as empty string concatenation. Which means that (string)null + (string)null gives you an empty string in C#, bizarrely enough. -- Eric

  • This is interesting. Years ago, a product called eMbedded Visual Basic, which used VBScript as its engine, had a constant of vbNullPtr that was used for API calls. Any idea what this value would pass?

  • What's the advantage of  your special empty sequence versus Linq's Enumerable.Empty<T>()?

    Good point. There is no advantage. It makes more sense to just use the standard one. I've updated the text. -- Eric

     

  • @ghenne : it would probably be equivalent to IntPtr.Zero...

  • Aww, a nostalgic post for me - I'm the same Blake from the comment thread on the original VBScript post.  

    Over five years later, and this has consistently been one of my favorite Microsoft blogs.   Thanks for all the great articles, Eric.

    You're welcome, thanks for reading! -- Eric

  • Great stuff Eric, I don't think people treat nulls as valuable information often enough. A few more thoughts: http://clipperhouse.com/blog/post/Nulls-and-knowledge.aspx

  • Thanks for another interesting post - I heartily agree that these are important distinctions for programmers to make.

    I also noticed Blake's comment, and went back through some of the discussion between you two from the original post. Time permitting, I would love to see a post or two about your ideas on interviewing, since I'm currently learning how to give interviews myself. How do you attempt to test problem solving ability? As I'm sure you've found, it seems a lot harder than just testing knowledge.

     

    I've written two articles about interviewing. See the "interviewing" archive button on the sidebar. -- Eric

  • I often see this manifested (or not manifested correctly) in data capture scenarios. It's all well and good to have strict validation, but sometimes your hapless user just doesn't know what the chassis serial number is, etc.

    Overzealous developers who shun nulls in the database end up at some point creating sentinel values which, for obvious reasons, don't make anything easier in the long run. Not only do you have a non-standard syntax, but you had better be sure your sentinel is really never going to happen and doesn't ruin any computations in the process.

    Null is that special value that is outside the set of all permissible values, and I think sometimes people just think it's only the runtime scolding you for using an uninitialized reference.

    FYI to all the people who hate checking for null: you can always use the Null Object Pattern if it makes writing your domain code easier.

  • I'm pretty new to C#, but this has been nagging me for a bit:

    Why isn't int nullable in C#?

     

    Nullable ints are nullable in C#. Non-nullable ints are not nullable. This seems like a sensible approach, no? -- Eric

    Why does it default to 0?

    Well, what value would you prefer a non-nullable int to default to? -- Eric

    int thisIsAnInt = null;

    throws an error on build

    The syntax for nullable value types in C# is to put a question mark after the type. Try "int? x = null;" -- Eric

  • (replying to myself) Actually it looks like you _have_ already posted some other stuff specifically about interviewing, which was very interesting. Thanks again for the great blog.

  • Thomas, you can do:

    int? thisIsAnInt = null; which is equivalent to

    Nullable<int> thisIsAnInt = null;

  • @Thomas

    int is a ValueType and null does not apply to ValueTypes. Eric recently wrote an article on ValueTypes and reference types. That article might be worth a little of your time. (Not that it actually answers your question, but deducing from your question, I believe it holds valuable information for you.)

  • I love the null coalescing operator. It's great for lazy initialisation:

    private List<Order> orders;

    public List<Order> Orders
    {
        get { return orders ?? (orders = LoadOrders()); }
    }

    I've got a post on my blog about making that thread-safe:

    http://blog.markrendle.net/post/Lazy-initialization-thread-safety-and-my-favourite-operator.aspx

  • Why does the coalescing operator not support a shortened form, '??='?

    (And why are there no &&= and ||= as well?)

    The answer to every "why is feature X not implemented?" question is the same. No one designed, implemented, tested, documented or shipped that feature. You're the first person to ever ask me about this particular one in the last five years, so apparently no mob of angry programmers is banging down the door to building 41 demanding that we implement them. :-) Design, implementation, testing and documentation are expensive; we try to only implement features people actually want. -- Eric

     

  • I do not agree that a string is a reference type. Strings are values just as integers are. It just happens that the .NET architecture implements them as referenced objects. That's why you had to add the "IsNullOrEmpty" kludge. A proper string can never be null, in the same way as an integer can never be null. Unless I explicitly want it, in which case I would declare it as "string? s", in the same way as I declare a nullable integer.

    Ordinary collections are not values (as they are mutable objects) but immutable collections definitely are values and should be non-nullable by default.

    I agree that immutable types are logically values, and it would have been nice to represent that in the type system. I also agree that it would have been nice to build in nullability/non-nullability from day one, instead of starting with non-nullable value types and nullable reference types, then adding nullable value types, and then never adding the fourth. The next time you design a brand-new type system, keep that in mind.

    But as a practical matter, I'm afraid strings are reference types, and that there are good reasons for that. The pleasant fact that value types are of known size and need not be garbage collected makes it difficult to make strings value types. Also, the fact that strings can be cheaply copied by reference instead of copying all their bits, as we do with value types, is a big perf win. Would you rather abandon these benefits in exchange for making strings value types? What's the compelling benefit of making strings into value types that pays for the massive loss of performance that would entail? -- Eric

     
