This morning, I was discussing issues around my previous post with Mads Torgersen, the PM who owns the C# compiler, and it reminded me of the next issue I wanted to talk about.  Something I'm pushing for in the Orcas timeframe is to add not only the Func<> types, but Proc<> types as well, which specify delegates that return void.  This and related types complete the set of standard delegate types for various APIs.  However, the very need for these extra types brings me to another issue with C#, which is a holdover from being a C-like language (and so, of course, this is a problem in C++ and Java as well).  That issue is the difference between a statement and an expression.

Now, this difference is probably a lot more intuitive for most programmers than the ones I was discussing last time, because nearly everyone who's ever done any kind of programming has done at least some coding in C, C++, Java, or C#.  All of these languages have these two constructs in their grammars, and the difference is woven into the very fabric of these languages.  We are taught to think of programs conceptually as a series of statements, where each statement modifies the program's state in some way before the next statement executes (I'll limit the discussion to single-threaded applications for simplicity).  The reasoning for this is clear: this is what happens when our programs run.  The (classical) processor executes an instruction, which changes some register or writes some value to memory, then executes the next instruction.  The C language, which is essentially portable assembly, should obviously be a series of statements.

Unfortunately, even in C, we want to be able to use higher-level constructs than can be encoded in an individual machine instruction or even a small group of them.  This leads to the concept of an expression, which can be a part of a statement.  Now, subroutines can "return" a value to be used by the caller instead of simply modifying explicitly declared state.  Obviously, this is a powerful concept, but it leads to an unfortunate inconsistency in the design of the language.  For example, compare (condition ? trueCase : falseCase) with (if (condition) { trueCase; } else { falseCase; }).  What is the difference here?  The expression "condition" is evaluated in both cases, exactly the same way.  If trueCase and falseCase are expressions, then there is no difference at all!  If condition is true, then trueCase is executed, else falseCase is executed.  Why are there two language constructs to express exactly the same thing?  I ask rhetorically, of course, because if-else expects "statements" while the ternary conditional operator expects "expressions."  Honestly, though, it seems silly that ? : can't contain blocks, and that if-else can't return a value.
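
To make the comparison concrete, here is a minimal C# sketch of the same decision written both ways.  The class and method names are made up for illustration; the point is only that the if-else version has to route its result through a local variable, because the statement itself produces no value.

```csharp
using System;

static class ConditionalDemo
{
    // Expression form: the conditional operator yields a value directly.
    public static string DescribeWithOperator(int n)
    {
        return n % 2 == 0 ? "even" : "odd";
    }

    // Statement form: if-else yields nothing, so each branch has to
    // assign the result to a local variable that is returned afterwards.
    public static string DescribeWithIfElse(int n)
    {
        string result;
        if (n % 2 == 0)
        {
            result = "even";
        }
        else
        {
            result = "odd";
        }
        return result;
    }
}
```

Both methods return the same string for any input; only the shape of the code differs.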

Still, this leaves a problem.  Conceptually, a statement is simply an expression that returns no value.  Okay, but if every statement becomes an expression, how do you differentiate between a method with a return value and a method that returns void?  Even in this pretty, expression-based world, we may still want a method that simply modifies state and doesn't need to return a value, so why not make void an actual part of the type system?  A method can still return void, and you can still call it and ignore its return value.  Moreover, we wouldn't need separate Proc types; one could simply say Func<T, void> to get Proc<T> (or Action<T>, if we're looking at Whidbey APIs).  Of course, this would make some silly things possible, like declaring a variable of type void, or having a List<void>, but I can't imagine why anyone would want to do that. :)
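
Here is a rough sketch of what the missing void type costs in practice, using the Func<T, TResult> and Action<T> delegates from the System namespace.  The DelegateDemo class and its Map and Each helpers are hypothetical names invented for this example.  Because void can't be used as a type argument, an API that wants to accept both kinds of delegate ends up with two nearly identical methods.

```csharp
using System;
using System.Collections.Generic;

static class DelegateDemo
{
    // Accepts a delegate that returns a value and collects the results.
    public static List<TResult> Map<T, TResult>(IEnumerable<T> items, Func<T, TResult> f)
    {
        var results = new List<TResult>();
        foreach (T item in items)
        {
            results.Add(f(item));
        }
        return results;
    }

    // A separate method is needed for delegates that return void.
    // Writing Func<T, void> here is rejected by the compiler, because
    // void is not allowed as a generic type argument.
    public static void Each<T>(IEnumerable<T> items, Action<T> action)
    {
        foreach (T item in items)
        {
            action(item);
        }
    }
}
```

If void were an ordinary type, Each could presumably be written as Map with a Func<T, void>, and the second method (along with the whole Proc/Action family) would be unnecessary.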

The very syntax of C#, Java, and other C-like languages makes this difficult, since statements are baked into the grammar, so we're stuck with void being its own special non-type.  So if you find yourself annoyed at the inconsistency of statements and expressions, take a break and write some code in Scheme or OCaml.  The exercise in functional thinking will do you good, and imho, it'll make you write better code in C#.