'switch' in C#

Published 15 November 04 06:02 AM

On Eric's blog, a discussion about 'switch' statements in C# & why they require 'break' inspired this post.

One of my favorite principles in the design of C# is that it forces you to be explicit when that removes confusion.  The best example is the way that the language doesn't let you accidentally override a base method.  If you don't explicitly say what you want, the compiler tells you about it.
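Here's a minimal sketch of what that looks like. The class names (Animal, Dog, Statue) are made up for illustration; the only point is that a derived class has to say either 'override' or 'new'.

```csharp
class Animal
{
    // 'virtual' opts this method in to being overridden.
    public virtual string Speak() { return "..."; }
}

class Dog : Animal
{
    // 'override' says: replace the base behavior.
    public override string Speak() { return "Woof"; }
}

class Statue : Animal
{
    // 'new' says: this is a separate method that hides the base one.
    // With neither keyword here, the compiler warns about the hidden member.
    public new string Speak() { return "(silence)"; }
}
```

Called through an Animal reference, a Dog answers "Woof" while a Statue still answers "...", because hiding is not overriding.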

In this case, the language is forcing you to be explicit again.  Either you're done ("break") or you want to fall through ("goto").  You gotta say which one.
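A small sketch of the two choices, assuming a made-up Shipping.Cost method (the names and numbers are only for illustration). Leaving out both 'break' and 'goto' at the end of a non-empty case is a compile error.

```csharp
static class Shipping
{
    public static int Cost(int tier)
    {
        int cost = 0;
        switch (tier)
        {
            case 1:
                cost += 5;
                break;          // "I'm done."
            case 2:
                cost += 10;
                goto case 1;    // "I want to continue into case 1."
            default:
                cost = -1;
                break;
        }
        return cost;
    }
}
```

With this sketch, tier 1 costs 5, tier 2 costs 15 (its own 10 plus tier 1's 5), and anything else yields -1.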


I don't worry too much about the syntax of 'switch' when I code. Instead, I try to have as few 'switch' statements as possible.  

The OO way to do things is to use polymorphism to manage differences between cases, not a big decision statement.  As soon as I find that I'm switching on the same thing twice, it's time to consider polymorphism.  See ReplaceTypeCode*.
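As a rough sketch of the idea (Shape, Square and Circle are hypothetical names, not from any real codebase): instead of switching on a shape code once to get a name and again to get an area, each subclass carries its own answers.

```csharp
using System;

// Each subclass answers for itself, so callers never switch on a type code.
abstract class Shape
{
    public abstract string Name { get; }
    public abstract double Area();
}

class Square : Shape
{
    private readonly double side;
    public Square(double side) { this.side = side; }
    public override string Name { get { return "square"; } }
    public override double Area() { return side * side; }
}

class Circle : Shape
{
    private readonly double radius;
    public Circle(double radius) { this.radius = radius; }
    public override string Name { get { return "circle"; } }
    public override double Area() { return Math.PI * radius * radius; }
}
```

Adding a new kind of shape then means adding one class, not hunting down every switch that mentions the type code.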

And the procedural way is to give each option a name (make them new methods).  If I have a switch where the bodies of the cases get even slightly interesting, the whole switch gets out of control.  So, at the very least, I want to Extract Method on each body.
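Something like this, as a sketch (Machine, Command and the method names are invented for the example): the switch only dispatches, and each case body is a single call to a named method.

```csharp
enum Command { Start, Stop }

class Machine
{
    public bool Running;
    public int StopCount;

    public void Handle(Command command)
    {
        // The switch only picks a branch; the work lives in named methods.
        switch (command)
        {
            case Command.Start:
                Start();
                break;
            case Command.Stop:
                Stop();
                break;
        }
    }

    // Extracted from the 'Start' case body.
    private void Start() { Running = true; }

    // Extracted from the 'Stop' case body.
    private void Stop() { Running = false; StopCount++; }
}
```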

In the ideal, I have relatively few switches with only trivial content.

 

We did something in Whidbey to help switch lovers.  We’ll generate cases from an enum.

1.    Type ‘switch’.

2.    Hit TAB. 
This will expand a code snippet with the skeleton of a switch statement.  The expression to switch on will be highlighted as an entry field.

3.    Supply your favorite enum-typed variable.

4.    ENTER (to commit the expansion)

We spit out a ‘case’ block for each value in the enum.
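For a hypothetical enum Color, the result looks roughly like the switch below. This is a sketch, not a transcript: I'm assuming the expansion fills in an empty 'break;' per case plus a 'default', and the exact output may differ between builds.

```csharp
enum Color { Red, Green, Blue }

static class Painter
{
    public static void Paint(Color color)
    {
        // Roughly what the expansion leaves behind: one empty case per enum value.
        switch (color)
        {
            case Color.Red:
                break;
            case Color.Green:
                break;
            case Color.Blue:
                break;
            default:
                break;
        }
    }
}
```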

 

We could have done more.  We could have automatically spit out ‘break;’ as soon as you type ‘case’.  I think that would have made people’s concerns about the superfluous ‘break’ in C# go away.  When the editor takes care of it for you, it’s not a big deal.  Maybe next time.

In general, I think it’s important that language designers think about the editing context.  Depending on whether you intend your language to be edited in a “dumb” editor (like Notepad), a semi-intelligent one (Emacs, SlickEdit), or a very intelligent one (VS, Eclipse), you would design differently.  If you assume certain facilities in the editor, you can take advantage of them to make the language easier to read & easier to write.

Comments

# Radu Grigore said on November 15, 2004 6:54 AM:
The "be explicit" principle would have led to:

void f()
{
    DoSomething();
    return; // required
}

... which is not bad, but it is inconsistent with "break" being required in switches. This is the first problem.

The second problem is that breaking from a switch is different from breaking from a loop and the usage of the same keyword can be annoying.

In short, it seems that compatibility with existing languages was more important in the design of this feature than the "be explicit" principle or any other logical guideline.
# Uwe Keim said on November 15, 2004 7:10 AM:
At least the designer of C# didn't force us to explicitly declare local variables as in Pascal... ;-)
# Thomas Eyde said on November 15, 2004 7:27 AM:
Radu has a good argument. Playing further on this, we should have:

foreach(string s in strings)
{
    //do something...
    continue; //no, I don't want to break.
}

Why a missing break in a switch should confuse anyone, confuses me.
# Sam said on November 15, 2004 7:36 AM:
A warning about a case-fallthrough would have been enough, yes - but demanding a break is ok for me, too.
# jaybaz [MS] said on November 15, 2004 8:18 AM:
Radu: I disagree. The end of the function *always* does a return. The end of a case-block either does a break or a goto, there's a choice to declare.

Uwe: C# doesn't require you to declare your locals?
# RebelGeekz said on November 15, 2004 8:44 AM:
One simple explanation for a bad design:
99.9999% of the cases won't be fall-through.
Poll the world.
# Jeff Parker said on November 15, 2004 8:55 AM:
I have to agree with you jaybaz, the reason I love the break in the switch statement is it explicitly tells you where you want it to go. It increases readability.

One thing I do not like about the switch statement and break is the compiler warning you get if you issue a return in the case. The break gets flagged as unreachable code. Which, yeah, it is. But still, to be consistent it should be there.

For Example

switch (x)
{
    case 1:
        return something;
        break; // flagged by the compiler with an "unreachable code" warning
    case 2:
        somethingElse = something;
        break; // not flagged; this one is valid
}
return somethingElse;
# Radu Grigore said on November 15, 2004 9:11 AM:
Jaybaz: As I see it, it is entirely analogous:

void f()
{
    DoF();
    return; // break;
}

void g()
{
    DoG();
    return f(); // goto f
}

But for void functions there is a default behaviour: say nothing and reach end of block means it's done processing. For switch there is no default.

NOTE: the "return f()" line can appear in the middle of some processing in g(): it will still be analogous to "goto".
# Radu Grigore said on November 15, 2004 9:12 AM:
Ah.. I forgot to mention: if fallthrough had been an option for switch, then my analogy would break.
# Thomas Eyde said on November 15, 2004 11:47 AM:
The C# team had the freedom to define whatever a switch should mean. So stating there is no default in a switch is somewhat wrong. They could define a switch to mean: select a block, run it and that's it. The case statement could be extended to accept multiple parameters, even ranges. The goto shouldn't be there; extract the common code to a method and call it.

The switch would be more C#-ish this way:

switch(n)
{
    case(1)
    {
        //do case 1
    }
    case(2,3,4)
    {
        //you get the idea
    }
    default()
    {
    }
} // end switch

I still think that requiring a nop keyword is just stupid.
# AT said on November 15, 2004 12:43 PM:
Radu, you were able to express your opinion once and now started to repeat.

Actually, there are no absolutely correct answers in our world. Very often you need to make some tradeoffs.
"useless statement" vs. compatibility with other languages and readability.

For example you have provided an example about void g(){} void f(){ return g(); } . Why this does not compile?
Why do you need to add a special case in the language for a "void" return value?
Not being able to return void can lead to problems if you refactor and change the return type of g(). This kind of refactoring error will go undetected. Not good.

Overall - my opinion that we must specify as much as possible inside language and make it easy to read for everybody - not only for computer geeks. Nobody should be forced to remember exceptions to the established rules for special and boundary cases.

Your proposal to allow a "missing" break to be a warning, not a compile error, will lead to exactly this kind of exceptional situation that everybody must remember when reading somebody else's code.

For example, my brother: he does not do any computing, but he works in machinery engineering. When I asked him about this problem (possibly my question was a little bit biased), he told me that he prefers that the break exists. He compared switch with BASIC goto, and his first question was “Why it not allowed to continue execution of next statement like in BASIC?”.

So – in summary – some kind of mysterious code savings or readability. What do you select?
# AT said on November 15, 2004 12:59 PM:
P.S. As for the future: if for some mysterious reason C# decides to allow fall-through, or to disallow it without requiring a break statement, then it will be good that we require the break statement today. It will not cause any problems in the future.

Thanks
# Radu Grigore said on November 15, 2004 11:09 PM:
"Radu, you were able to express your opinion once and now started to repeat."

Because, as your last post demonstrates, my position is still not clear.

I like the way C# solved the problem, but I think it has more to do with interaction with other languages (as Eric Gunnerson said) than with some simple logical reason (as JayBaz suggested). If C# had been designed from scratch in a lab with no concern about programmers coming from various backgrounds, then I would not have liked the chosen solution.

"For example you have provided an example about void g(){} void f(){ return g(); } . Why this does not compile?"

Because I wrote it at work where I don't have a C# compiler to test. I assumed (wrongly it appears) that C# is consistent with C++ :-P (and, btw, I was afraid you'll find a challenge :) ). Again, I'm still not sure that it is clear what I meant with this analogy so I'm tempted to repeat in another form, but I'll refrain because it seems to annoy you.

"Overall - my opinion that we must specify as much as possible inside language and make it easy to read for everybody - not only for computer geeks"

This is where we bitterly disagree. Computer geeks have a natural tendency (like mathematicians) to keep things as unambiguous as possible. This is what makes this field so "sterile" for the majority, who, like it or not, "derive their intellectual pleasure partly from not exactly knowing what they are doing" (Dijkstra). I'm afraid that such an opinion: "make it easy for everybody to read" will mean in practice "let's have a portion of ambiguity to attract more people to programming". Instead, we should keep elegance (= simple and effective) as our ultimate purpose.

"So – in summary – some kind of mysterious code savings or readability. What do you select?"

Ha.. that makes me trust your brother's opinion who doesn't know how to program and to whom the problem was presented in such an impartial manner.


# AT said on November 16, 2004 6:37 AM:
Radu: What do you propose to do with current switch/break statement ?
Can you post a summary of your suggestions ?

P.S. If you read the first page of the C# standard, you will find "Source code portability is very important, as is programmer portability, especially for those programmers already familiar with C and C++."

The "useless" break statement does add some kind of portability, while a useless return; at the end of a void function does nothing. That is the difference.
# Radu Grigore said on November 16, 2004 10:29 AM:
AT: I'll try to summarize my opinion.

What I have understood from Eric's blog entry: We want to design a better language for C++ programmers. We looked at the "switch" statement. Everybody knows that fallthrough is a nightmare in C++ practice. How can we get rid of it? Well, the simplest incremental change that we can make is to force the programmer at the end of a case to decide: does he want to go to another case or is it done? IOW, we enforce explicit break at the end of the case. That's it: simple change, big impact.

I agree completely with this argument, but...

What if we eliminate C++ programmers from the picture? After all, they are more of an economic concern than a technical one. Does it make sense to ask how "switch" should be designed in such an idealized (no C++) world? I think it does because the resulting "ideal" design should be something to strive for, even if making a big step right now would be a mistake.

So, how would you design "switch" if the adaptability of C++ programmers were not a concern? First we should choose some design objectives. My foremost concern is simplicity: it should be easy to reason about the correctness of the program. This usually means reducing side-effects. Another thing I would like from the design would be to ensure that changes in other parts of the code (like adding an enum value) have as little impact as possible on the switch code. Third, I would like the feature to be efficient: it should allow the programmer to solve real problems.

Ignoring C++ programmers doesn't mean ignoring existing _knowledge_. So, I ask: is there any language construct that is easy to reason about and robust to code changes? The answer turns out to be yes. The design I will present here draws heavily on the pattern matching constructs in functional languages.

If we want to make a simple "switch" then we should simplify the flow control and limit it to the most basic case. The simplest use for a switch is "we have a variable X and if its value is x_i (i=1.. n) we want to execute code piece c_i". That is it! No fallthrough _and_ no goto. They only make the code harder to understand. If the programmer wants to do something more complicated, like reusing code c_2 from inside c_1 (i.e. a form of "goto"), he would have to be really explicit about it: extract the code in a function and call it. It even has the added benefit that all side effects are made explicit through function parameters and return value. So I am not at all against the "be explicit" principle. But you should be explicit about complicated things, not about trivial things.

This design has the advantage that the order of the cases in the program text is irrelevant. To give just a taste of what it would look like:

switch (digit)
{
    0: { text = "zero"; }
    1: { text = "one"; }
    2: { text = "two"; }
    // etc...
}

That's it. No break. No goto. No default.

There is a problem with this simplistic approach. The first requirement was to make it easy to reason about program correctness. It turns out that in practice, more often than not, you are interested in covering the whole domain of a variable. This is why the default keyword is actually useful. Notice that the order of the cases is still irrelevant with this change. The only restriction is: have a single "default".

And now we reach the third requirement: make it useful for the programmer. Is the above construct useful? I would answer "somewhat". We can do much better. So I propose two enhancements.

The first one is to support regular expressions. At least in the kind of programming I do, language built-in text manipulation facilities would be _great_. Here is just the most basic usage of such a facility.

switch (name)
{
    `(\w)+ +(\w)+`: {first_name = \1; last_name = \2; }
    `(\w)+ +.\. +(\w)+`: {first_name = \1; last_name = \2; }
    default: throw new BadNameFormat();
}

(note1: my regular expressions are jedit-style)
(note2: I have used back-quotes to indicate to the compiler that the content is a regular expression, not a simple string)

Yes, I would love to be able to do this. But, without saying a word, I have drifted from the original simple definition: a one-to-one mapping of values to code. The problem is that a regular expression specifies a family of strings, not a single string.

But why not extend "switch"? Instead of mapping single values to code we (as AT suggested) can let "switch" define a partition of the type (viewed as a set) and map each subdomain to a piece of code. Why not write:

switch (temperature)
{
    1.. 10: { result = "fine"; }
    default: { result = "not good"; }
}

or

switch (comp_result)
{
    LESS, EQUAL: { result = "fine"; }
    GREATER: { result = "not good"; }
}

While we are at it, why settle for simple enumeration to specify subdomains and not use predicates (like in mathematical "set comprehensions") to specify subdomains? (see also the "when" keyword in OCaml). Etc, etc.

But let's stop here before the design becomes too hairy and analyze what we have now. We notice that allowing domains introduces at least two problems. One of them is sensitivity to enum definition changes. But this is trivial: just don't allow enums to be treated like integers in switch statements. The other one is more serious. We have seen three methods of specifying subdomains: enumeration of values (and the ".." shorthand for integers), regular expressions and predicates. How on earth can we be sure that they define a partition or, to be more specific, that they define disjoint subdomains? It should be feasible to do at compile time for enumerations. It should be possible but hard to do for regular expressions. It would be impossible for predicates.

So what to do? One option is to use the OCaml solution: use the (text) order of the cases. I don't particularly like that cases can't be freely shuffled anymore but it seems to be a small price to pay for the flexibility it brings to the programmer. So, each domain specification will mean "the value satisfies this condition and none of the previous ones". In this scenario it makes sense to force "default" to be the last case.

So here are my suggestions:
1. Make "switch" work like this: its cases partition a type's set of values; "switch" chooses the subdomain the variable belongs to and executes the corresponding code. No fallthrough. No goto.

2. Provide powerful ways of specifying value domains, e.g. "0.. 23", regular expression, "LESS, EQUAL".

3. Remove any extra unnecessary stuff like "break" and "case".

But do the changes in small steps. Habit is a nasty progress stopper. Most C++ programmers would probably hate what I just wrote simply because it is so different from what they are used to.

I hope this clears up my position a bit. There are still some areas I have touched on in previous posts that are not covered by this "explanation".
# AT said on November 16, 2004 5:42 PM:
"What if we eliminate C++ programmers from the picture?" I've already posted a reference to the C# standard showing that C++ programmers were taken into consideration and that this was one of the first design goals.
If we forget about C++ compatibility, I would prefer that Microsoft invest in functional languages.

As for text-order selections - non-sence. This will result in switch reduced to a linear sequence of "if/else" statements - and this will render "switch" statement useless.
Changing order of statements must not result in logic change. You can possibly allow some cases to be selected randomly in this situation (just like BinarySearch does on non-sorted items).

The original idea of switch must have been performance: O(1) or O(log N) (where N is the notional number of if/else statements used).
Readability still applies, but I put it in second place.

Switch on regex is a cool idea.
Actually, it is possible to compile the regexes of all the cases into one composite regex, and this can provide performance benefits: instead of executing RegexRunner for each case, it is executed only once.

Thanks for your long reply.
Why you did not post it in your blog ?
# Radu Grigore said on November 16, 2004 10:38 PM:
"As for text-order selections - non-sence. [...] and this will render "switch" statement useless."

Not so. Pattern matching in functional languages works this way and is very much used. But I agree that it would be nice to have a better solution: "Changing order of statements must not result in logic change".

"Why you did not post it in your blog ?"

I had to choose between being suspected of polluting jaybaz's blog and trying to steer attention to my own site. I chose the first option as less intrusive :)
# jaybaz [MS] said on November 17, 2004 8:55 AM:
Pollute, pollute!

:-)
# Kannan Goundan said on December 7, 2004 2:28 PM:
“Why it not allowed to continue execution of next statement like in BASIC?”.

This doesn't seem like a serious problem. At the end of a "for" loop or function, execution does not continue to the next statement. If you had:

if (x > 1)
    print "3"
else
    print "4"

Would you wonder why, after printing "3", execution does not continue and print "4"?

Verbosity doesn't always correlate with increased readability. The next "case" entry should be enough of a visual cue to signal to the reader that execution doesn't continue normally. Currently, that visual cue is unnecessarily fat ("break" + next "case").

On the topic of "ReplaceTypeCode"...I don't think OO-style virtual method polymorphism is always the best solution. Sometimes using a "switch" is the best way to do things. The problem is that you should only do this on closed classes. C# lacks real closed class support (yes, it has enums, but they are just a degenerate case of discriminated unions).