Fabulous Adventures In Coding
Eric Lippert is a principal developer on the C# compiler team.
Most people will tell you that the difference between "(Alpha) bravo" and "bravo as Alpha" is that the former throws an exception if the conversion fails, whereas the latter returns null. Though this is correct, and this is the most obvious difference, it's not the only difference. There are pitfalls to watch out for here.
First off, since the result of the "as" operator can be null, the resulting type has to be one that takes a null value: either a reference type or a nullable value type. You cannot do "as int"; that doesn't make any sense. If the argument isn't an int, then what should the return value be? The type of the "as" expression is always the named type, so it needs to be a type that can take a null.
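A minimal sketch of that rule follows. The AsTargets class and its method names are made up for illustration and are not from the post; the point is only that "as" accepts a reference type or a nullable value type as its target and yields null on a mismatch.

```csharp
using System;

public static class AsTargets
{
    // Reference type target: the result is null when o is not a string.
    public static string AsString(object o)
    {
        return o as string;
    }

    // Nullable value type target: the result is null when o is not a boxed int.
    public static int? AsNullableInt(object o)
    {
        return o as int?;
    }

    // The following does not compile, because int has no null to return:
    // public static int AsInt(object o) { return o as int; }
}
```

Here AsTargets.AsNullableInt(42) produces 42, while AsTargets.AsNullableInt("hello") produces null rather than throwing.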
Second, the cast operator, as I've discussed before, is a strange beast. It means two contradictory things: "check to see if this object really is of this type, throw if it is not" and "this object is not of the given type; find me an equivalent value that belongs to the given type". The latter meaning of the cast operator is not shared by the "as" operator. If you say
short s = (short)123;
int? i = s as int?;
then you're out of luck. The "as" operator will not make the representation-changing conversions from short to nullable int like the cast operator would. Similarly, if you have class Alpha and unrelated class Bravo, with a user-defined conversion from Bravo to Alpha, then "(Alpha) bravo" will run the user-defined conversion, but "bravo as Alpha" will not. The "as" operator only considers reference, boxing and unboxing conversions.
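Here is a small sketch of both cases. Alpha and Bravo are the classes named above; the Name field, the ConversionDemo class and its method names are assumptions added for illustration. Because "bravo as Alpha" and "s as int?" are rejected by the compiler when written directly, the sketch goes through object to show what the runtime type test does.

```csharp
using System;

public class Alpha
{
    public string Name;
}

// Unrelated to Alpha except for the user-defined conversion.
public class Bravo
{
    public string Name;

    public static explicit operator Alpha(Bravo b)
    {
        return new Alpha { Name = b.Name };
    }
}

public static class ConversionDemo
{
    // The cast runs the user-defined conversion and produces a new Alpha.
    public static Alpha ViaCast(Bravo bravo)
    {
        return (Alpha)bravo;
    }

    // Through object, "as" only asks whether the object really is an Alpha.
    // It is not, so the result is null; the user-defined conversion never runs.
    public static Alpha ViaAs(Bravo bravo)
    {
        object o = bravo;
        return o as Alpha;
    }

    // The cast changes representation from short to int?.
    public static int? ShortViaCast(short s)
    {
        return (int?)s;
    }

    // A boxed short is not a boxed int, so unboxing with "as int?" yields null.
    public static int? ShortViaAs(short s)
    {
        object boxed = s;
        return boxed as int?;
    }
}
```

With these definitions, ShortViaCast(123) gives 123 and ShortViaAs(123) gives null; ViaCast returns an Alpha and ViaAs returns null.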
And finally, of course the use cases of the two operators are superficially similar, but semantically quite different. A cast communicates to the reader "I am certain that this conversion is legal and I am willing to take a runtime exception if I'm wrong". The "as" operator communicates "I don't know if this conversion is legal or not; we're going to give it a try and see how it goes".
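To make the difference in intent concrete, here is a short sketch. The IntentDemo class, its method names and the -1 sentinel are assumptions chosen for illustration only.

```csharp
using System;

public static class IntentDemo
{
    // "I am certain this is a string": a wrong guess fails right here
    // with an InvalidCastException.
    public static int LengthViaCast(object o)
    {
        string s = (string)o;
        return s.Length;
    }

    // "I don't know whether this is a string": the result is tested and
    // the failure case is handled explicitly. -1 is an arbitrary sentinel.
    public static int LengthViaAs(object o)
    {
        string s = o as string;
        return s == null ? -1 : s.Length;
    }
}
```

Using "as" without the null check would trade the InvalidCastException at the point of the mistake for a NullReferenceException somewhere later, which is usually harder to diagnose.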
Mark: Polymorphism is fine, but I didn't create the class hierarchy. Even if I subclass everything that I might need a value from, I can't get the rest of the world to use my subclasses; a DataGrid will always use a TextBox and there's no way to tell it to use my subclass; I have to interoperate with code that is already written, libraries I don't maintain. I'm sure you get the idea. Extension methods would be nice, but they only work if you know the types at compile time.
Pavel, compare these two options:
var formvalues = from e in form.Children
                 select (e is CheckBox) ? (e as CheckBox).IsChecked :
                        (e is ListBox) ? (e as ListBox).SelectedItem :
                        (e is TextBox) ? (e as TextBox).Text :
                        null;

var formvalues = from e in form.Children
                 select new Select<object>(e)
                     .Case<CheckBox>(x => x.IsChecked)
                     .Case<ListBox>(x => x.SelectedItem)
                     .Case<TextBox>(x => x.Text)
                     .Default(null);
Which one better expresses the intent of the code? Which one is easier to read? Which is easier to understand? I think the first one wins on all counts. In particular, the Select class is something I would have to write, which means it's 100+ lines of generic metaprogramming code that has to be maintained by my organization. Since the Select<>() idiom is not standard, you really have to study it to figure out what it does.
And the whole reason we don't like the first option is the slight perf hit of having the redundant "is", so it doesn't make sense to replace the extra "is" with a method call, a delegate call, and all the other logic inside the Case functions. Not to mention that the first option stops executing the logic when it finds a matching type, while the second option still has to make a function call for each Case regardless of whether it does anything, plus it has the overhead of constructing a new Select instance for every child of the form.
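For a sense of what such a helper involves, here is a pared-down sketch of what a Select class might look like. Everything in it is an assumption made up for illustration: the member names, the stand-in CheckBox and TextBox classes, and the FormValues wrapper. A version that also handled value-type cases and other edge cases would be considerably longer.

```csharp
using System;

// Stand-ins for the UI controls; the real classes are not needed for the sketch.
public class CheckBox { public bool IsChecked; }
public class TextBox { public string Text; }

// Hypothetical helper, not a framework type.
public sealed class Select<TResult>
{
    private readonly object value;
    private bool matched;
    private TResult result;

    public Select(object value)
    {
        this.value = value;
    }

    // Every Case call in the chain runs, but only the first matching one
    // invokes its selector delegate.
    public Select<TResult> Case<T>(Func<T, TResult> selector) where T : class
    {
        if (!matched)
        {
            T typed = value as T;
            if (typed != null)
            {
                result = selector(typed);
                matched = true;
            }
        }
        return this;
    }

    // Ends the chain: the matched result, or the fallback if nothing matched.
    public TResult Default(TResult fallback)
    {
        return matched ? result : fallback;
    }
}

public static class FormValues
{
    public static object Read(object control)
    {
        return new Select<object>(control)
            .Case<CheckBox>(x => x.IsChecked)
            .Case<TextBox>(x => x.Text)
            .Default(null);
    }
}
```

Even in this small form, each element costs an allocation, one method call per Case, and a delegate invocation on the match, which is the overhead described above.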
I'm probably a little late to be posting another reply here, but I might as well try to clarify something.
I've come across a number of instances where double dispatch would be inappropriate, inadvisable, or just plain impossible. For example:
- Database entities. Unless you can show me an entity framework that does a competent job of handling multi-table hierarchies and supports OO concepts like abstract base classes and double dispatch, the result is often a mess of conditionals to account for the impedance mismatch between domain class hierarchies and discrete entities. I love LINQ to SQL, but it's not up to that particular task, and neither is any other ORM that I've tried.
- UI functions. In any medium-to-large scale app you are very likely to have some kind of middle tier that contains "business logic" but is UI independent (some consumers may not even *have* a UI). Without some crazy code-injection technique like AOP or virtual extension methods, it's somewhere between impractical and impossible to make the class capable of instantiating its own editor. Let's say you're representing several types of these (related) objects in a tree or list view and the user has initiated an edit - how do you know what editor to bring up? Again, I think you're pretty much stuck with the mess of conditionals; you might have access to a discriminator and be able to use a switch statement, but that's only making the code nicer, not solving the design problem we've described.
- UI control hierarchies. It's a *very* frequent occurrence that some particular WinForms control re-implements some non-virtual property as "new", and if you fall into the trap of always accessing it through the base class then you're in big trouble. I don't know about anyone else, but in my experience, the "Text" property is notorious for behaving strangely. This may be poor design, but it's also someone else's design; there's nothing I can do about it.
- Messaging. You're communicating with some hardware device that may have sent you any of a dozen or maybe a hundred different messages, none of which have anything important in common (except for being "messages"). You have to respond to the message, but first, you obviously have to figure out what it is. Similar to above, if you have control over the message class hierarchy, you can add a discriminator so your handling code can be a nicer switch statement, but it still suffers from the same problems described in the earlier reply (the contract says that it can handle any message, even though maybe it can't, and the author may indeed have no idea at all what "HandleMessage" means for any given message).
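To make the messaging case concrete, here is a rough sketch of both approaches side by side. The message types (Ping, Status), the visitor interface and the handler names are all invented for illustration. Double dispatch is only available when you own the hierarchy and can add an Accept method to it; the is/as ladder works on types you cannot modify.

```csharp
using System;

// Hypothetical visitor interface: it must list every message type up front.
public interface IMessageVisitor<TResult>
{
    TResult Visit(Ping message);
    TResult Visit(Status message);
}

// Hypothetical hierarchy that we own, so we are free to add Accept.
public abstract class Message
{
    public abstract TResult Accept<TResult>(IMessageVisitor<TResult> visitor);
}

public sealed class Ping : Message
{
    public override TResult Accept<TResult>(IMessageVisitor<TResult> visitor)
    {
        return visitor.Visit(this);
    }
}

public sealed class Status : Message
{
    public int Code;

    public override TResult Accept<TResult>(IMessageVisitor<TResult> visitor)
    {
        return visitor.Visit(this);
    }
}

public sealed class Describer : IMessageVisitor<string>
{
    public string Visit(Ping message) { return "ping"; }
    public string Visit(Status message) { return "status " + message.Code; }
}

public static class Handlers
{
    // Double dispatch: a virtual call on the message, then an overload
    // chosen by the message's own type. No type tests anywhere.
    public static string ViaDoubleDispatch(Message m)
    {
        return m.Accept(new Describer());
    }

    // The is/as ladder: needs no cooperation from the message classes,
    // and can say "unknown" for anything it does not recognize.
    public static string ViaLadder(object m)
    {
        Status status = m as Status;
        if (status != null) return "status " + status.Code;
        if (m is Ping) return "ping";
        return "unknown";
    }
}
```

With a hundred unrelated message types the visitor interface grows a hundred overloads, and with a hierarchy you do not control it is not an option at all.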
Maybe I just haven't been out there long enough to know the "right" solutions for these, but in my limited experience, the only truly elegant solution is dynamic typing, and I assume that's why the C# team is working on it for C# 4. (And in a sense, dynamic typing doesn't really make the design problem go away; it just provides a more convenient way of expressing the workaround.)
I also like the typeswitch that Pavel mentions but, of course, that is just syntactic sugar; it's not any more efficient or safe than the is/as ladder and doesn't make the underlying design any better.
@Aaron G (and any other interested reader). If you have been following the information available for 4.0, the primary (but not sole) reason given for dynamic is for interop with other languages, and NOT for functionality directly contained in C# or other strongly typed languages.
In my professional opinion, considering AOP, code injection or IL rewriting as "some crazy technique" is at best short-sighted. Consider that something as fundamental as code contracts is implemented in exactly this fashion.
I have successfully used "synthesized double dispatch" (i.e., multiple calls occur, but NO conditional branching) in all 4 of the categories you mention. This is NOT to say that such an approach is always a good choice, but it is one that can be considered.
I can be contacted as david "dot" corbin "at" dynconcepts "dot" com by anyone interested in these approaches.
Do you see Gabe's code fragments? Do you see what kind of monster you are creating?
"var formvalues = from e in form.Children select new Select<object>(e).Case<CheckBox>(x => x.IsChecked).Case<ListBox>(x => x.SelectedItem).Case<TextBox>(x => x.Text).Default(null);"
Ouch. How will this interfere with dynamic?
I'm new to C# and have been reading your blogs. They are very nice and informative. Thanks for them.
I may be wrong but please clear my doubt.
I have a question here:
You have written in this blog:
["....The "as" operator will not make the representation-changing conversions from short to nullable int like the cast operator would....The "as" operator only considers reference, boxing and unboxing conversions."]
This means that boxing and unboxing are not representation-changing conversions and so "as" handles them.
Now, in another blog blogs.msdn.com/.../representation-and-identity.aspx you have mentioned that
[".... Boxing and unboxing conversions are all representation-changing conversions...".]
I find the two statements (in brackets above) from two different blogs contradicting each other. Kindly explain which one is correct. Please correct me if I've misunderstood them.
It's all pretty good, but you seem inclined to use "as" instead of a cast, so this isn't completely unbiased.
The better option (at least for me) would be to use a method (if you use a try-catch or try-finally block, that surely won't fail).