Chained user-defined explicit conversions in C#

Reader Niall asked me why the following code compiles but produces an exception at runtime:

class Base {}
class Derived : Base {}
class Castable {
  public static explicit operator Base(Castable c) {
    return new Base();
  }
}
// ...
Derived d = (Derived)(new Castable());

It should be clear why this produces an exception at runtime; the user-defined operator returns a Base and that is not assignable to a variable of type Derived. But why does the compiler allow it in the first place?

First off, let’s define the difference between an implicit and an explicit conversion. An implicit conversion is one which the compiler knows can always be done without incurring the risk of a runtime exception. When you have a method int Foo(int i){...} and call it with long l = Foo(myshort);, the compiler inserts implicit conversions from int to long on the return side and from short to int on the call side. There is no int which doesn’t fit into a long and there is no short which doesn’t fit into an int, so we know that the conversions will always succeed, and we just up and do them for you.
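A minimal sketch of that call follows. The body of Foo and the class name ImplicitDemo are assumptions made for illustration; only the signature comes from the text.

```csharp
using System;

static class ImplicitDemo
{
    // Hypothetical body; the text only gives the signature.
    static int Foo(int i) { return i + 1; }

    static void Main()
    {
        short myshort = 41;

        // No casts needed: short -> int on the call side,
        // int -> long on the return side. Neither can lose information.
        long l = Foo(myshort);

        Console.WriteLine(l); // prints 42
    }
}
```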

There are also conversions which we know at compile time will never succeed. If there is no user-defined conversion from Giraffe to int, then Foo(new Giraffe()) is always going to fail at runtime, so this fails at compile time.

An explicit conversion is a conversion which might succeed sometimes but might also fail. We cannot disallow it, because it might succeed, but we can’t go silently inserting one either, since it might fail unexpectedly. We need to force the developer to acknowledge that risk explicitly. If you called ulong ul = Foo(mynullableint); then that might fail, so the compiler requires you to spell out that the conversions are explicit. The assignment could be written ulong ul = (ulong)Foo((int)mynullableint);.
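Here is a small sketch of that situation, showing both the success and the failure path. The body of Foo and the class name ExplicitDemo are illustrative assumptions.

```csharp
using System;

static class ExplicitDemo
{
    // Hypothetical body; the text only gives the signature.
    static int Foo(int i) { return i; }

    static void Main()
    {
        int? mynullableint = 5;

        // ulong ul = Foo(mynullableint);
        // The line above does not compile: int? -> int and int -> ulong
        // are both explicit conversions, so both casts must be written out.
        ulong ul = (ulong)Foo((int)mynullableint);
        Console.WriteLine(ul); // prints 5

        mynullableint = null;
        try
        {
            // Unwrapping a null int? is the "might fail" part.
            ul = (ulong)Foo((int)mynullableint);
        }
        catch (InvalidOperationException)
        {
            Console.WriteLine("null has no int value");
        }
    }
}
```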

There are two situations in which the compiler will insert an explicit conversion for you without producing a warning or error. The first is Niall’s case above. When a user-defined explicit conversion requires a built-in explicit conversion on either the call side or the return side, the compiler will insert those conversions as needed. The compiler figures that if the developer put the explicit cast in the code in the first place then the developer knew what they were doing and took the risk that any of the conversions might fail. That’s what the cast means: this conversion might fail, I will deal with it.
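Applied to Niall’s example, the single cast behaves as if the built-in downcast had been written out after the user-defined operator. The sketch below is an illustration of that reading, not compiler output; the Program class is an assumption.

```csharp
using System;

class Base {}
class Derived : Base {}
class Castable
{
    public static explicit operator Base(Castable c) { return new Base(); }
}

static class Program
{
    static void Main()
    {
        Castable c = new Castable();
        try
        {
            // One cast in the source, two conversions at runtime:
            // the user-defined Castable -> Base, then the built-in
            // Base -> Derived downcast, i.e. (Derived)(Base)c.
            Derived d = (Derived)c;
            Console.WriteLine(d);
        }
        catch (InvalidCastException)
        {
            // The operator returned a plain Base, so the downcast fails.
            Console.WriteLine("InvalidCastException");
        }
    }
}
```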

I understand that this puts a burden upon the developer to fully understand the implications of a cast, but the alternative is to make you spell it out even further, and it just gets to be too much. The logical extreme of this would be a case such as

public struct S{
    public static explicit operator decimal?(S s) {return 1.0m;}
}
//...
S? s = new S();
int i = (int) s;

Here we do first an explicit conversion from S? to S, then a user-defined explicit conversion from S to decimal?, then an explicit conversion from decimal? to decimal, and then an explicit conversion from decimal to int. That’s four explicit conversions for the price of one cast, which I think is pretty good value for your money.
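To make the four steps visible, here is a sketch that performs the single cast and then the same chain one conversion at a time. The local variable names and the Program class are illustrative assumptions; the struct is the one from the example.

```csharp
using System;

public struct S
{
    public static explicit operator decimal?(S s) { return 1.0m; }
}

static class Program
{
    static void Main()
    {
        S? s = new S();

        // The single cast from the example.
        int i = (int)s;

        // The same chain, one explicit conversion per line.
        S unwrapped = (S)s;                         // S? -> S
        decimal? viaOperator = (decimal?)unwrapped; // S -> decimal? (user-defined)
        decimal value = (decimal)viaOperator;       // decimal? -> decimal
        int spelledOut = (int)value;                // decimal -> int

        Console.WriteLine(i == spelledOut);
    }
}
```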

I want to note at this point that this is as long as the chain gets. A user-defined conversion can have built-in conversions inserted automatically on the call and return sides, but we never automatically insert other user-defined conversions. We never say that there’s a user-defined conversion from Alpha to Bravo, and a user-defined conversion from Bravo to Charlie, and therefore casting an Alpha to a Charlie is legal. That doesn’t fly.

And a built-in conversion can be lifted to nullable, which may introduce additional conversions as in the case above. But again, these are never user-defined.
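A short sketch of the Alpha, Bravo, Charlie case: the type names come from the text, while where each operator is declared and the Program class are assumptions.

```csharp
using System;

class Alpha {}
class Bravo
{
    public static explicit operator Bravo(Alpha a) { return new Bravo(); }
}
class Charlie
{
    public static explicit operator Charlie(Bravo b) { return new Charlie(); }
}

static class Program
{
    static void Main()
    {
        Alpha a = new Alpha();

        // Charlie c = (Charlie)a;
        // The line above does not compile: two user-defined conversions
        // are never chained behind a single cast.

        // Each user-defined conversion needs its own cast.
        Charlie c = (Charlie)(Bravo)a;
        Console.WriteLine(c.GetType().Name); // prints Charlie
    }
}
```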

The second is in the foreach loop. If you have foreach(Giraffe g in myAnimals) then we generate code which fetches each member of the collection and does an “explicit” conversion to Giraffe. If there happens to be a Snail or a WaterBuffalo in myAnimals, that’s a runtime exception. I considered adding a warning to the compiler for this case to say hey, be aware that your collection is of a more general type, this could fail at runtime. It turns out that there are so many programs which use this programming style, and so many of them have “compile with warnings as errors” turned on in their builds, that this would be a huge breaking change. So we opted to not do it.
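The foreach case can be sketched like this. The Animal, Giraffe and Snail class definitions and the contents of myAnimals are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;

class Animal {}
class Giraffe : Animal {}
class Snail : Animal {}

static class Program
{
    static void Main()
    {
        List<Animal> myAnimals = new List<Animal> { new Giraffe(), new Snail() };

        try
        {
            // Compiles without a warning: each element is converted
            // to Giraffe as if by a hidden (Giraffe) cast.
            foreach (Giraffe g in myAnimals)
            {
                Console.WriteLine("Got a " + g.GetType().Name);
            }
        }
        catch (InvalidCastException)
        {
            // Thrown when the loop reaches the Snail.
            Console.WriteLine("Not a giraffe");
        }
    }
}
```

If the failure is not wanted, one option is to iterate as Animal and test each element with the is or as operator before using it as a Giraffe.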

  • With samples as above, why does the compiler know that

    int j = Foo((int)(new Base()));

    can never work, but doesn't know that

    Derived d = (Derived)(new Base());

    also can never work? (the former won't compile; the latter will, and throw)

  • Good question.  I'll post an answer later this week!

  • It seems that automatic conversions to nullable are surprising a lot of people:

    * http://tirania.org/blog/archive/2007/Mar-13-1.html

    * http://blogs.msdn.com/abhinaba/archive/2005/12/11/501544.aspx

    Anyway, always a nice blog.

  • Thanks for the tour of casting's curious corners.

    On the subject of implicit casts, is it possible to implicitly cast an enum value to its underlying type? I read through a bunch of the .NET docs, and it doesn't seem like it.

    I'm using a bunch of integer command codes, and I wanted to use an enum to define the common code values. Unfortunately, that means I have to explicitly cast them to int whenever I use them. What I really want is an enum that behaves like a bunch of const declarations with automatic numbering.

    Perhaps my real question is, "What's the difference between an enum and a class with a bunch of const declarations?"

  • An enum may be explicitly cast to its underlying integral type, but to my knowledge there is no way to do so implicitly.

    As far as the underlying implementation goes, an enumerated type is basically just a shorthand for a sealed class which extends System.Enum and has public static constant fields.  The additional restrictions placed on enumerated types in the C# type system are by-design artefacts of the C# language, not anything fundamental in the runtime.   C# enums just give you a way to keep integers which are _logically_ different types from being mixed accidentally.

  • I don't like those explicit casting rules, FWIW. Not that they're going to change because of me anyway ;)

    To me it's entirely confusing why a type that's explicitly convertible to decimal? should automatically be explicitly convertible to int in one step. If it were *implicitly* convertible to decimal? I'd be a little more convinced, but I still think more casts are needed there. In your example I think you should need at least (int) (decimal) s, and that's even IF the conversion to decimal? were implicit.

    Not only are the rules confusing when they *do* apply, they don't apply in cases where something like them might truly be useful. For example, I had my own Nint class before Nullable<> came along and had the implicit/explicit conversions from and to int respectively. But any time I put one of my Nints into an object variable (or, say, ViewState) and wanted to get it back as an int, I had to type (int) (Nint) o. I realize that the compiler had no particular way to figure out that I wanted my class to get involved in what looks like a straight cast from a base type (object) to derived (int), but if you're going to do magic 4-step chained explicit conversions for me when I don't want them, then I feel entitled to demand that you should do some magic (e.g. based on the *runtime* type) when I *do* want it, too ;)

    Of course, that's just my opinion, I could be wrong.
