Representation and Identity

(Note: not to be confused with Inheritance and Representation.)

I get a fair number of questions about the C# cast operator. The most frequent question I get is:

short sss = 123;
object ooo = sss;            // Box the short.
int iii = (int) sss;         // Perfectly legal.
int jjj = (int) (short) ooo; // Perfectly legal
int kkk = (int) ooo;         // Invalid cast exception?! Why?

Why? Because a boxed T can only be unboxed to T. (*) Once it is unboxed, it’s just a value that can be cast as usual, so the double cast works just fine.
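The rule can be checked directly with a small program. This is only a sketch: the class name `UnboxingDemo` and the helper `DirectUnboxToIntFails` are made up for illustration and are not part of the original post.

```csharp
using System;

public static class UnboxingDemo
{
    // Hypothetical helper: returns true if unboxing directly to int throws.
    public static bool DirectUnboxToIntFails(object boxed)
    {
        try
        {
            int value = (int)boxed;   // unbox: legal only if the box holds an int
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    public static void Main()
    {
        short sss = 123;
        object ooo = sss;                                  // box the short

        Console.WriteLine(DirectUnboxToIntFails(ooo));     // True
        int jjj = (int)(short)ooo;                         // unbox to short, then convert to int
        Console.WriteLine(jjj);                            // 123
        short? nnn = (short?)ooo;                          // unboxing to Nullable<short> is also allowed
        Console.WriteLine(nnn.HasValue);                   // True
    }
}
```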

Many people find this restriction grating; they expect to be able to cast a boxed thing to anything that the unboxed thing could have been cast to. There are ways to do that, as we’ll see, but there are good reasons why the cast operator does what it does.

To understand why the cast operator is designed this way, it’s necessary to first wrap your head around the contradiction that is the cast operator. There are two (¤) basic usages of the cast operator in C#:

  • My code has an expression of type B, but I happen to have more information than the compiler does. I claim to know for certain that at runtime, this object of type B will actually always be of derived type D. I will inform the compiler of this claim by inserting a cast to D on the expression. Since the compiler probably cannot verify my claim, the compiler might ensure its veracity by inserting a run-time check at the point where I make the claim. If my claim turns out to be inaccurate, the CLR will throw an exception.
  • I have an expression of some type T which I know for certain is not of type U. However, I have a well-known way of associating some or all values of T with an “equivalent” value of U. I will instruct the compiler to generate code that implements this operation by inserting a cast to U. (And if at runtime there turns out to be no equivalent value of U for the particular T I’ve got, again we throw an exception.)
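The two usages above might look like this in code. The names `TwoKindsOfCast`, `AsString` and `ToInt` are invented for the sketch, and the `checked` context is my assumption: it is what makes the second cast throw when no equivalent value exists, since in an unchecked context an out-of-range integer conversion silently truncates instead.

```csharp
using System;

public static class TwoKindsOfCast
{
    // Usage 1: "I claim this object is really a string." Verified at run time.
    public static string AsString(object o)
    {
        return (string)o;             // InvalidCastException if the claim is wrong
    }

    // Usage 2: "Give me the int that is equivalent to this long." A new value is computed.
    public static int ToInt(long value)
    {
        return checked((int)value);   // OverflowException if no equivalent int exists
    }

    public static void Main()
    {
        Console.WriteLine(AsString("hello"));   // hello
        Console.WriteLine(ToInt(123L));         // 123

        try { AsString(42); }
        catch (InvalidCastException) { Console.WriteLine("claim was wrong"); }

        try { ToInt(long.MaxValue); }
        catch (OverflowException) { Console.WriteLine("no equivalent int"); }
    }
}
```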

The attentive reader will have noticed that these are opposites. A neat trick, to have an operator which means two contradictory things, don’t you think?

This dichotomy motivates yet another classification scheme for conversions (†).  We can divide conversions into representation-preserving conversions (B to D) and representation-changing conversions (T to U). (‡) We can think of representation-preserving conversions on reference types as those conversions which preserve the identity of the object. When you cast a B to a D, you’re not doing anything to the existing object; you’re merely verifying that it is actually the type you say it is, and moving on. The identity of the object and the bits which represent the reference stay the same. But when you cast an int to a double, the resulting bits are very different.
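A rough way to see the difference is to compare references after a downcast and to compare bit patterns after an int-to-double conversion. The class and method names below are hypothetical; `object.ReferenceEquals` and `BitConverter.DoubleToInt64Bits` are standard framework methods.

```csharp
using System;

public static class RepresentationDemo
{
    // A downcast leaves the reference untouched: same object, same identity.
    public static bool DowncastPreservesIdentity()
    {
        object b = "hello";        // static type object, run-time type string
        string d = (string)b;      // representation-preserving conversion
        return object.ReferenceEquals(b, d);
    }

    // Bits of the double produced by converting an int, as 16 hex digits.
    public static string BitsAfterConversionToDouble(int i)
    {
        double converted = (double)i;   // representation-changing conversion
        return BitConverter.DoubleToInt64Bits(converted).ToString("X16");
    }

    public static void Main()
    {
        Console.WriteLine(DowncastPreservesIdentity());        // True
        Console.WriteLine(1.ToString("X8"));                   // 00000001
        Console.WriteLine(BitsAfterConversionToDouble(1));     // 3FF0000000000000
    }
}
```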

All the built-in reference conversions are identity-preserving (£). Obviously trivial “conversions” such as converting from int to int are also representation-preserving conversions. All user-defined conversions (§) and non-trivial value type conversions (such as converting from int to double) are representation-changing conversions. Boxing and unboxing conversions are all representation-changing conversions.

The representation-preserving conversions that are known to never fail often result in no codegen at all (₪). If a representation-preserving conversion could fail then a castclass instruction is emitted, which does a runtime check and throws if the check fails.

But each representation-changing conversion is handled in its own special way. User-defined conversions are resolved using a special version of the overload resolution algorithm, and generated as a call to the appropriate static method. Boxing and unboxing conversions are generated as box and unbox instructions. All the other built-in conversions (int to double, and so on) are generated as custom sequences of instructions that do the right conversion.
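As an illustration of the user-defined case, the sketch below declares a made-up `Celsius` struct with an explicit conversion to int and then uses reflection to show that the operator exists as an ordinary static method (the compiler names it `op_Explicit`). The type and its rounding behavior are assumptions for the example only.

```csharp
using System;
using System.Reflection;

// Hypothetical type, used only to illustrate a user-defined conversion.
public struct Celsius
{
    public readonly double Degrees;
    public Celsius(double degrees) { Degrees = degrees; }

    // The compiler emits this operator as a static method named op_Explicit.
    public static explicit operator int(Celsius c)
    {
        return (int)Math.Round(c.Degrees);
    }
}

public static class UserDefinedConversionDemo
{
    public static bool HasOpExplicitMethod()
    {
        MethodInfo m = typeof(Celsius).GetMethod(
            "op_Explicit", BindingFlags.Public | BindingFlags.Static);
        return m != null && m.ReturnType == typeof(int);
    }

    public static void Main()
    {
        int rounded = (int)new Celsius(21.6);     // compiled as a call to Celsius.op_Explicit
        Console.WriteLine(rounded);               // 22
        Console.WriteLine(HasOpExplicitMethod()); // True
    }
}
```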

So now that you know that, consider what the compiler would have to do to make this work the way some people expect:

int kkk = (int) ooo;

All that the compiler knows is that ooo is of type object. It could be anything. Suppose it is a boxed int – then the compiler should generate an unboxing instruction. Suppose it is a boxed short. Then the compiler should unbox the short and then generate the custom sequence of instructions that convert a short to an int. Suppose it is a boxed double – same thing, but different instructions. And so on, for all the built-in conversions that go to integer.

This would be a huge amount of code to generate, and it would be very slow. The code is of course so large that you would want to put it in its own method and just generate a call to it. Rather than do that by default, and always generate code that is slow, large and fragile, instead we’ve decided that unboxing can only unbox to the exact type. If you want to call the slow method that does all that goo, it’s available – you can always call Convert.ToInt32, which does all that analysis at runtime for you. We give you the choice between “fast and precise” or “slow and lax”, and the sensible default is the former. If you want the latter then call the method.
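Here is a small sketch of the "slow and lax" route. `Convert.ToInt32(object)` is the real framework method; the wrapper name `Lax` is invented. One detail worth knowing is that `Convert.ToInt32` rounds a double to the nearest integer (ties go to the even number), whereas the cast operator truncates toward zero.

```csharp
using System;

public static class ConvertDemo
{
    // Hypothetical wrapper: lets the framework inspect the run-time type.
    public static int Lax(object boxed)
    {
        return Convert.ToInt32(boxed);
    }

    public static void Main()
    {
        object boxedShort = (short)123;
        object boxedDouble = 3.5;

        Console.WriteLine(Lax(boxedShort));            // 123
        Console.WriteLine(Lax(boxedDouble));           // 4 (rounded)
        Console.WriteLine((int)(double)boxedDouble);   // 3 (unbox exactly, then truncate)
    }
}
```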

That’s just the built-in conversions. Let’s continue imagining what would have to happen if we wanted all possible conversions to int to just work out correctly at runtime, instead of just bailing out early if the boxed thing is not an int.

Suppose the object is a Foo where there is a user-defined conversion from Foo (or one of its base classes) to int (or a type that int is explicitly convertible from, like, say, Nullable<int>). Then the compiler would need to generate a call to that conversion method, just as it would if the type had been known at compile time, and then possibly also generate the conversion from the return type of the method to int.

Remember, there could be arbitrarily many such conversion methods on arbitrarily many types. The type Foo and its conversion method might not even be defined in the assembly currently being compiled or any assembly referenced. Therefore the compiler would have to generate code to interrogate Foo at runtime, do the overload resolution analysis, and then dynamically spit the code to do the call.

Which is exactly what the compiler does in C# 4.0 if the argument to the cast is of type “dynamic” instead of object. The compiler actually generates code which starts a mini version of the compiler up again at runtime, does all that analysis, and spits fresh code. This is not fast, but it is accurate, if that’s what you really need. (And the spit code is then cached so that the next time this call site is hit, it is much faster.)
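A minimal sketch of the dynamic route, assuming C# 4.0 or later and a reference to the C# runtime binder (Microsoft.CSharp, which current .NET SDK projects reference by default). The `Foo` type with its conversion to int and the helper `ViaDynamic` are hypothetical.

```csharp
using System;

// Hypothetical type with a user-defined explicit conversion to int.
public class Foo
{
    public static explicit operator int(Foo f) { return 42; }
}

public static class DynamicCastDemo
{
    // The conversion is chosen at run time from the actual type of o.
    public static int ViaDynamic(object o)
    {
        return (int)(dynamic)o;
    }

    public static void Main()
    {
        Console.WriteLine(ViaDynamic((short)123));  // 123: built-in short-to-int conversion
        Console.WriteLine(ViaDynamic(2.9));         // 2: built-in double-to-int conversion
        Console.WriteLine(ViaDynamic(new Foo()));   // 42: user-defined conversion found at run time

        try { int bad = (int)(object)new Foo(); }   // without dynamic this is just a failed unbox
        catch (InvalidCastException) { Console.WriteLine("plain cast from object fails"); }
    }
}
```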

I don’t think people really expect the compiler to start up again at runtime every time they cast an object to int; I think they just haven’t thought through carefully exactly how much analysis solving the problem would take. Rather a lot, it turns out.

*************

(*) Or Nullable<T>.

(¤) There are others that are not germane to this discussion. For example, a third usage is “Everyone knows that this D is also of base type B; I want the compiler to treat this expression of type D as a B for overload resolution purposes.” That would clearly be an identity-preserving conversion.

(†) There are many ways to classify conversions; we already divide conversions into implicit/explicit, built-in/user-defined, and so on. For the purposes of this discussion we’ll gloss over the details of those other classifications.

(‡) I’m glossing over here that certain conversions that the C# compiler thinks of as representation-changing are actually seen by the CLR verifier as representation-preserving. For example, the conversion from int to uint is seen by the CLR as representation-preserving because the 32 bits of a signed integer can be reinterpreted as an unsigned integer without changing the bits. These cases can be subtle and complex, and often have an impact on covariance-related issues; see next footnote.
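One observable consequence of this CLR view, sketched below with invented names: the C# compiler rejects a direct cast from int[] to uint[], but routing the cast through object defers the check to the CLR, which accepts it and hands back the very same array.

```csharp
using System;

public static class SignedUnsignedDemo
{
    public static bool ArrayCastSucceeds()
    {
        int[] signed = { -1, 2, 3 };

        // uint[] direct = (uint[])signed;         // rejected by the C# compiler
        uint[] unsigned = (uint[])(object)signed;  // accepted by the CLR at run time

        // Same array object, same bits, read as unsigned values.
        return object.ReferenceEquals(signed, unsigned)
            && unsigned[0] == uint.MaxValue;
    }

    public static void Main()
    {
        Console.WriteLine(ArrayCastSucceeds());   // True
    }
}
```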

I’m also ignoring conversions involving generic type parameters which are not known at compile time to be reference or value types. There are special rules for classifying those which would be major digressions to get into.

(£) This is why covariant and contravariant conversions of interface and delegate types require that all varying type arguments be of reference types. To ensure that a variant reference conversion is always identity-preserving, all of the conversions involving type arguments must also be identity-preserving. The easiest way to ensure that all the non-trivial conversions on type arguments are identity-preserving is to restrict them to be reference conversions.
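The restriction is easy to observe, as the following sketch (with hypothetical names) suggests: a covariant conversion over a reference type argument yields the same object, while the corresponding conversion over a value type argument simply does not exist.

```csharp
using System;
using System.Collections.Generic;

public static class VarianceDemo
{
    public static bool ReferenceTypeArgumentPreservesIdentity()
    {
        IEnumerable<string> strings = new List<string> { "a", "b" };
        IEnumerable<object> objects = strings;     // covariant reference conversion
        return object.ReferenceEquals(strings, objects);
    }

    public static bool ValueTypeArgumentIsConvertible()
    {
        object ints = new List<int> { 1, 2 };
        // IEnumerable<object> o = new List<int>();  // would not compile
        return ints is IEnumerable<object>;        // int to object would require boxing
    }

    public static void Main()
    {
        Console.WriteLine(ReferenceTypeArgumentPreservesIdentity());  // True
        Console.WriteLine(ValueTypeArgumentIsConvertible());          // False
    }
}
```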

(§) The rules of C# prohibit all user-defined conversions that could possibly be identity-preserving coercions. More generally, all user-defined conversions that could possibly be any "standard" conversion are illegal.

(₪) Again, I’m ignoring irksome generic issues here. There are situations where humans can prove mathematically that two generic type parameters must be identical at runtime, but the verifier is not smart enough to make those same deductions and requires the compiler to emit type checks.

  • Just what I was wondering about last week. As always - clear, precise and informative post. Thank you!

  • Very nice article. I have been waiting for someone to "deal" with this issue a very long time.

    Additionally, the only difficulty with this is in generic classes; the best way to address this problem is a technique like this:

       using System;
       using System.Linq.Expressions;

       public static class DynamicConverter<TFrom, TTo>
       {
           // Built once per (TFrom, TTo) pair and cached in a static field.
           private static Func<TFrom, TTo> converter = CreateExpression<TFrom, TTo>(body => Expression.Convert(body, typeof(TTo)));

           public static TTo Convert(TFrom valueToConvert)
           {
               return converter(valueToConvert);
           }

           public static Func<TFrom, TTo> Converter
           {
               get { return converter; }
           }

           private static Func<TArg1, TResult> CreateExpression<TArg1, TResult>(
               Func<Expression, UnaryExpression> body)
           {
               ParameterExpression inp = Expression.Parameter(typeof(TArg1), "inp");
               try
               {
                   return Expression.Lambda<Func<TArg1, TResult>>(body(inp), inp).Compile();
               }
               catch (Exception ex)
               {
                   // If no conversion exists, defer the failure until the converter is called.
                   string msg = ex.Message; // avoid capture of ex itself
                   return delegate { throw new InvalidOperationException(msg); };
               }
           }
       }

       public static class DynamicConverter
       {
           public static TTo Convert<TFrom, TTo>(TFrom valueToConvert)
           {
               return DynamicConverter<TFrom, TTo>.Convert(valueToConvert);
           }
       }

    Then you can do something like:

    var converted = DynamicConverter.Convert<sometype, T>(source);

    in your generic class if you know that this conversion from sometype to T exists.

  • You've run out of footnote characters; here are a few more: ‖, ¶. :)

    Lovely reasoning, I've often been annoyed at the unboxing convention so it's nice to see a rationale why it doesn't do it.

  • Great. Now can you please explain to the people I used to work with that unboxing an int by using Convert.ToInt32(obj) is bad for their health?

  • Great post, Eric, as always.  Very informative and thought-provoking.  And I think this one must get the record for the most footnotes ever used in a blog post!   :-)

  • Interesting post. Do you know what the reasoning was behind the unbox instruction throwing InvalidCastException rather than having another exception type specifically for unboxing failures?

  • > Do you know what the reasoning was behind the unbox instruction throwing InvalidCastException rather than having another exception type specifically for unboxing failures?

    Why not? Semantically, unboxing is a downcast ("this is B which I know is a D - give me that D" - "this is Object which I know is an Int32 - give me that Int32"). It's not identity-preserving simply because value types do not have inherent identity, but otherwise I consider it the same thing, so it makes sense to me that the same exception is used to indicate failure in both cases.

    Of course, it is also possible to set the issue of identity preservation aside entirely, and just say that _all_ conversions deal with representations, and Base->Derived cast is also a conversion that translates a value of type "reference to Base" to _another_ value of type "reference to Derived" (the fact that the bit pattern may remain the same as a result is irrelevant). If you look at it that way, identity does not even enter into the question, because the value converted - which is the reference, not the object - does not have any identity. Also, from this POV, it makes more sense to have a single cast/convert operator.

  • I had a need for calling the conversion operators on a boxed value back in .net 2 so I wrote a basic dynamic cast method similar to what holatom is suggesting.  It's here:

    http://codegoeshere.blogspot.com/2007/05/dynamic-cast-in-c.html

    This was a very targeted scenario, though, and I don't think I've needed it since.

  • VB's CType operator caters to those who would rather not think about the different kinds of casting by generating the smallest amount of conversion code possible given the type information known at compile time. For example, CType(1, Integer) will actually not generate a cast at all, since the compiler knows it is unnecessary. CType(myInt, Long) will generate a conversion operation, the same as (long)myInt; while CType(myStream, MemoryStream) will generate a preserving conversion ("castclass" IL instruction), the same as (MemoryStream)myStream. Additionally, CType(myBoxedInt, Long) will generate a call to a VB helper function which will eventually cast the boxed int to IConvertible, and use that interface to convert to an Int64; this essentially performs the same function as (long)(int)myBoxedInt, although there is a performance cost. But, CType will still give you a compile time error if you try a conversion which is not possible under any circumstances (say, CType(myBoxedInt,

    I am not generally in favor of CType; I prefer explicitly stating the kind of conversion that I know is necessary (which is probably why, or indicative of why, I prefer C# over VB). But the existence of CType is interesting to me because it dramatically demonstrates the difference in philosophies between the languages. C# forces you to prove that you know what you are doing (and in the process often forces you to think through what you are doing, and maybe do it better). VB allows you to say, "I don't care how you do it, don't bother me with the details, just get it done." Of course, if it CAN'T be done, you still get an exception, which you may have been able to find at compile time if you were forced to think about it.

    Indeed, this does highlight an important difference in design philosophies. I like to say that VB is a "do what I mean" language, C# is a "do what I say" language. -- Eric

     

  • > VB allows you to say, "I don't care how you do it, don't bother me with the details, just get it done."

    From what Eric says, it seems that (T)(dynamic)x would do just that in C# 4.0.

  • As I understand it, the issue is that the actual type of the object is unknown until runtime, so the compiler can't possibly know how to convert an unknown type to the type specified in the program.

    Would a virtual method defined on object, say T Cast<T>(), solve the problem? That way, the compiler can issue an unbox, emit the virtual call to the unknown type and let it deal with problem if it wants to?

    Sure, if we had generics in .NET v1 that would have worked. Basically you would be putting the onus on the developer of the type to provide conversions to arbitrary types. I like that in theory; I like thinking of a type as "a set of values associated with a set of conversion rules". Such a system encapsulates that concept nicely. But that ship has sailed; we did not have generics in v1 and we're not going to add new virtual methods to object now. -- Eric

  • pminaev said:

    "> VB allows you to say, "I don't care how you do it, don't bother me with the details, just get it done."

    From what Eric says, it seems that (T)(dynamic)x would do just that in C# 4.0."

    Not exactly. VB uses CType as a universal conversion operator, which will generate the most efficient form of conversion possible wherever it is used. You don't have to think about which type of conversion you are doing in different circumstances, because you are using the same operator. In the worst case, where no type information is known (i.e. casting from Object), the least efficient form of conversion is used (which basically tries each kind of conversion in turn). Casting to dynamic in C# 4.0 essentially removes all type information, forcing you into the least efficient conversion, even if a more efficient conversion could have been performed had the type information been preserved (such as in the (int)(long)myBoxedLong case). The dynamic version is slightly better though, since it caches the results of the conversion overload resolution.

  • > Casting to dynamic in C# 4.0 essentially removes all type information, forcing you into the least efficient conversion, even if a more efficient conversion could have been performed had the type information been preserved

    I don't see any reason why the compiler wouldn't be able to optimize the case of casting a value of type definitely known at compile-time to dynamic, as in my example - at least in theory. I doubt that the actual implementation in .NET 4 will do that - it seems to require too much effort for very dubious value - but it is certainly possible to optimize it in precisely the same way as CType does (since, after all, it has all the same inputs!).

    Indeed. Our philosophy for "dynamic" is "you said dynamic, so you meant dynamic, so you'll get dynamic." We have certainly considered doing compiler work to detect situations where we know at compile time that a dynamic call cannot possibly succeed, or detect situations where the compiler can deduce enough type information about the dynamic thing to skip making a dynamic call. As you correctly call out, this is an immense amount of work for bizarre corner cases that directly contradict the stated intention of the programmer -- if the programmer says they want dynamic dispatch that might fail at runtime, that's what they'll get. -- Eric

     

  • > As you correctly call out, this is an immense amount of work for bizarre corner cases that directly contradict the stated intention of the programmer -- if the programmer says they want dynamic dispatch that might fail at runtime, that's what they'll get.

    Will that be a hard requirement in the language spec, however? I.e. will the conforming compiler be required to defer the error until execution even if it can see that it is going to fail at compile time already? Or is it implementation-defined?

    I think it doesn't touch the original case we discussed either way, though, as it wasn't about an error case - it was merely about optimization for the successful case. Something like (int)(dynamic)1.0 cannot fail, and if the compiler is smart enough to handle that at compile-time, the "as if" rule should kick in.
