When is a cast not a cast?

I'm asked a lot of questions about conversion logic in C#, which is not that surprising. Conversions are common, and the rules are pretty complicated. Here's some code I was asked about recently; I've stripped it down to its essence for clarity:

class C<T> {}
class D
{
  public static C<U> M<U>(C<bool> c)
  {
    return something;
  }
}
public static class X
{
  public static V Cast<V>(object obj) { return (V)obj; }
}

where there are three possible texts for "something":

Version 1: (C<U>)c
Version 2: X.Cast<C<U>>(c)
Version 3: (C<U>)(object)c

Version 1 fails at compile time. Versions 2 and 3 succeed at compile time, and then fail at runtime if U is not bool.

Question: Why does the first version fail at compile time?

Because the compiler knows that the only way this conversion could possibly succeed is if U is bool, but U can be anything! The compiler assumes that most of the time M<U> is not going to be constructed with bool for U, and therefore this code is almost certainly an error, and the compiler is bringing that fact to your attention.

Question: Then why does the second version succeed at compile time?

Because the compiler has no idea that a method named X.Cast<V> is going to perform a cast to V! All the compiler sees is a call to a method that takes an object, and you've given it an object, so the compiler's work is done. The method is a "black box" from the caller's perspective; the compiler does not look inside that box to see whether the mechanisms in that box are likely to fail given the input. This "cast" is not really a cast from the compiler's perspective, it's a method call.

Question: So what about the third version? Why does it not fail like the first version?

This one is actually the same thing as the second version; all we've done is inline the call to X.Cast<V>, including the intermediate conversion to object! That conversion is relevant.

Question: In both the second and third cases, the conversion succeeds at compile time because there is a conversion to object in the middle?

That's right. The rule is: if there is a conversion from a type S to object, then there is an explicit conversion from object to S. (*)

By making a conversion to object before doing the "offensive" conversion, you are basically telling the compiler "please throw away the compile-time information you have about the type of the thing I am converting". In the third version we do so explicitly; in the second version we do so sneakily, by making an implicit conversion to object when the argument is converted to the parameter type.
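To see the runtime behaviour of versions 2 and 3 side by side, here is a minimal sketch. The names M2, M3, ConversionDemo and Succeeds are hypothetical, introduced only for this illustration; C<T> and X.Cast<V> are as in the code above.

```csharp
using System;

public class C<T> {}

public static class X
{
    public static V Cast<V>(object obj) { return (V)obj; }
}

public static class D
{
    // Version 1, (C<U>)c, is rejected at compile time, so it cannot appear here.

    // Version 2: c is implicitly converted to object when it is passed to Cast.
    public static C<U> M2<U>(C<bool> c) { return X.Cast<C<U>>(c); }

    // Version 3: the same conversion to object, written out explicitly.
    public static C<U> M3<U>(C<bool> c) { return (C<U>)(object)c; }
}

public static class ConversionDemo
{
    // Hypothetical helper: true if the conversion completes,
    // false if it throws InvalidCastException.
    public static bool Succeeds(Func<object> convert)
    {
        try { convert(); return true; }
        catch (InvalidCastException) { return false; }
    }
}
```

Calling ConversionDemo.Succeeds(() => D.M2<bool>(new C<bool>())) should give true, and the same call with M2<int> should give false; M3 behaves the same way, because both versions end up executing the same cast from object.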

Question: So this explains why compile-time type checking doesn't seem to work quite right on LINQ expressions?

Yes! You would think that the compiler would disallow nonsense like:

from bool b in new int[] { 123, 345 } select b.ToString();

because obviously there is no conversion from int to bool, so how can range variable b take on the values in the array? Nevertheless, this succeeds because the compiler translates this to

(new int[] { 123, 345 }).Cast<bool>().Select(b=>b.ToString())

and the compiler has no idea that passing a sequence of integers to the extension method Cast<bool> is going to fail at runtime. That method is a black box. You and I know that it is going to perform a cast, and that the cast is going to fail, but the compiler does not know that.
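A small sketch of that failure, assuming the standard System.Linq implementation of Cast<T>. The class LinqCastDemo and its Run method are hypothetical names used only here. Because the query is evaluated lazily, the exception should surface only when the query is enumerated, not when it is built.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LinqCastDemo
{
    // Returns the name of the exception thrown during enumeration,
    // or "none" if enumeration completes.
    public static string Run()
    {
        // This compiles; it is translated into Cast<bool>().Select(...).
        IEnumerable<string> query =
            from bool b in new int[] { 123, 345 } select b.ToString();

        try
        {
            // Nothing has been cast yet; the casts run during enumeration.
            foreach (string s in query) { }
            return "none";
        }
        catch (InvalidCastException ex)
        {
            return ex.GetType().Name;
        }
    }
}
```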

And maybe we do not actually know it either; perhaps we are using some library other than the default LINQ-to-objects query provider that does know how to make conversions between types that the C# language would not normally allow. This is actually an extensibility feature masquerading as a compiler deficiency: it's not a bug, it's a feature!
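As a rough illustration of that extensibility, the sketch below defines a made-up source type whose own Cast<T> method knows how to turn an int into a bool. IntSource and CustomCastDemo are hypothetical and not part of any real library; the nonzero-means-true rule is an assumption chosen purely for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical source type that supplies its own Cast<T>.
public sealed class IntSource
{
    private readonly int[] values;
    public IntSource(params int[] values) { this.values = values; }

    // The query translation binds to this instance method
    // rather than to Enumerable.Cast<T>.
    public IEnumerable<T> Cast<T>()
    {
        foreach (int v in values)
        {
            // Assumed rule for this sketch: nonzero means true.
            object converted = typeof(T) == typeof(bool) ? (object)(v != 0) : v;
            yield return (T)converted;
        }
    }
}

public static class CustomCastDemo
{
    public static List<string> Run()
    {
        // Translated into new IntSource(123, 0).Cast<bool>().Select(...).
        var query = from bool b in new IntSource(123, 0) select b.ToString();
        return query.ToList();
    }
}
```

Here the same query shape that fails against an int array runs to completion, because the conversion is whatever the Cast<T> method in scope says it is.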


(*) You'll note that I did not say "there is an explicit conversion from object to every type", because there isn't. Can you think of a type S that cannot be converted to object?

  • @Diego F.

    >> You should not declare the type when using query comprehension unless you really need it.

    Seems to me it should be the other way. That's one of the many reasons I use fluent and don't use var. I even declare the enumerable type, so that I can keep checking whether I'm losing types - for example if I expect an IQueryable and for some reason (say, funcs instead of expressions) I get just an IEnumerable it gives me an error instead of a subtle performance impact I may never notice until production.

    Also, it might be me but

    IEnumerable<int> output = input.Where(i => i < 5); // I know exactly what it is doing, what delegate is called when and how.

    IEnumerable<int> output = from i in input where i < 5 select i; // I have to guess, compiling code in my head to the one above.

  • "Void cannot be converted to object.": Show me even just a single value of type void that you cannot convert to object :)

  • mmx is onto something. :)

    I've never been a fan of the SQL-like syntax in C#. It seems like a lot of effort with very little real value. Perhaps there is some perceived value (people think it is cool until they use it) and some marketing value, but I think the language and the users would have been served better by focusing on the underlying extension-method-based syntax instead of confusing the issue (and unnecessarily lengthening the spec and the compiler codebase) with two ways to do something.

    var output = input.Where(x => x < 5); // I know exactly what is happening here.

  • Eric's post from a few weeks ago "Foolish consistency is foolish" talks about the Cast<> that's done if you include the type in the SQL-style syntax.  I wasn't aware of that behaviour until I read that article mainly because I just never inserted the type and hadn't thought about it.  I imagine that when Eric said in that article "Discussing why that is might be better left for another day. " he had today's article in mind (I was hoping there would be a follow-up).

    I do agree with those who say the fluent-syntax is better.  Even though I spend half my day in SQL Server writing some pretty complicated queries I still prefer fluent-syntax in C#.  I think for me it's because the SQL-like syntax is just different enough from normal SQL that it's harder for me to write, being so familiar with SQL, so I prefer using .SelectMany(), etc.  Also I've been using a lot of Rx and a few of the common things I've had to do need the fluent syntax anyway.

  • h.v.dijk: the question in the footnote was "Can you think of a type S that cannot be converted to object?". Note that it is about types, not about objects! I think my answer is actually the most obvious one :-)

  • (sorry for the double-post; and it seems my first attempt got eaten).

    About "Show me even just a single value of type void that you cannot convert to object", the answer is, of course, "all of them!" :-)

    given a function declared as "void foo()", the compiler does complain about "object o = foo();" with  "error CS0029:  Cannot implicitly convert type 'void' to 'object'", and if you insert a cast, that changes to "error CS0030: Cannot convert type 'void' to 'object'". In a way, I have a "value" of type void here, by way of the return value from foo().

    What seems somewhat inconsistent is that given a function such as "int bar()", I can write "bar();", which implicitly converts an int to void, but not "(void) bar();". The latter results in "error CS1547: Keyword 'void' cannot be used in this context". The former doesn't even get a warning, while the latter would be a sort of annotation like "hey, I am ignoring the return value here, but I know what I'm doing".

  • @Rhialto: Discarding the return value of a method doesn't imply a cast to 'void'. In your example, the line 'bar();' is still of type int, but you're just throwing that int away.

  • @Rhialto: object o = foo(); That's a good point, actually. An expression of type void whose value is used can only appear in an erroneous C# program, but if the question is about types that cannot be converted, then showing the error message for an attempted conversion is a valid approach. As for the question, I was also thinking about static classes (with basically the same idea as you), but surprisingly, they can be converted. C# disallows their use as generic type arguments, but CIL doesn't, so

    static void Test<T>()
    {
        T x = default(T);
        object o = x;
    }

    is callable, for example using reflection, and works at runtime without any exception, when T is a static class.

  • I'm sorry, this is just a very bad design decision for two reasons:

    (1) Given that c could possibly be a C<U>, you have an explicit down cast that could fail at run-time anyway, regardless of the probability. This is like saying that there is one kind of explicit down cast that is more likely to fail than another, and this is ridiculous.

    (2) The "feature" adds obscure complexity to the C# language, developers are never really taught to write (C<U>)(object)c, that this would somehow make more sense than (C<U>)c. To new C# developers, if the compiler says "no you can't even do an explicit downcast here", then they think "wow, the compiler is smart, it knows that this will always fail at run-time so is rejecting the code statically." But wait...there is a use case where the down cast could actually work at run time, and so the compiler is actually stupid, and maybe dishonest, and anyways we can hack around it by up casting to object first! This is really on par with mutable foreach variables.

    As a PL guy who uses a lot of type reflection (typeof(T) == typeof(S) stuff), I learned about the (object) hack rather quickly and must use it rather often. I scorn it every time I use it, but at least I know what's going on. And at least this is a common enough problem to be embedded as community knowledge even if MSDN is coy about it.

  • Please disregard my previous comments. Stated in anger and not necessarily aimed at Eric or anyone else in particular at MS. My sincerest apologies for acting like an ass.

  • Don't know if void can be converted to object, but I wish void had a unique instance, more like Unit in Caml or () in Haskell.

    I am furious every time I have to write a wrapper-style function twice in C#, to support both void and a generic return type.

    If I could just write Foo<void>(); to make Foo return void...

    The unit type is a better design, but I guess it was avoided in C# so that C/C++/Java devs would not feel lost.
