Cast operators do not obey the distributive law

Another interesting question from StackOverflow. Consider the following unfortunate situation:

object result;
bool isDecimal = GetAmount(out result);
decimal amount = (decimal)(isDecimal ? result : 0);

The developer who wrote this code was quite surprised to discover that it compiles and then throws an “invalid cast exception” at runtime if the alternative branch is taken.

Anyone see why?

In regular algebra, multiplication is “distributive” over addition. That is, q * (r + s) is the same as q * r + q * s. The developer here was probably expecting that casting was distributive over the conditional operator. It is not. The original statement is not the same as

decimal amount = isDecimal ? (decimal)result : (decimal)0;

which is in fact the correct code here. Or, better still:

decimal amount = isDecimal ? (decimal)result : 0.0m;

The problem faced by the compiler is that the type of the conditional expression must be consistent for both branches; the language rules do not allow you to return object on one branch and int on the other.

We choose the best type based on the types we have in the expression itself, not on the basis of types that are outside the expression, like the cast. Therefore the choices are object and int. Every int is convertible to object but not every object is convertible to int, so the compiler chooses object. Therefore this is the same as

decimal amount = (decimal)(isDecimal ? result : (object)0);

And therefore the zero returned is a boxed int. The cast then unboxes the boxed int to decimal. As we’ve already discussed at length, it is illegal to unbox a boxed int to decimal. That throws an invalid cast exception, and there you go.
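
To see the whole thing end to end, here is a minimal sketch. The post never shows the body of GetAmount, so the version below is a hypothetical stand-in that takes an extra flag to choose the branch; the class and method names are likewise made up for illustration.

```csharp
using System;

static class CastDemo
{
    // Hypothetical stand-in for the post's GetAmount: returns true and a
    // boxed decimal when 'succeed' is set, otherwise false and null.
    static bool GetAmount(bool succeed, out object result)
    {
        result = succeed ? (object)12.5m : null;
        return succeed;
    }

    // The original pattern. The conditional expression has type object,
    // so on the alternative branch the 0 is boxed as an int, and the
    // outer cast then tries to unbox that int as a decimal.
    public static decimal Broken(bool succeed)
    {
        object result;
        bool isDecimal = GetAmount(succeed, out result);
        return (decimal)(isDecimal ? result : 0);
    }

    // The corrected pattern. Both branches are already decimal, so the
    // only unboxing happens on a value that really is a boxed decimal.
    public static decimal Fixed(bool succeed)
    {
        object result;
        bool isDecimal = GetAmount(succeed, out result);
        return isDecimal ? (decimal)result : 0.0m;
    }

    static void Main()
    {
        Console.WriteLine(Fixed(true));
        Console.WriteLine(Fixed(false));
        Console.WriteLine(Broken(true));   // fine: unboxes a boxed decimal
        try
        {
            Broken(false);                 // unboxes a boxed int as decimal
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("InvalidCastException on the alternative branch");
        }
    }
}
```

Note that Broken works whenever the first branch is taken, which is why this kind of bug can go unnoticed until the fallback value is actually needed.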

  • I would prefer the compiler to raise a compilation error instead of inserting a type conversion for me in the conditional operator. You may say that asking devs to put explicit type conversions in all assignments would be overkill. But at least in the case of the conditional operator, it should have asked me to make my intentions clear.

  • Thanks, dude. You just made me a little smarter (which suffices for your good deed for the day). I have noticed this problem too but never understood it until now.

  • This is something that has always bothered me about C#'s casting operators. Boxing is intentionally largely transparent in C#. When you want to cast an object to a value type, however, the boxing is suddenly completely opaque. Nowhere else do we have to worry about whether a cast is a simple type-cast or conversion. I understand why it works this way, but it certainly isn't obvious until it's gotten you a couple of times.

  • VB can handle this, however:

           Dim x As Object = 12.3

           Console.WriteLine(CType(x, Integer)) ' okay - 12

    but, of course, you pay the runtime penalty for all the extra type checks it has to do to make this work.

    Still, I like the overall approach better, especially the named cast operators with distinct semantics that are reflected in the names. So you wouldn't expect DirectCast to do the above in VB, because, well, it's not "direct" - you have to go from Object to Double, and then from Double to Integer.

    Better yet would be to have altogether separate syntaxes for casting references up/down/across the inheritance hierarchy, for unboxing, and for conversions. They are, after all, different operations with noticeable semantical differences - up/down/cross-casting is identity-preserving while unboxing is not, for example; and data conversions can simply lose some of the data (even widening ones - think int->float). F# fares relatively well there:

    upcast: (x :> T)

    downcast: (x :?> T)

    cross-cast: (x :> obj :?> T), i.e. upcast followed by downcast - no special syntax

    boxing: (box x) - is not implicit

    unboxing: (unbox<T> x), though T is normally inferred from context, then it's just (unbox x)

    conversions: (int x), (float x), (string x) etc

  • I am shaken. I can pretty much guarantee over the past 10 years or so, the longest I've gone without using C# was about 48 hours and yes that includes my wedding in the Bahamas. Yet I have never noticed or have been impacted by the fact that a boxed T can only be cast to T. It's kinda like finding out you were adopted.

  • I would point out that algebra is algebra. There is no 'regular' algebra.

    I would point out that there are infinitely many algebras. An "algebra" is by the pure mathematician's definition simply the combination of a field with a multiplication operator closed over the field such that the operator has certain attractive properties, such as distributivity. So, yes, I did not need to say "regular algebra" here, since by definition a multiplication operator in any algebra is distributive.

    A computer scientist would define "algebra" very differently than a pure mathematician. To a computer scientist, an algebraic system is any system that affords certain symbolic manipulations. There does not have to be a vector space, or a multiplication operator that distributes over addition of vectors. As a computer scientist I think of the type system of C# as forming an algebra because it is a bunch of stuff I can manipulate symbolically. Let's call such algebras "symbolic algebras", and the pure mathematician's algebras "vector algebras".

    What I am calling out here is that our shared understanding of a particular vector algebra (namely elementary algebra over the vector space of real numbers) leads us all to have intuitions about the symbolic algebra of the C# type system, intuitions which are not accurate. Something that looks textually like a multiplication in a vector algebra does not actually have the distributive property in a symbolic algebra. And thus the title of the blog post: cast operators do not obey the distributive law.

    I decided that it would be a major digression to explain the difference between two kinds of algebras (and I note that these are just two out of many possible definitions of "algebra" used by academics), so I didn't bother to put all this verbiage in the original text. But regardless, there are infinitely many vector algebras, and there are many different definitions of the word "algebra". Perhaps "regular algebra" was not the best choice of words, but I felt that "elementary algebra over the field of real numbers" was a bit excessively verbose for what ought to be a pretty straightforward concept.

    - Eric

  • Now that dynamic exists, wouldn't converting to dynamic be a better choice?

    (I realize that it may be too late, i.e. the difference between

     var x =  a ? b : 0; )

  • Wait...whatever happened to the concept of a simple if/then statement? Sometimes a compact notation actually degrades legibility.

    Better yet, why not fix the "GetAmount" method so that it actually returns a decimal value (result or zero)?

  • I think it is more accurate to say that compile-time resolution of cast syntax into the contextually appropriate cast operator does not obey the distributive law. When it is really the same cast operator applied to both values, the distributive law works.

  • Okay, so whose bright idea was this?

       const int x = 1, y = 2, z = 4, answer = x + y + z;

       Console.WriteLine("the answer is " + answer);

       Console.WriteLine(x + y + z + " is the answer");

       Console.WriteLine("the answer is " + x + y + z);

       Console.WriteLine("the answer is " + (x + y) + z);

       Console.WriteLine("the answer is " + x + (y + z));

       Console.WriteLine("the answer is " + (x + y + z));

       Console.ReadKey();

    Output:

       the answer is 7

       7 is the answer

       the answer is 124

       the answer is 34

       the answer is 16

       the answer is 7

    Associativity of '+' in C# - mycodehere.blogspot.com/.../associativity-of-in-c.html

  • Evaluation order in C# is left to right. blogs.msdn.com/.../4374222.aspx

  • @Joren: John's complaint has nothing to do with evaluation order; it's about operator precedence and associativity.

  • @Josh: In fact, you *can* cast from a boxed T to other types - sometimes.

    In particular, you can cast from a boxed integral type to an enum which has that type as its underlying type - and vice versa.

    I can't remember whether that's guaranteed by the language spec or not - I seem to remember that at one point both the CLI spec and the language spec made some attempt to talk about it, but both ended up being slightly nonsensical in different ways.
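
    A small sketch of the enum case described in this comment. The Color enum and the method names are made up for illustration. The behavior shown is what Microsoft's CLR does in practice; as the comment notes, it is less clear how firmly the specifications guarantee it.

```csharp
using System;

enum Color { Red = 0, Green = 1 }   // underlying type is int by default

static class EnumUnboxDemo
{
    // Unbox a boxed int directly as an enum whose underlying type is int.
    public static Color UnboxAsEnum(object boxedInt)
    {
        return (Color)boxedInt;
    }

    // Unbox a boxed enum directly as its underlying integral type.
    public static int UnboxAsInt(object boxedEnum)
    {
        return (int)boxedEnum;
    }

    // Unbox as long. This fails for a boxed int, because unboxing
    // requires the exact type (apart from the enum case above).
    public static long UnboxAsLong(object boxed)
    {
        return (long)boxed;
    }

    static void Main()
    {
        Console.WriteLine(UnboxAsEnum(1));            // Green
        Console.WriteLine(UnboxAsInt(Color.Green));   // 1
        try
        {
            UnboxAsLong(1);
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("boxed int cannot be unboxed as long");
        }
    }
}
```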

  • I agree with Shiva. The compiler should either do a better job of choosing the type for the “0”, or refuse to compile the code.

    If the compiler only evaluates the expression “isDecimal ? result : 0”, it is true that the best type for the “0” is “object”. However, if the compiler evaluates the whole expression “decimal amount = (decimal)(isDecimal ? result : 0)”, the best type to choose is “decimal”.

    The runtime exception is generated by the compiler, not the person who wrote the code.

    OK, what about this?

    void M(decimal x) { }
    void M(int x) { }
    void M(object x) { }
    ...
    M(isDecimal ? result : 0);

    Describe the exact semantics you would like to see here. Now we have three conversions, one to decimal, one to object, and one to int. Which one is correct? How should the compiler decide?

    Once you've solved that one, try this one:

    void N(Func<decimal> f) { }
    void N(Func<object> f) { }
    void N(Func<int> f) { }
    ...
    N(() => isDecimal ? result : 0);

    Is that a function to int, to decimal, or to object?

    Reasoning from outside to inside is very difficult; we try to ensure that expressions in C# can be analyzed independently of their contexts, because the semantic analysis of the context could be the thing we're trying to figure out. We don't want to add more chicken-and-egg problems to C#.

    - Eric
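
    For reference, under the rules described in the post the first puzzle already has a definite answer: the conditional expression has type object wherever it appears, so overload resolution picks M(object) on both branches. A sketch, with the string return values and the Pick wrapper added purely for illustration:

```csharp
using System;

static class OverloadDemo
{
    // Same three overloads as in the reply above, but returning the name
    // of the overload chosen so the result can be observed.
    public static string M(decimal x) { return "decimal"; }
    public static string M(int x) { return "int"; }
    public static string M(object x) { return "object"; }

    // The argument's compile-time type is object, determined from the two
    // operands alone, so the call binds to M(object) regardless of which
    // branch runs.
    public static string Pick(bool isDecimal, object result)
    {
        return M(isDecimal ? result : 0);
    }

    static void Main()
    {
        Console.WriteLine(Pick(true, 12.5m));   // object
        Console.WriteLine(Pick(false, null));   // object
    }
}
```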
