Chaining simple assignments is not so simple

UPDATE: I interrupt this episode of FAIC with a request from my friend and colleague Lucian, from the VB team, who wonders whether it is common in C# to take advantage of the fact that assignment expressions are expressions. The most common usage of this pattern is the subject of this blog entry: the fact that "chained" assignment works at all is a consequence of the fact that assignments are expressions, not statements. There are other uses too; one could imagine something like "return this.myField = x;" as a short cut for "this.myField = x; return this.myField;" -- perhaps we are performing some computation and then recording the results for use later. Or perhaps we've got something like myNonNullableString = (myNullableString = Foo()) ?? "<null>"; -- there are any number of ways this idiom could be used.
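To make the idiom concrete, here is a minimal sketch of three ways an assignment can be used as an expression. The first two mirror the snippets quoted above. The third, a read loop over a TextReader, is a commonly seen pattern added here for illustration. The class name, the helper methods, and the stand-in Foo() (which simply returns null) are all hypothetical.

```csharp
using System;
using System.IO;

class AssignmentAsExpression
{
    private int cached;

    public int Cached { get { return cached; } }

    // Records the computed result in a field and returns it in one expression.
    public int ComputeAndRemember(int x)
    {
        return this.cached = x * 2;
    }

    // Hypothetical stand-in for the Foo() mentioned in the text; always returns null.
    static string Foo() { return null; }

    // The assignment yields the assigned value, which ?? then inspects.
    public static string NullFallback()
    {
        string myNullableString;
        string myNonNullableString = (myNullableString = Foo()) ?? "<null>";
        return myNonNullableString;
    }

    // Assign and test in the loop condition; ReadLine returns null at end of input.
    public static int CountLines(TextReader reader)
    {
        int count = 0;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            count++;
        }
        return count;
    }

    static void Main()
    {
        var a = new AssignmentAsExpression();
        Console.WriteLine(a.ComputeAndRemember(21));                        // 42
        Console.WriteLine(a.Cached);                                        // 42
        Console.WriteLine(NullFallback());                                  // <null>
        Console.WriteLine(CountLines(new StringReader("one\ntwo\nthree"))); // 3
    }
}
```

Each of these could equally be written with the assignment in a statement of its own; the sketch only shows what the expression form looks like.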

I do not use this idiom myself; I'm of the opinion that side effects such as assignments are best represented by putting each in a statement of its own, rather than as something embedded in a larger expression. My question for you is: do you use assignments as expressions? If so, how and why? Note that I am looking for mundane, "real world" examples of this pattern, not clever ideas about how this could in theory be used. If you've got one, please leave it in the comments and I'll pass it along to Lucian. Thanks!

*********************

Today I examine another myth about C#. Consider the following code:

a = b = c;

This is legal; you can make arbitrarily long chains of simple assignments. This pattern is most often seen in something like

int i, j, k;
i = j = k = 123;

I often hear that this works “because assignment is right-associative and results in the value of the right-hand side”.

Well, that’s half true. It is right-associative; obviously this has to be equivalent to

i = (j = (k = 123));

It doesn’t make any sense to parenthesize it from the left. Now, in this particular example, the statement is true, but in general it is not. The result of the simple assignment operator is not the value of the right hand side:

const int x = 10;
short y;
object z;
z = y = x;
System.Console.WriteLine(z.GetType().ToString());

This prints “System.Int16”, not “System.Int32”. The value of the right-hand side of “y = x” is clearly an int, but we do not assign a reference to a boxed int to z, we assign a reference to a boxed short!

So then is the correct statement “… results in the value of the left-hand side”?

Nope, that’s not right either, and we can prove it.

class C
{
  private string x;
  public string X {
    get { return x ?? ""; }
    set { x = value; } }
  static void Main()
  {
    C c = new C();
    object z;
    z = c.X = null;
    System.Console.WriteLine(z == null);
    System.Console.WriteLine(c.X == null);
  }
}

This prints “True / False” – the result of the assignment operator is not the value of the left-hand-side. The value of the left hand side is the empty string but the value of the operator is null.

Heck, the left hand side need not even have a value. Write-only properties are weird and rare, but legal; if there were no getter then the left hand side c.X would not have a value!

The correct statement should now be pretty easy to deduce: the result of the simple assignment operator is the value that was assigned to the left-hand side.
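The write-only case mentioned above can be sketched as follows. This is an illustration under stated assumptions: the Sink class, its W property, and the Peek helper are hypothetical names that do not appear in the original example. The point is that the chain compiles and produces a value even though the property in the middle cannot be read.

```csharp
using System;

class Sink
{
    private string stored;

    // Write-only property: there is no getter, so s.W has no readable value.
    public string W { set { stored = value; } }

    // Hypothetical helper so the example can show what the setter received.
    public string Peek() { return stored; }
}

class Program
{
    public static object ChainThroughWriteOnly(Sink s)
    {
        object z;
        // The result of "s.W = ..." is the value that was assigned,
        // so no getter is needed to continue the chain.
        z = s.W = "hello";
        return z;
    }

    static void Main()
    {
        Sink s = new Sink();
        object z = ChainThroughWriteOnly(s);
        Console.WriteLine(z);        // hello
        Console.WriteLine(s.Peek()); // hello
    }
}
```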

 

  • Interesting!

    const int x = 10;

    short y;

    object z;

    z = y = x;

    System.Console.WriteLine(z.GetType().ToString());

    Should work then

  • Maybe there's some compiler-internal aspect here that is more interesting, but I take issue with the statement that "The value of the right-hand side of “y = x” is clearly an int".

    In particular, I expect the value of the right-hand side of an assignment expression to be exactly the thing that is stored into the left-hand side.  Thus, the right-hand side is _not_ "clearly an int".  It seems clear that, given the destination of the assignment operation, that the type of the right-hand side must be a short.

    True, there is implicit type conversion going on in order to allow that.  Likewise this example:

     string str = "Hello";

     object obj;

     obj = str;

    The right-hand side of the assignment must be System.Object, because that's the type of the destination.  But of course, we can implicitly convert System.String to System.Object without extra code.

    As I said, perhaps to the compiler, the right-hand side really is some other type, and there's something about the internals of the compilation process where it's important to state unequivocally that the type isn't necessarily that of the destination.  But from the point of view of the programmer without intimate knowledge of the inner workings of the compiler, it's not clear at all to me that we must consider the type of the right-hand side to be other than that of the destination type.

  • Pete, I have to disagree with your post. Consider (All "T"'s are types...)

    T1 a = value;

    T3 c = a;

    This can be equivalent to

    T1 a = value;

    T3 F1(T1 arg) {...} // Converts T1 to T3

    T3 c = F1(a);

    Or Even

    T1 a = value;

    T2 F1(T1 arg) {...} // Converts T1 to T2

    T3 F2(T2 arg) {...} // Converts T2 to T3

    T2 b = F1(a);

    T3 c = F2(b);

    In the above sample, it is clear that the contents of "a" are of type T1, and that the contents of "c" are of type T3.

    By logical extension, the RHS of the original statement is also of type T1 and only of type T1. Remember we are looking at the Right Side ONLY, ignoring any conversions (as are explicitly shown in the other samples) and NOT looking at the assignment statement.

    As a result, there is a world of difference between "the type of the RHS" and the "type that is assigned to the LHS" (which is NOT necessarily the declared type of the LHS!!!!)

    Hopefully this clears things up...

  • Interesting, of course this leads to the following unexpected behavior:

    class Foo

    {

      public int Bar { get { return 20; } set { } }

    }

    Foo f = new Foo();

    int baz = f.Bar = 10;

    Console.WriteLine(baz); // Prints 10;

  • @Robert Davis

    I disagree. What would be completely unexpected is that baz equals 20. If I have a chained assignment the last thing I'd expect to see is that the result of said expression is a value that doesn't even show up.

    The unexpected behaviour there is simply caused by a wrong property setter. As a matter of fact, shouldn't the compiler flag a property with an empty setter, at least as a warning?

  • @Grico

    I disagree, properties are a different animal than fields/local variables.

    Essentially Eric's post shows that

    int baz = f.Bar = 10;

    has an entirely different effect than

    f.Bar = 10;

    int baz = f.Bar;

    which I think most would find unexpected.

    Also, nothing wrong with an empty setter, especially if you're implementing an interface or sub-classing.

  • I don't see how empty setters could help with subclassing or implementing an interface. The way I see it, if your setter doesn't do anything then don't implement one. If you need to because you are implementing an interface but the best choice is to leave it empty, then something is wrong with your design. As a last resort I'd throw a not implemented / not supported exception. A setter that doesn't do anything is misleading and will only lead to unexpected behaviour as in your example.

    And yes of course that

    int baz = f.Bar = 10;

    has an entirely different effect than

    f.Bar = 10;

    int baz = f.Bar;

    one is a chained assignment, the other one isn't. The point is that intuitively, when I see a chained assignment, I expect everything to end up with the same value. That's the whole point of a chained assignment, and C#'s implementation does a great job of trying to do exactly that. By passing on the value assigned to the left-hand side, intermediate property getters are always skipped. This is great because you can never ensure that the getter will return the same value that was passed to the setter, which would defeat the purpose of a chained assignment.

    My gripe with empty setters doesn't really apply here at all now that I think about it, as the unexpected behaviour would apply any time we have a property whose getter returns a value different from the one passed to its setter.

  • Ouch. I guess I'm glad it works this way. How batty would you go trying to figure this out otherwise...

    z = c.X = null;

    //bunches of code

    if (z == null)

    {

    //why do I never get here?

    }

  • Sorry David…you claim to disagree, but I don't see how your example demonstrates that.  If anything, you are simply proving my point, by pointing out that an assignment is _equivalent_ to converting to the necessary type and then copying the resulting value.

    The fact that when assigning "a" to "c" that conversion isn't explicitly in code doesn't mean it's not there.  The compiler does implicit conversions all the time in all sorts of situations.

    In fact, Eric's closing statement – "the result of the simple assignment operator is the value that was assigned to the left-hand side" – is IMHO simply a reiteration of my point.  That is, the value of the right-hand side of the assignment operator is in fact the thing that is assigned to the left-hand side, _after_ any necessary conversion.

    I realize it's a semantic argument, at least absent any specific compiler implementation details.  Hence my equivocation on that point in my first reply.  But it's my opinion that stating from the outset that "clearly" the right-hand side is _not_ the same type as the destination of the assignment is just a straw man, in that it assumes something that I don't take to be true.  That assertion is not at all obviously clear to me; it's entirely dependent on how you interpret the assignment operation.

    I happen to interpret the assignment operation as a simple copy from one place to another, with any type conversion required taking place before the actual operation.  In this respect, it's as if the assignment operator worked like every other overloadable operator, where the assignment operator itself doesn't do any work until the compiler has resolved all the necessary type conversions.  For example, consider a trivial (hypothetical) overload, that would be equivalent to the default implementation of the operator:

     class Foo

     {

       public static operator=(out Foo foo1, Foo foo2)

       {

         foo1 = foo2;

       }

     }

     class Bar

     {

       public explicit operator Foo(Bar bar)

       {

         return new Foo();

       }

     }

     Bar bar = new Bar();

     Foo foo;

     foo = bar;

    This is equivalent to:

     operator=(out foo, bar);

    Which is equivalent to:

     operator=(out foo, explicit operator Foo(bar));

    In other words, by the time the assignment operator = gets the operands, they've already been resolved/converted as needed.

    Of course, the assignment operator isn't overloadable, so the above is purely hypothetical.  But IMHO it's certainly one valid way to interpret the assignment operator, and doing so makes clear that the type of the value on the right-hand-side of the operator must already be the correct type for the destination (the left-hand-side).

    I understand the point you're trying to make and I think it is plausible, but ultimately not productive to consider "the value of the right hand side" as being the value after the conversion to the type of the left hand side.

    We would like to write a specification that states what the value of any expression is. (For those expressions that have a value. The expression "Console.WriteLine(123);" has no value.) It is awkward in the extreme to have to work out what the value of an expression is if its type depends on stuff outside of the expression. (We are in this situation with lambda expressions, which have neither values nor types, and are thereby tricky to specify.)

    In particular, it is extremely awkward to write the "conversion" section of the specification. For example, we say that an implicit boxing conversion exists between an expression whose value is of a value type and the type object. You say "object x = 2 + 2;" Analyze it.

    My way: the value of 2 + 2 is the integer 4, a value type, and therefore this is a boxing conversion to object.

    Your way: the type of the value of the right hand side of an assignment expression is of the type of the left-hand side. Therefore the value of the right hand side is of type object, a reference type, and therefore this is not a boxing conversion.

    That's crazy. We need to be able to talk about the type of an expression *before* any conversion happens because that's how we're going to work out which conversion, if any, is valid. If we go with your way, and say that the type of the right hand side of an assignment is always of the type of the left hand side, then how do we word the specification clearly to figure out what conversion it is?

    Also, consider overload resolution. When you say void M(double x) {} and have a call M(123), essentially what you are doing is assigning 123 to formal parameter variable x. Now suppose we have both M(double x) and M(short x). Are you telling me that the type of 123 is both double and short in M(123) because it could be assigned to double or it could be assigned to short? Again, that's crazy. We need expressions that have a value and type to have a specific value and type independent of their context, because *the correct context is what we're trying to work out*.

    -- Eric

     

  • I have always been so curious of this type of assignment and have sparingly used it because I didn't want to use it without understanding the assignment.

    Thanks for the clarification, Eric!

  • I think I understand how the "value" side of a chained assignment works. What I'm still trying to decipher is how the types are passed and whether they abide by the same rules.

    The example Eric uses:

    const int x = 10;
    short y;
    object z;
    z = y = x;

    I don't quite see how this example plays according to the "value assigned to the left hand side" rule.

    y=x; x is clearly an int, therefore y=x should evaluate to what is assigned to y, which is an int.

    Then z should be assigned an int too, but it somehow ends up with a short. I'm either missing something basic or types don't follow the same rule values do.

    My guess is that z=y=x is under the hood converted to z=y=(short)x, so the conversion is before the assignment.

    The operation of the assignment operator is (1) evaluate the left hand side to determine the location of the variable (or property, or whatever), (2) evaluate the right hand side, (3) convert the result of step 2 to the type of the left hand side via the appropriate conversion, (4) assign the result of step 3 to the result of step 1. The value of the right hand side is the result of step 2. The value assigned to the left hand side is the result of step 3. Those can be very different. Pete seems to believe that we should consider the value of the right hand side to be the value computed by step 3, but I disagree. -- Eric

  • z=y=x is converted to z=(object)(y=(short)x) is what I meant to say. So basically z=(object)(short)x. Is this true? Or am I misunderstanding everything?

    Correct. -- Eric
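The four steps and the expansion confirmed above can be sketched side by side. This is only an illustration of the described semantics, not what the compiler literally emits. The class and method names, and the temporaries rhs and assigned, are hypothetical. One difference worth noting: in the chained form the constant 10 converts to short implicitly because it is a compile-time constant in range, whereas the spelled-out form goes through a non-constant int and so needs an explicit cast.

```csharp
using System;

class Steps
{
    // The chained form from the post.
    public static object Chained()
    {
        const int x = 10;
        short y;
        object z;
        z = y = x;
        return z;
    }

    // The same work written out one step at a time.
    public static object SpelledOut()
    {
        const int x = 10;
        short y;                      // step 1: the target location (trivial for a local)
        object z;
        int rhs = x;                  // step 2: value of the right-hand side, an int
        short assigned = (short)rhs;  // step 3: convert to the type of the left-hand side
        y = assigned;                 // step 4: store the converted value
        z = (object)assigned;         // the outer assignment boxes the value from step 3
        return z;
    }

    static void Main()
    {
        Console.WriteLine(Chained().GetType());    // System.Int16
        Console.WriteLine(SpelledOut().GetType()); // System.Int16
    }
}
```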

  • I note that I'm not the compiler implementor, nor the language designer, and obviously not the one with expert knowledge in this field.  That said, I can still answer questions asked of me.  :)

    Q: "If we go with your way, and say that the type of the right hand side of an assignment is always of the type of the left hand side, then how do we word the specification clearly to figure out what conversion it is?"

    A: I would look to some kind of recursive definition of the conversion, such that the conversion is defined in terms of simpler conversions.

    Q: "Now suppose we have both M(double x) and M(short x). Are you telling me that the type of 123 is both double and short in M(123) because it could be assigned to double or it could be assigned to short?"

    A: No, I'm not saying that.  But your question stems from a chosen approach to overload resolution, presupposing that choice.  It seems to me that overload resolution could be defined to take into account the question of conversion, such that overload resolution includes an attempt to convert the expression to each of the available destinations for the assignment, and the "best" overload is in fact a consequence of the "best" conversion.

    That said, it seems to me that your reply is really saying that this isn't just about the _compiler_ implementation, but rather about the language specification.  Inasmuch as in the specification you do have to choose a specific way to describe these conversions, overload resolutions, etc. I am perfectly satisfied with _that_ justification for this particular way to look at the question.

    In other words, given a specific choice with respect to how the language in the specification is constructed, I see how my argument doesn't apply.

    But, that doesn't mean that it's a technically impossible argument (the specification could have been worded differently), and as pedantic as it might be to do so, I do still disagree with the characterization of the consequences of the wording of the specification being "clear", at least in absence of a specific reference to that specification (i.e. until you mention the specification, you can't say a particular interpretation of the specification is "clear" :) ).

    Anyway, thanks for setting me straight.

    Like I said, your argument is plausible. But it turns out to be less vexing to describe the operation of the assignment as having two clearly distinct steps in the middle of its operation: computing the value of the right hand side, and computing the value that is assigned to the left-hand side. And of course, that is what really happens in the runtime; we compute the right hand side, then we run the appropriate conversion code, and then we do the assignment. -- Eric

  • Grico,

    You are confusing "the value of the RHS" with what happens as part of an assignment. When talking about RHS values, it refers to the value BEFORE any effects of the operation. Looking at it another (semantic) way, "x" is the RHS all by itself. At no time does "x" have any other value or type (i.e. it is NOT mutated by the operation that is being performed).

  • The number one rule of programming is write clearly.  If the original code had been written clearly in the first place there would be no need for this discussion.
