Chaining simple assignments is not so simple


UPDATE: I interrupt this episode of FAIC with a request from my friend and colleague Lucian, from the VB team, who wonders whether it is common in C# to take advantage of the fact that assignment expressions are expressions. The most common usage of this pattern is the subject of this blog entry: the fact that "chained" assignment works at all is a consequence of the fact that assignments are expressions, not statements. There are other uses too; one could imagine something like "return this.myField = x;" as a short cut for "this.myField = x; return this.myField;" -- perhaps we are performing some computation and then recording the results for use later. Or perhaps we've got something like myNonNullableString = (myNullableString = Foo()) ?? "<null>"; -- there are any number of ways this idiom could be used.
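
As a rough illustration of the two idioms just mentioned, here is a minimal sketch. The class name ResultRecorder, the methods Record and Describe, and the body of Foo are hypothetical stand-ins invented for this example; only the two assignment expressions themselves come from the text above.

```csharp
using System;

class ResultRecorder
{
    private int myField;              // records the most recent result
    private string myNullableString;  // keeps the raw, possibly null, result

    public int MyField { get { return myField; } }
    public string MyNullableString { get { return myNullableString; } }

    // Hypothetical stand-in for the Foo() mentioned in the text.
    private static string Foo(bool returnNull)
    {
        return returnNull ? null : "data";
    }

    // Stores x in the field and returns the assigned value in one expression.
    public int Record(int x)
    {
        return this.myField = x;
    }

    // The inner assignment keeps the raw result; the ?? operator then
    // supplies a fallback for the non-nullable copy.
    public string Describe(bool returnNull)
    {
        string myNonNullableString = (myNullableString = Foo(returnNull)) ?? "<null>";
        return myNonNullableString;
    }
}

static class Program
{
    static void Main()
    {
        ResultRecorder r = new ResultRecorder();
        Console.WriteLine(r.Record(42));                  // 42
        Console.WriteLine(r.MyField);                     // 42
        Console.WriteLine(r.Describe(true));              // <null>
        Console.WriteLine(r.MyNullableString == null);    // True
        Console.WriteLine(r.Describe(false));             // data
    }
}
```

In both methods the assignment happens as a side effect inside a larger expression, which is exactly the style in question.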

I do not use this idiom myself; I'm of the opinion that side effects such as assignments are best represented by putting each in a statement of its own, rather than as something embedded in a larger expression. My question for you is: do you use assignments as expressions? If so, how and why? Note that I am looking for mundane, "real world" examples of this pattern, not clever ideas about how this could in theory be used. If you've got one, please leave it in the comments and I'll pass it along to Lucian. Thanks!

*********************

Today I examine another myth about C#. Consider the following code:

a = b = c;

This is legal; you can make arbitrarily long chains of simple assignments. This pattern is most often seen in something like

int i, j, k;
i = j = k = 123;

I often hear that this works “because assignment is right-associative and results in the value of the right-hand side”.

Well, that’s half true. It is right-associative; obviously this has to be equivalent to

i = (j = (k = 123));

It doesn’t make any sense to parenthesize it from the left. Now, in this particular example, the statement is true, but in general it is not. The result of the simple assignment operator is not the value of the right hand side:

const int x = 10;
short y;
object z;
z = y = x;
System.Console.WriteLine(z.GetType().ToString());

This prints “System.Int16”, not “System.Int32”. The value of the right-hand side of “y = x” is clearly an int, but we do not assign a reference to a boxed int to z, we assign a reference to a boxed short!

So then is the correct statement “… results in the value of the left-hand side”?

Nope, that’s not right either, and we can prove it.

class C
{
  private string x;
  public string X {
    get { return x ?? ""; }
    set { x = value; } }
  static void Main()
  {
    C c = new C();
    object z;
    z = c.X = null;
    System.Console.WriteLine(z == null);
    System.Console.WriteLine(c.X == null);
  }
}

This prints “True / False” – the result of the assignment operator is not the value of the left-hand-side. The value of the left hand side is the empty string but the value of the operator is null.

Heck, the left hand side need not even have a value. Write-only properties are weird and rare, but legal; if there were no getter then the left hand side c.X would not have a value!

The correct statement should now be pretty easy to deduce: the result of the simple assignment operator is the value that was assigned to the left-hand side.
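
To make the write-only case concrete, here is a small sketch. The class Sink, its Text property, and the Stored method are hypothetical names made up for this illustration; the setter deliberately transforms its input so that the stored value differs from the assigned value.

```csharp
using System;

class Sink
{
    private string stored = "";

    // Write-only property: there is no getter, so the expression
    // "s.Text" has no value that could be read back.
    public string Text
    {
        set { stored = value.ToUpperInvariant(); }
    }

    // Exposes the backing field purely so the demo can inspect it.
    public string Stored() { return stored; }
}

static class WriteOnlyDemo
{
    public static string Run()
    {
        Sink s = new Sink();
        object z;
        z = s.Text = "hello";   // legal even though s.Text cannot be read
        return (string)z + "/" + s.Stored();
    }

    static void Main()
    {
        Console.WriteLine(Run());   // hello/HELLO
    }
}
```

The chained assignment hands z the value that was passed to the setter, "hello", not whatever the setter chose to store.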

 

  • It's not really surprising, in that it is fully consistent with C, C++ and Java, while sharing the same syntax. That makes sense to me.

    Those three all define the result of the assignment operator as "the value of the variable after assignment", but of course they also don't have properties. So far as I can see, when properties are not involved, there's no difference between "new value of variable", and "assigned value converted to type of variable", so where they intersect with C#, semantics are consistent; and otherwise the rule is a logical extension, preserving the spirit while taking into account the existence of write-only properties.

  • Consider a slightly changed piece of code:

    class C
    {
      private string x;
      public string X {
        get { return x ?? ""; }
        set { x = value + "a"; } }
      static void Main()
      {
        C c = new C();
        object z;
        z = c.X = "b";
        System.Console.Write(z);
        System.Console.Write(c.X);
      }
    }

    It writes "b" and "ba".

    Shouldn't the statement rather be "the result of the simple assignment operator is the value that was >used to be< assigned to the left-hand side"?

    Can you think of another weird example? :)

  • Fusion,

    When dealing with properties, "the value that was assigned" has to be treated as "the parameter that was passed to the setter", and NOT as "the internal value of any backing field or calculation". Actions that take place within the setter (and getter) are not (and IMPO should not be) considered, as they are internal implementation details.

  • "I'm of the opinion that side effects such as assignments are best represented by putting each in a statement of its own, rather than as something embedded in a larger expression."

    Agreed. I need to be able to focus on more important problems in my code than what the value of "i += ++i + i++;" will be.  I'd rather communicate my intentions and perform one operation per line of code and let the compiler figure out how to optimize it.  It'll probably do a better job than I could anyway.

    Unless there is a case where the compiler would generate different or more efficient code because you wrote something like:

    int a, b, c;
    a = b = c = 1;

    as opposed to:

    int a = 1;
    int b = 1;
    int c = 1;

    I don't see much reason to perform multiple operations (excluding basic get/set operations) in a single line.  I find it just muddles things up and makes it more difficult to figure out the order of execution and why it needs to be executed that way.  I think both should always be made as obvious as possible, but the "why" is allowed some more wiggle room.

  • In C and Perl, I've frequently used the pattern of

    if (x = openSomethingOrOther()) {
     do something with x
     closeSomethingOrOther(x);
    }

    Works in JavaScript too but I don't remember whether or not I've used it.

    In Java and C# that becomes less useful since non-bool expressions can't be used as the condition for if statements.

    I've also used something like:

    while ((str = stream.ReadLine()) != null) {
     ...
    }

    and similar tricks using a character variable to read from a stream up to a terminating character. I think it's almost always ugly code. As I've moved exclusively to C#, and the language and the BCL have gotten better at providing cleaner ways to do these things at a nicer level of abstraction (foreach (var line in File.ReadAllLines(...)), for example), I've used it less and less.

    That help?

  • A real-world use of assignments as expressions / chained assignments:

    protected virtual void OnGetWindowSizes(ref short minimumWidth, ref short minimumHeight, ref short maximumWidth, ref short maximumHeight, ref short preferredWidth, ref short preferredHeight)
    {
        if (WidthAt96DPI != 0 && HeightAt96DPI != 0)
        {
            using (System.Drawing.Graphics g = CreateGraphics())
            {
                short scaledWidth = (short)(WidthAt96DPI * g.DpiX / 96);
                short scaledHeight = (short)(HeightAt96DPI * g.DpiY / 96);
                if (AllowUserToResizeTool)
                {
                    minimumWidth = preferredWidth = scaledWidth;
                    minimumHeight = preferredHeight = scaledHeight;
                }
                else
                {
                    minimumWidth = maximumWidth = preferredWidth = scaledWidth;
                    minimumHeight = maximumHeight = preferredHeight = scaledHeight;
                }
            }
        }
    }

    The refs are due to some C++ interop.

  • I don't make a habit of it in the general case, but I can think of one pattern I use that takes advantage of this feature. Consider the following pseudo-code:

    START:
    Get value
    IF value satisfies some predicate THEN
      Use value in additional processing
      GOTO START
    END

    The easiest way (IMHO) to express this is with code similar to the following:

    string input;
    while ((input = GetInputFromUser()) != "quit")
    {
      ProcessUserInput(input);
    }

    You could use a boolean return value and an out variable instead to avoid evaluating the assignment as an expression, but that is considerably less usable and readable in my opinion.

    Also, it seems to me that the following usage of the "using" keyword would fall into this category:

    SqlConnection conn;
    using(conn = new SqlConnection(...))
    {
    ...
    }

    Although I can't be sure the language spec doesn't special case this scenario, as it would have to if the declaration of "conn" were inside the using expression.

  • Pretty much the only place where I ever use this pattern is when reading a file line-by-line:

      string str;
      while ((str = reader.ReadLine()) != null) { ... }

    The reason is that the alternatives are either writing the call to ReadLine twice, or using while(true) with the assignment and an if/break inside. I don't like either.

  • I generally agree with the sentiment to keep side-effects separate.  But, like other responders, there are a handful of places where I commonly do in fact use the assignment expression value.  Oddly enough, they are mostly similar to Pavel's example, fit the pattern David Nelson describes, and fall into the broader category of i/o operations.

    StreamReader.ReadLine() returns null on reaching the end of input, Stream.Read() returns 0, and likewise Socket.Receive(); in a console application, I might loop until Console.ReadLine() returns "", etc.  Making these checks as the condition at the top of a "while" loop results in code that is IMHO more readable than the alternatives.

    Much less commonly, I do find myself doing a similar kind of thing in "if" statements.  In some respects, those examples can probably be thought of as degenerate, single-iteration versions of the "while" loop scenario, though in the "if" statement examples, I would say that the i/o scenario isn't quite so highly correlated.

    In short, it's not a construct I use broadly.  But if C# didn't allow assignments to be treated as expressions, I would definitely miss that feature.

  • Sather has an interesting form of loops which lets you write this in a readable way, by allowing "while" (or other iterator - it's actually extensible, and "while" is not a keyword) itself to appear in the middle of the body:

      loop
         s: STR := #IN.get_line
      while!(s /= void)
         ...
      end

    It's not really all that different from if/break at that point, but still more clear in intent, IMO.

  • In the case of IO operations and loops, I think I'd rather see processing look more like this:

    while(!stream.EndOfStream)
    {
       string s = stream.ReadLine();
    }

    Using the magic return value of null to indicate EOF is unclear because it requires consumers to have knowledge of that return value.  I think this would also eliminate the problems Pavel mentions with calling Read() twice and using a while(true)/break construct.

    Of course, there is not currently a Stream.EndOfStream property, nor do I have any idea what would be required to make that happen, but a guy can dream, right?

  • "Using the magic return value of null to indicate EOF is unclear because it requires consumers to have knowledge of that return value."

    For better or worse, we're stuck with that.  Many APIs have no way to even know whether they've reached the end-of-input without a read operation.  For example, network sockets.  Your code may have read all available data without reaching the end-of-input and be sitting there with another blocking read.  And that next blocking read could be the one where end-of-input is reported, much later (i.e. at the time all the data was consumed, it wasn't yet known that the end-of-input had been reached).

    Sure, you could refactor the code so that it could (for example) handle a zero-byte input properly before going back and checking the special "end-of-stream" property.  But that's a lot of overhead for little practical benefit.

    I suppose if we could design a complete computing environment from the ground up, it could be designed such that input streams always have a known end-of-input that can be identified without trying to read more input.  But a) it's not clear that such an environment would in fact be a practical improvement on the current situation, and b) obviously it's simply not practical to do that anyway.  New computing systems have to be able to operate with existing ones.

    And even if we did somehow overcome all those practical obstacles, we'd still be stuck with the fact that not ALL uses of assignments as expressions with value fall into that category.  Even if you could get rid of the end-of-input-as-part-of-a-read scenario, we'd still have places where it would be useful to evaluate the outcome of an assignment operation.

  • When I'm forced to switch on the type of a value, I use the following pattern:

    private static IPropertyWriter GetPropertyWriter(Object value)
    {
       String str;
       IEnumerable enumerable;
       Pair pair;
       Triplet triplet;

       if ((str = value as String) != null)
       {
           return new StringWriter(str);
       }
       else if ((enumerable = value as IEnumerable) != null)
       {
           return new EnumerableWriter(enumerable);
       }
       else if ((pair = value as Pair) != null)
       {
           return new PairWriter(pair);
       }
       else if ((triplet = value as Triplet) != null)
       {
           return new TripletWriter(triplet);
       }
       else
       {
           return null;
       }
    }

  • "When I'm forced to switch on the type of a value, I use the following pattern:"

    Taking as granted that you might indeed be forced into that kind of logic (I would say that generally one should try to avoid having to have conditions that depend on the specific type), it seems to me that a more maintainable approach would be to set up a data-driven framework to do that kind of work.  For example:

    static IPropertyWriter GetPropertyWriter(Object value)
    {
        return GetMappedObject(value, _rgmm);
    }

    static MethodMap<IPropertyWriter>[] _rgmm = new MethodMap<IPropertyWriter>[]
    {
        new MethodMap<IPropertyWriter>(typeof(string), (obj) => new StringWriter((String)obj)),
        new MethodMap<IPropertyWriter>(typeof(IEnumerable), (obj) => new EnumerableWriter((IEnumerable)obj)),
        new MethodMap<IPropertyWriter>(typeof(Pair), (obj) => new PairWriter((Pair)obj)),
        new MethodMap<IPropertyWriter>(typeof(Triplet), (obj) => new TripletWriter((Triplet)obj))
    };

    struct MethodMap<T>
    {
        public readonly Type Type;
        public readonly Func<object, T> Mapper;

        public MethodMap(Type type, Func<object, T> mapper)
        {
            Type = type;
            Mapper = mapper;
        }
    }

    static T GetMappedObject<T>(object value, MethodMap<T>[] rgmm)
    {
        foreach (MethodMap<T> mm in rgmm)
        {
            if (mm.Type.IsInstanceOfType(value))
            {
                return mm.Mapper(value);
            }
        }
        return default(T);
    }

    Once you've got the basic boilerplate above in place, it's a lot simpler to add that kind of mapping than to have to keep writing a bunch of if/else chains.  Just create/add a new element to an array that describes the relationship.

  • Here's some code that chains assignments to set properties while remembering their values for later:

    class GraphicalThing {
        Path line;
        GeometryGroup geom;

        public GraphicalThing() {
            Children.Add(line = new Path() { Data = geom = new GeometryGroup { FillRule = FillRule.Nonzero } });
    ...

    I think that's the only time I use chained assignments in C#.
