FYI, C# 2.0 Has A Breaking Change in Enum Subtraction

A customer brought to my attention the other day that the C# 2.0 beta release has a breaking change from the previous release. Namely, this code

enum E : byte {
  A = 1,
  B = 2
};

// . . .

E a = E.A;
E b = E.B;
int j = a - b;

sets j to -1 in the previous release but to 255 in the upcoming release.
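
To see the two behaviours side by side, here is a minimal sketch of my own (the class name, variable names, and comments are mine, not the customer's code) that writes both interpretations out as explicit casts:

enum E : byte {
  A = 1,
  B = 2
};

static class EnumSubtractionDemo {
  static void Main() {
    E a = E.A;
    E b = E.B;

    // The old result: the difference is computed as an int and never cast
    // back to the underlying type, so we get -1.
    int oldBehaviour = (byte)a - (byte)b;

    // The spec-compliant result: the difference is cast back to the
    // underlying type (byte) before widening to int, so -1 wraps to 255.
    int newBehaviour = (byte)((byte)a - (byte)b);

    System.Console.WriteLine("{0} {1}", oldBehaviour, newBehaviour); // -1 255
  }
}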

First off, let me say that we regret the breaking change. We agonize over all breaking changes because we know the pain that they cause customers. We also regret introducing the bug in the first place, thereby forcing us to choose between continuing to be in violation of the specification and breaking existing code. Sorry about all that.

Second, I should describe why exactly the original behaviour is in violation of the C# specification. It's pretty straightforward. Start with section 7.7.5:

Every enumeration type implicitly provides the following predefined operator, where E is the enum type, and U is the underlying type of E:

U operator -(E x, E y);

This operator is evaluated exactly as (U)((U)x - (U)y).

That clearly means that the assignment above should have the same semantics as

int j = (byte)((byte)a - (byte)b);

C# defines only four built-in subtraction operators on the integral types:

int operator -(int x, int y);
uint operator -(uint x, uint y);
long operator -(long x, long y);
ulong operator -(ulong x, ulong y);

There is an implicit conversion from byte to all four of those types, so we must select the best one. According to section 7.4.2.3, the int version is the best one (because signed is preferable to unsigned, and int converts implicitly to long but long does not convert implicitly to int). So what we generate here is the equivalent of:

int j = (byte)((int)(byte)a - (int)(byte)b);

The conversions from E to byte to int will go off without a hitch, and the subtraction will result in an int set to -1.  That then gets cast to byte. What happens when we try to cast a computed-at-runtime integer to a byte? Section 7.5.12 says

For non-constant expressions (expressions that are evaluated at run-time) that are not enclosed by any checked or unchecked operators or statements, the default overflow checking context is unchecked unless external factors (such as compiler switches and execution environment configuration) call for checked evaluation.

Therefore this is an unchecked cast, and -1 goes to 255 as a byte. That then gets converted back to an int during the assignment.
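
To see that conversion behaviour in isolation, here is a small stand-alone sketch of my own (the names are illustrative; this is not code from the spec or the compiler) contrasting the default unchecked cast with an explicitly checked one:

using System;

static class UncheckedCastDemo {
  static void Main() {
    int minusOne = -1;  // a run-time value, not a compile-time constant

    // Non-constant expressions default to unchecked, so only the low
    // eight bits of -1 (0xFF) survive the cast, giving 255.
    byte wrapped = (byte)minusOne;
    Console.WriteLine(wrapped);  // 255

    // The same cast in a checked context throws instead of wrapping.
    try {
      byte boom = checked((byte)minusOne);
      Console.WriteLine(boom);   // never reached
    }
    catch (OverflowException) {
      Console.WriteLine("checked cast of -1 to byte overflows");
    }
  }
}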

Third, I should talk a bit about the process we go through when making breaking changes like this. The change was made to the C# 2.0 compiler on the 14th of January 2004, six months before beta 1; one of the reasons we try to push betas out really early is to get feedback on whether a breaking change like this affects millions, thousands, or dozens of people. Since, to my knowledge, the first customer to run into this break contacted us only this week, and we're only taking "the product electrocutes millions of users" bug fixes right now, this one unfortunately does not make the bar for choosing backwards compatibility over correctness. I feel bad about that, but I hope you understand our reasoning here. We've got to ship this thing! We'll also make sure that a Knowledge Base article describing the problem gets written.

 

  • Eric, was this particular customer advocating that the previous behavior be preserved, or just pointing out the change in behavior? If you explained the situation as well as you did here, I'd be inclined to say "Okay, we'll go back and change/check our code." Unless I was a big customer with 300,000 lines of C# that abso-frickin-lutely depended on this behavior, in which case I'd talk to our Microsoft sales guy. :)
  • Unfortunately the customer is in fact using this idiom in numerous places in their code, and was hoping that we would change it back for the final release. I hate to disappoint people, but the Whidbey train is about to leave the station and we can't risk destabilizing the release for anything short of a major security or geopolitical issue.
  • Eric,

    Will there be an FXCop rule to warn users about potential problems?
  • Good idea. I'll pass on that suggestion to the FXCOP team.
  • Of course a subtraction operation on bytes must return a byte. If one wants 1 - 2 to be equal to -1, one explicitly casts at least one of the arguments to int. That’s not a breaking change, it’s a bugfix.
  • "Breaking change" and "bug fix" are orthogonal. In this case we have a bug fix that is also a breaking change -- we've taken a legal program and changed its semantics. That we did so to fix a bug is not hugely relevant to the unfortunate customers who must now track down all the places in their code where they rely on the old, buggy behaviour.
  • I thought "breaking change" meant "a change that will cause your code to not compile", rather than "a change that will alter runtime behaviour in subtle ways". That said, an FXCop rule is probably the best place for a test for this behaviour. Actually, I'm not sure where the line runs between compiler warnings and FXCop (or PREFast) warnings.
  • Personally I am all for this. For C# to continue to be the great language it is, you have to look back at some of the things that have gone wrong with languages before, and one of those things is that they did not adhere to their specs. While you might have a breaking change now, if you cannot expect the language to behave correctly then you're shooting yourself in the foot in the long run.
  • Jonathan: I call any change that could in any way break a customer a breaking change. In fact, this is a breaking change in your sense too: int j = (E.A - E.B); used to compile, but since the RHS expression involves only constants, this is a checked context, and therefore it will now give a compiler error when the -1 tries to go to byte.
  • Jeff: I agree, you have to look at the long run.

    One way to look at it is like this: The number of people affected by this change is already small. In the long run the number of affected people will get smaller and smaller. Most new developers will never even use C# 1.0 once 2.0 finally ships, so 1->2 forwards compatibility isn't important to them.

    On the other hand, as the installed base of programs gets larger and larger, more and more of the space of legal language strings is consumed and therefore the likelihood of more problems like this when going from 2->3 goes up, so forwards compat becomes MORE important as time goes on.

    I guess what I'm saying is that in the long run, forwards compatibility for a particular version becomes less important, but forwards compatibility for the current version becomes more important.

    It's a hard problem no matter how you slice it.
  • Well, if people will rely on unspecified behaviour... :)

    (I know, backwards compatibility is a serious issue. That said, I'm more surprised that you're pinning the decision not to revert to the non-standard implementation on the late stage of development, rather than just apologising for having the bug in the first place.)
  • Clearly these kinds of things are judgment calls and opinions vary. Everyone on the C# team believes that spec compliance and backwards compatibility are important goals -- but when those goals are in conflict, different people have different opinions on which one should win in a particular situation.

    When we were working on scripting, there was absolutely no doubt in my mind that we would never have made such a breaking change in the engine semantics without a _much_ better reason than "it's in the spec". The thing with JScript is that there is only one JScript engine on your machine, and when it's upgraded, it's upgraded. There's no going back to the old behaviour; you just start getting broken pages. Backwards compat was WAY more important than spec compliance in JScript.

    But with C# we have side-by-side compilers, and end users are not compiling and running code, pro devs are. In this scenario, my personal opinion is that spec compliance begins to edge out backwards compatibility.

    Others on the C# team are very much for backwards compat over spec compliance when the two goals conflict. Ultimately these things have to be judgment calls with all constituencies represented in the argument -- C# programmers, end users, tools vendors (who rely on the specification being accurate), and the internal needs of the C# dev/test/PM/user ed teams.





  • We typically address breaking changes by correcting the default behaviour (as you have with C# 2.0) and introducing a command line switch for backwards compatibility as a quick (but temporary) fix for those customers who rely heavily on the old behaviour.

    Not a perfect solution (since it makes your code harder to read and costs time/money to implement), but it keeps the majority of affected customers happy.

    If I had to pick one or the other, I'd have made the same choice.
  • We on the C# team hate making breaking changes. As my colleague Neal called out in his article on the
