Future Breaking Changes, Part Three

As I said earlier, we hate causing breaking changes in our product, the C# compiler, because they cause our customers pain.

Said customers are also software developers, and presumably they hate causing breaking changes for their customers as much as we do. We want to throw our customers into the Pit of Success and give them tools which encourage them where possible to prevent breaking changes. This leads to some subtle issues in language design.

Pop quiz. What does this program do?


// Alpha.DLL
namespace Alpha {
  public class Charlie {
    public void Frob(int i) { System.Console.WriteLine("int"); }
    // etc.
  }
}
// Bravo.EXE, references Alpha.DLL.
namespace Bravo {
  public class Delta : Alpha.Charlie {
    public void Frob(float f) { System.Console.WriteLine("float"); }
    // etc.
    public static void Main() {
      Delta d = new Delta();
      d.Frob(1);
    }
  }
}

Most people look at this program and say “clearly Charlie.Frob(int) is the best possible match for the call, so that is called.” A compelling argument, but wrong. As the standard says, “methods in a base class are not candidates if any method in a derived class is applicable”.

In other words, the overload resolution algorithm starts by searching the class for an applicable method. If it finds one then all the other applicable methods in deeper base classes are removed from the candidate set for overload resolution. Since Delta.Frob(float) is applicable, Charlie.Frob(int) is never even considered as a candidate. Only if no applicable candidates are found in the most derived type do we start looking at its base class.
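If you do want the base class method here, you can upcast the receiver so that member lookup starts at Charlie. (A sketch reusing the Delta and Charlie classes from the example above; the cast is my illustration, not part of the original program.)

```csharp
// Sketch: same classes as above, forcing lookup to start at the base class.
Delta d = new Delta();
d.Frob(1);                   // prints "float" - Delta.Frob(float) hides the base method
((Alpha.Charlie)d).Frob(1);  // prints "int" - only Charlie's methods are candidates now
```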

Why on earth would we do that? Clearly in this example the base class member is the far better match, so why wouldn’t we even consider it?

It is instructive to consider what happens in a world where we do implement the rule “pick the best applicable candidate from any base”. Suppose we did that.

In the previous version of Alpha.DLL, Charlie did not have a method Frob(int). When Bravo Corporation wrote Bravo.EXE, every call inside class Delta to method Frob was a call to Delta.Frob(float). Then one day Alpha Corporation did customer research and discovered that a lot of their customers like to frob integers. They added this feature in their latest version. Bravo Corporation gets the new version of Alpha.DLL, recompiles Bravo.EXE, and suddenly their carefully developed code is sometimes calling a method that they didn’t write, which does something subtly incompatible with their implementation.

Alpha Corporation has just pushed a breaking change onto Bravo Corporation, which, if they don’t catch it in time, may now be pushing a subtly broken version onto their customers in turn, and hey! we’re in the Pit of Despair again!

This particular family of breaking changes is called the "brittle base class problem"; there are many versions of it and different languages deal with it in different ways. Lots of work went into the design of C# to try and make it harder for people to accidentally cause brittle base class problems. That is why we make you distinguish between the original definition of a virtual method and an overriding method. That is why we make you put “new” on methods which shadow other methods. All these semantics are in part to help prevent, mitigate or diagnose brittle base class issues and thereby prevent accidental breaking changes in C# code.
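As a quick illustration of the “new” rule (a sketch with hypothetical class names, not code from the post): hiding a base class method without the modifier draws warning CS0108, so an accidental collision introduced by a new version of the base class gets flagged at recompile time.

```csharp
public class BaseVersion2
{
    public void Frob() { }        // added in a later release of the base library
}

public class Client : BaseVersion2
{
    public new void Frob() { }    // "new" states that hiding Frob is intentional;
                                  // without it the compiler issues warning CS0108
}
```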

Next time on FAIC: some psychic debugging. Then a bit later I want to talk more about breaking changes, this time in the context of thinking about covariance and contravariance.

  • Eric, operator overload resolution changed in a breaking way between v1 and v2.

    Code compiles in both v1 and v2 but produces different results.

    Was that just a mistake on MSFT's part or was there a reason?

  • Nasty example building on the one in the post - make Frob(int i) virtual in Alpha, and override it in Bravo.

    In this case there can't be a breaking change from Alpha, because we *know* there's a Frob(int i) present (otherwise we wouldn't be able to override it). However, Frob(float f) is still chosen.

    I don't know whether this was deliberate or not, but I doubt that one developer in twenty knows about it. (I didn't until I ran into it in the newsgroups a while ago.)

    Jon

  • Re: Breaking change between v1 and v2: I'm unsurprised to learn that there was such a change. But since that was years before I was on the C# team, I'm unable to determine from the vague description what the issue was. Can you be more specific?

  • Re: overriding a virtual:

    Yes, that is deliberate. A virtual method is considered to be a member of the class which declares it, not a class which overrides it.  
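    A sketch of the scenario Jon describes (my own elaboration of the example in the post, not Eric's code):

```csharp
public class Charlie
{
    public virtual void Frob(int i) { System.Console.WriteLine("Charlie int"); }
}

public class Delta : Charlie
{
    public override void Frob(int i) { System.Console.WriteLine("Delta int"); }
    public void Frob(float f) { System.Console.WriteLine("float"); }

    public static void Main()
    {
        new Delta().Frob(1);  // prints "float": the override is treated as a
                              // member of Charlie, so Delta.Frob(float) wins
    }
}
```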

  • The concept of a "breaking change" is a non sequitur. As a developer, I currently write code targeting C# 2. In a future version (C# 3 or 4), even if the syntax changes slightly, my code targets C# 2, thus I logically continue compiling it with a C# 2 compiler. Changes in future versions are immaterial to the version that my code targets.

    If one day I decide to upgrade to a newer version, then I also know that I must port some of my code. Once porting is done, I will need to run all of my unit and functional tests to ensure the port was successful. That is to be expected.

    I suppose you are suggesting a scenario somewhere between these two premises. Such as a developer who writes code targeting C# 2 and one day decides to switch to a future version without the intention or desire to port, test, and verify that the code still works as before. While that is irresponsible, it is simply not a realistic option. Shame on the developer that tries to get away with that.

    A breaking change is a non-concept. If you want to evolve the language, then do so. But "breaking changes" is not an excuse either way.

  • Allan, I understand your point of view, but the telling difference between your experience and my experience is:

    > If one day I decide to upgrade to a newer version,

    If YOU decide.  Like, it's YOUR decision. Though I envy you your freedom, I must suspect that the work you do with the compiler is not as large-scale as a lot of our customers.

    Take me for example. I'm the C# team's best customer. I work in the Developer Division at Microsoft.  There are THOUSANDS of developers, testers, PMs, writers, you name it, all using the compiler that is checked in to our toolset.

    The decision of when to upgrade the tools is certainly not made by me! It is made by our crack squad of build experts who have to manage millions of lines of code, tens of thousands of which change every single work day.  Upgrading to a major new tool release can take weeks, and a single build break as a result can destabilize the integration of man-months of work, screw up our carefully planned schedules and cost huge wodges of liquid cash.

    That is why we take breaking changes in the compiler REALLY FREAKIN' SERIOUSLY.  

    It's all very well for you to wag your finger at us and say that we should just run the unit tests. Of COURSE we run the unit tests when we upgrade the toolset compiler! They take days; we have literally MILLIONS of them.  Imagine the expense if even a single one of them breaks because of a breaking change in the compiler. Any such break represents a huge number of dev, test, and PM man-days taken away from other serious issues that we face.

    We in devdiv cannot afford breaking changes that affect real production code, and neither can our customers whose large-scale software projects depend on their code continuing to work when they upgrade the compiler.  What you casually handwave away as "porting some of my code" can translate into the difference between a product being profitable or not.

    Like I said, we want people to upgrade because we believe that the new stuff we're inventing is really compelling. We think that it will save developer time, make code more easily maintained, and ultimately cost fewer customer company dollars, thereby driving profit and value into the economy. Breaking changes work directly against that goal, sucking value out of the economy. We must make sure that the compelling benefit is high compared to the potentially enormous cost of breaking changes, otherwise we're not doing anyone any favours by developing this cool new stuff. Handwaving away these enormous expenses as "non sequitur" is a little short sighted, no?

  • Really enjoying this series of posts Eric - please keep 'em coming. I can't help but wonder if today's edge cases and dark corners (in which one can hide breakages) aren't going to become, if not commonplace, certainly more frequent over the next five years as we transition from a purely OO to a hybrid Functional-OO ("Foo"?) / declarative style of coding. You folks have certainly got your work cut out for you keeping the rate of innovation going whilst not breaking too much. At least until someone decides that C# is to all intents and purposes "Done." (Personally, with C# 3.0 I think we're very close to done - maybe a dash of Spec# in the next version - but still, we're close...)

  • I'm glad you enjoy the posts.

    But done?  No way.

    In the fall of 1993 I applied to be an intern at Microsoft. One of the promotional brochures that the recruiters gave me had a picture of Scott Wiltamuth, then a PM on the VB team. The caption was something like "Have we implemented all the cool language features yet? No way, we've barely scratched the surface."

    Scott (who incidentally is now my manager's manager's manager's manager) got a lot of teasing about being the "VB Poster Boy" over the years, but he was right then and the sentiment is still right now. There is so much more we can do with this language.  Ideas from Spec# are interesting, yes. Can we make contracts first-class in the language? Should we extend the type system?  Maybe!

    What about all the real-world feedback we are getting about the power and limitations of dynamic languages? Can we learn from that and make C# better? Maybe!

    What about metaprogramming? There are powerful, horrible things I can do in C++ that are hard to do in C# because C++ has a (terrible!) metalanguage built in.  Can we learn from the successes and failures of that and design a sensible metalanguage with all the power but none of the drawbacks of the C++ metalanguage?  Maybe!

    What about design patterns? Design patterns only exist because they make up for a deficiency in a language. If C# had double virtual dispatch then the visitor pattern would be trivial. Can we look at common design patterns and come up with more powerful abstractions behind them?  Maybe!
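    For instance, a minimal sketch of the single-dispatch workaround (my illustration, not from the comment): the visitor simulates dispatch on the argument's runtime type with an extra virtual round trip.

```csharp
abstract class Node { public abstract void Accept(Visitor v); }
class Leaf : Node { public override void Accept(Visitor v) { v.Visit(this); } }
class Branch : Node { public override void Accept(Visitor v) { v.Visit(this); } }

abstract class Visitor
{
    // One overload per node type; each Accept override calls the right one,
    // simulating the double virtual dispatch the language lacks.
    public abstract void Visit(Leaf leaf);
    public abstract void Visit(Branch branch);
}
```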

    What about all the ideas coming out of research languages?

    There is so much we can still do here.  That surface is still only slightly scratched.

  • I am almost sure something changed in some KB update (not even an SP), so I can't reproduce the problem. :(

    The difference I still see is this (compiles in v1 and doesn't in v2):

    class Variant
    {
      public static implicit operator bool(Variant v) { return false; }
      public static implicit operator Variant(bool b) { return new Variant(); }
      public static Variant operator |(Variant v1, Variant v2) { return new Variant(); }
      public static bool operator |(Variant v1, bool v2) { return false; }
    }

    class UpgradeAndBeHappy
    {
      static void Main(string[] args)
      {
        Variant a = new Variant();
        if (a || true) { }
      }
    }
  • >> If one day I decide to upgrade to a newer version,

    >If YOU decide.  Like, it's YOUR decision.

    I was not clear, but I did not intend to limit the statement to myself as a single individual.

    In fact, my team has had this experience several times. When the manager decided that we were going to move from VS6 to VS2003, it took almost four months to completely port and verify most of our C++ codebase. There were a few projects, though, that we deemed would take too long to port and they remained VS6 projects for much longer. The process was then similarly repeated with the move to VS2005 as well.

    If I or even my manager decides that we are to move to a newer version, then appropriate time is needed for porting and verification. That may take days or even months depending on the projects. If the transition will take too long, then revert back to the previous working edition. Again, you may not be the person to decide such a thing. In which case the manager will need to either 1) allocate appropriate time for transition or 2) revert to the previous working edition. A manager who does not understand this is not in the right job.

    You may disagree, but I stand by my original statement and still assert that "breaking changes" are immaterial to the development process.

    So I am not worried about breaking changes. In fact, I expect them, no matter how careful you are to minimize them. If you believe a feature is compelling and valuable, then introduce it. Whatever necessary porting will be done during the appropriate transition phase.

  • Well then where you see me say "breaking change" read "a change which greatly increases the cost of porting to a new version of the compiler", if that's how you prefer to think of it.  We want to keep your porting costs down, and that seems very "material" to me.

  • I can understand both sides of the argument. However, Allan makes the stronger and more realistic case. A product developed under one version needs to be ported to a new version. There are no exceptions. No matter how much you try to minimize "breaking changes", there will always be some. Not that I recommend it, but the only way to truly minimize porting cost is to freeze the compiler.

    Of course there are many alternatives. Off the top of my head...

    1) Command line arguments enabling / disabling features

    2) Command line argument to set the compiler to compile for a specific version.

    3) Release multiple compilers targeting multiple versions.

    Part of the porting stage often involves rewriting code to use the latest features. When C#2 was released, my team spent time rewriting code to use generics. In the process, a few unintentional bugs were discovered and the overall code was improved.

    I think that you are making too much of a big deal out of this. In the interest of evolving the language for the better, break as much code as you feel is necessary. The world is not static, and the more that you shy away from changes, the more everyone else will pass you by.

  • I'm going to put in a vote for Eric's side of the coin.

    I don't mind too much if old code doesn't compile with the new compiler, so long as it's in relatively rare cases.

    I *do* mind if old code compiles but behaves differently. Obviously the C# team have done all they can to flag such cases with warnings, but unfortunately on some code bases an extra warning or two may well be missed for a while. (Yes, I know that's pretty poor, but it's reality.)

    Just using an old compiler because the cost of porting is too high freezes the developer into not using any new features, of course. Minimizing the cost of porting seems like the right way to go IMO.

    Jon

  • I have to agree with Allan and Phil.

    Breaking changes are just a reality to the business that we are in. I think that Allan makes an excellent point pointing out that code is written for a specific version and that moving to a new version indeed requires porting. Minimize the time required to port as best as possible, but the reality is that some time will always be necessary. I would expect warnings if my old code will not function the same as before. However, unit tests should just as easily tell me the same thing. Breaking changes really should not be used as an excuse for not implementing desired features.

  • Eric,

    A problem that is almost identical to the brittle base class problem has in fact been introduced into the language, in the way extension methods are implemented.  I'd be very curious to know why Microsoft avoided the problem in overload resolution, but have not avoided it in resolving extension methods.  Details are here: http://dotnet.agilekiwi.com/blog/2006/04/extension-methods-problem.html
