Why do ref and out parameters not allow type variation?

Here's a good question from StackOverflow:

If you have a method that takes an "X" then you have to pass an expression of type X or something convertible to X. Say, an expression of a type derived from X. But if you have a method that takes a "ref X", you have to pass a ref to a variable of type X, period. Why is that? Why not allow the type to vary, as we do with non-ref calls?

Let's suppose you have classes Animal, Mammal, Reptile, Giraffe, Turtle and Tiger, with the obvious subclassing relationships.

Now suppose you have a method void M(ref Mammal m). M can both read and write m. Can you pass a variable of type Animal to M? No. That would not be safe. That variable could contain a Turtle, but M will assume that it contains only Mammals. A Turtle is not a Mammal.

Conclusion 1: Ref parameters cannot be made "bigger". (There are more animals than mammals, so the variable is getting "bigger" because it can contain more things.)

Can you pass a variable of type Giraffe to M? No. M can write to m, and M might want to write a Tiger into m. Now you've put a Tiger into a variable which is actually of type Giraffe.

Conclusion 2: Ref parameters cannot be made "smaller".
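Conclusions 1 and 2 can be sketched in code. The class bodies below are hypothetical stand-ins for the hierarchy described above, and RefDemo, Run and Main are names invented for this illustration. The two calls that the compiler rejects are left commented out so that the rest compiles.

```csharp
using System;

// Hypothetical, empty versions of the classes named in the post.
class Animal { }
class Mammal : Animal { }
class Reptile : Animal { }
class Giraffe : Mammal { }
class Tiger : Mammal { }
class Turtle : Reptile { }

static class RefDemo
{
    // M may both read and write m; here it writes a Tiger.
    public static void M(ref Mammal m)
    {
        m = new Tiger();
    }

    public static string Run()
    {
        Mammal mammal = new Giraffe();
        M(ref mammal);       // legal: the variable is of type Mammal exactly

        Animal animal = new Turtle();
        // M(ref animal);    // compile-time error: M could read a Turtle as a Mammal

        Giraffe giraffe = new Giraffe();
        // M(ref giraffe);   // compile-time error: M could store a Tiger in a Giraffe variable

        return mammal.GetType().Name;
    }

    static void Main()
    {
        Console.WriteLine(Run()); // prints "Tiger"
    }
}
```

Only the call whose argument is a variable of type Mammal compiles; uncommenting either of the other two calls produces a compile-time error.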

Now consider N(out Mammal n).

Can you pass a variable of type Giraffe to N? No. As with our previous example, N can write to n, and N might want to write a Tiger.

Conclusion 3: Out parameters cannot be made "smaller".

Can you pass a variable of type Animal to N?

Hmm.

Well, why not? N cannot read from n, it can only write to it, right? You write a Tiger to a variable of type Animal and you're all set, right?

Wrong. The rule is not "N can only write to n". The rules are, briefly:

1) N has to write to n before N returns normally. (If N throws, all bets are off.)
2) N has to write something to n before it reads something from n.

That permits this sequence of events:

  • Declare a field x of type Animal.
  • Pass x as an out parameter to N.
  • N writes a Tiger into n, which is an alias for x.
  • On another thread, someone writes a Turtle into x.
  • N attempts to read the contents of n, and discovers a Turtle in what it thinks is a variable of type Mammal.

That scenario -- using multithreading to write into a variable that has been aliased -- is awful and you should never do it, but it is possible.

UPDATE: Commenter Pavel Minaev correctly notes that there is no need for multithreading to cause mayhem. We could replace that fourth step with

  • N makes a call to a method which directly or indirectly causes some code to write a Turtle into x.

Regardless of how the variable's contents might get altered, clearly we want to make the type system violation illegal.

Conclusion 4: Out parameters cannot be made "larger".
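The aliasing behind Conclusion 4 can be observed in a legal program. This is a minimal sketch under assumptions: the field is declared as Mammal rather than Animal so that the call compiles, and OutAliasDemo, Overwrite and Run are names invented for the illustration.

```csharp
using System;

// Hypothetical, empty versions of the classes named in the post.
class Animal { }
class Mammal : Animal { }
class Giraffe : Mammal { }
class Tiger : Mammal { }

static class OutAliasDemo
{
    // Declared as Mammal, so passing it as "out Mammal" is legal.
    static Mammal x;

    static void Overwrite()
    {
        x = new Giraffe();        // writes to the field directly, not through n
    }

    static string N(out Mammal n)
    {
        n = new Tiger();          // N writes before it reads, as the rules require
        Overwrite();              // indirectly changes x, which n aliases
        return n.GetType().Name;  // reading n is allowed after the first write
    }

    public static string Run()
    {
        return N(out x);
    }

    static void Main()
    {
        Console.WriteLine(Run()); // prints "Giraffe"
    }
}
```

N wrote a Tiger but reads back a Giraffe, because n and x are the same variable. If x could be declared as Animal and still be passed to N, Overwrite could store a Turtle instead, and N would read a Turtle through a parameter typed as Mammal.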

There is another argument which supports this conclusion: "out" and "ref" are actually exactly the same behind the scenes. The CLR only supports "ref"; "out" is just "ref" where the compiler enforces slightly different rules regarding when the variable in question is known to have been definitely assigned. That's why it is illegal to make method overloads that differ solely in out/ref-ness; the CLR cannot tell them apart! Therefore the rules for type safety for out have to be the same as for ref.
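One way to see this from inside a program is reflection. The sketch below is only an illustration: ByRefDemo, TakesRef, TakesOut and the helper methods are hypothetical names, while ParameterInfo.ParameterType, Type.IsByRef and ParameterInfo.IsOut are the standard System.Reflection members.

```csharp
using System;
using System.Reflection;

class Mammal { }

static class ByRefDemo
{
    public static void TakesRef(ref Mammal m) { }
    public static void TakesOut(out Mammal m) { m = new Mammal(); }

    // Looks up the first parameter of one of the public methods above.
    public static ParameterInfo FirstParameter(string methodName)
    {
        return typeof(ByRefDemo).GetMethod(methodName).GetParameters()[0];
    }

    // True when both parameters have the same by-reference type.
    public static bool SameParameterType()
    {
        return FirstParameter("TakesRef").ParameterType
            == FirstParameter("TakesOut").ParameterType;
    }

    static void Main()
    {
        ParameterInfo r = FirstParameter("TakesRef");
        ParameterInfo o = FirstParameter("TakesOut");

        Console.WriteLine(r.ParameterType.IsByRef); // True
        Console.WriteLine(o.ParameterType.IsByRef); // True
        Console.WriteLine(SameParameterType());     // True
        Console.WriteLine(r.IsOut);                 // False
        Console.WriteLine(o.IsOut);                 // True
    }
}
```

Both parameters have the same by-reference type; the only difference visible here is the out flag in the parameter's metadata, which is consistent with overloads that differ only in ref versus out being rejected.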

Final conclusion: Neither ref nor out parameters may vary in type at the call site. To do otherwise is to break verifiable type safety.

    • Interesting post... it feels great to know why there are certain restrictions :)

      waiting for more on such obscure issues... Thanks :)

    • In languages like Ada, the out parameter is modified upon exit (in which case, the variable can never be read). There's a name for this convention, which I forgot, to distinguish it from call-by-reference.

    • > That scenario -- using multithreading to write into a variable that has been aliased -- is awful and you should never do it, but it is possible.

      It doesn't even have to involve multithreading, so far as I can tell. It can simply be that N called some other method M on the same thread, and M then modified x before returning - which is probably a much more likely occurrence.

      Reference-to-const in C++ has the same problems (and is more subtle in that, because it's implicit at call site, and otherwise is a heavily used parameter passing mode, unlike C#'s relatively rare ref/out) - if I recall correctly, it's part of the reason why all STL algorithms take begin/end iterators by value, for example - since if they were passed by reference, the function passed to the algorithm could e.g. change the value of the end-iterator in the middle of iteration, with unpredictable results.

      On an unrelated note, there seems to be something wrong going on with captchas and registered users. If I try to post messages to any MSDN blog - including yours - while logged in, they seem to go straight into the bit bucket, with no error messages. If I sign out and post them anonymously, the same messages are posted just fine. This seems to have started when captchas got introduced - previously, it all worked just fine.

    • Didn't you already discuss this about a month or two ago? I distinctly remember reading something about this exact topic...

      Reread the first sentence. Follow the link. -- Eric

      I also distinctly remember the first time I tried to enter this comment. Nothing happened :(

      Bummer! -- Eric

    • Pavel - M can only modify x to be a Mammal or a subtype of Mammal, so how is type safety lost?

      Nice one, Eric. Thinking about it from a different angle, the rules exist to prevent ref and out from changing the (static) type of the storage location itself i.e., N demands x to be of type Mammal whereas it's declared to be of type Animal. The rules still allow the runtime type to be different - the out parameter can obviously be set to a subtype of Mammal.

    • Senthil,

      I believe Pavel was talking about a scenario like the one below.  It is an example of the type violation that could occur if out parameters could be made "larger".

      class MyAnimal
      {
         public MyAnimal()
         {
             SetMammalAndMilk(out animal);
         }
         Animal animal;
         void SetMammalAndMilk(out Mammal mammal)
         {
             mammal = new Tiger();
             SetAnimalToTurtle();
             //mammal is now a turtle
             mammal.Milk(); // You can't milk a turtle
         }
         void SetAnimalToTurtle()
         {
             animal = new Turtle();
         }
      }

    • Yes, exactly. The real problem here is aliasing semantics of out-parameters (which means they aren't really "out" in the traditional sense used by Ada or COM/Corba, and more like "ref mustinit" - but we know that already).

    • "The rules are, briefly:

      1) N has to write to n before N returns normally. (If N throws, all bets are off.)"

      Surely, that cannot be true. If 'all bets are off', wouldn't that open up a security hole, where callers can access uninitialized memory:

      No. The jocular phrase "all bets are off" was not intended to convey that such a situation leads to undefined behaviour; rather, merely that we do not require an out parameter to be assigned on all possible code paths, only on "normal" code paths. -- Eric

         MyType x;
         try {
             MyFunction( out x); // throws before setting x, but the compiler cannot know that.
         } catch( Exception e) {
         // ignore
         }
         x.Something();

      The standard I can find is not clear on this...

      The C# standard is extremely clear on this matter. Which part of section 5.3.3.13 are you finding unclear?

      ...but I would guess that this should throw a NullReferenceException.

      Why guess when you can try it? Had you actually tried to compile your example, you would have discovered that your example is not a legal program fragment because there is a possible control flow in which you can access a local variable that has not been assigned a value. The variable is definitely assigned in all program text within the try block after the call, but not within the catch block or after the try block. -- Eric

    • 'Course if you really really wanted to...

      private void foo(ref object x)
      {
         x = (some condition) ? (object) new Mammal() : (object) new Reptile();
      }

    • Thanks for the corrections, Eric. I had not thought about the possibility that the compiler could reject that example. I  didn't try because a) I do not have a C# compiler at home and b) it is not possible to check a standard by running a compiler (although, as in this case, it can give hints about what is going on).

      BTW: for those trying to look up the hint: it took me a while to find that section 5.3.3.13. The Ecma standard I looked at (<http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-334.pdf>) does a +7 for (most?) chapter numbers, compared to <http://download.microsoft.com/download/3/8/8/388e7205-bc10-4226-b2a8-75351c669b09/csharp%20language%20specification.doc>. It has "Try-catch statements" in section 12.3.3.13.

    • What about "animals" being made larger?

      Mankind has crossed say a Zebra with a Donkey and made a Zonkey.

      (also known as zebrass, zebronkey, zeass, zeedonk, zedonk, zebadonk, zenkey, donbra, donbri, donkra, zebrinny, or deebra, zebrula) is a cross between a zebra and a donkey. The generic name for crosses between zebras and horses or asses is zebroid or zebra mule.

      See http://en.wikipedia.org/wiki/Zonkey

      I could also create a new control with a NEW method, does that make "CONTROL" or "CONTROLS" any bigger as I've made a new one?

      I believe the answer is no as any control inherits from the base System.Windows.Forms.Control or System.Windows.Forms.UserControl

      So this then raises my question;

      Why are EXTENSION METHODS allowed but not EXTENSION PROPERTIES?

      Good question. Long answer. I've been meaning to blog about that. -- Eric

      If you could extend a PROPERTY you could add say a new color ( or colour ) for an animal that may NOT be in a fixed SET of colors ( or colours ). That is like EXTENDING an enumeration of possible colors.

      Think of a car like say a new model FERRARI. For the first year it is only available in red, black or white.

      After the first year Ferrari ADD more colors like yellow, racing green

      and pink!! ( for the rich women , LOL ).

      I recently added an EXTENSION method for all controls called SHAPE to the base CONTROL class.

      This is so that you can make any control into say a hexagon like this.>>

      PictureBox1.Shape(6) 'VB.Net

      PictureBox1.Shape(6); \\Visual C# or C++  ??

      Why could I NOT add an EXTENSION PROPERTY to the CONTROL class for some set regular shapes such as; Triangle, Pentagon, Hexagon, Septagon, Octagon, Nonagon and a Decagon within an enumeration?

      It would save adding the property to a new control.

      If you want to see the code see this thread.>>

      http://social.msdn.microsoft.com/Forums/en-US/vbgeneral/thread/0cab1cbd-553c-4ac2-97ec-334a4338484d

      Is this outside of OOP philosophy?

      Extension methods are often criticized as being "not OOP enough". This seems to me to be putting the cart in front of the horse. The purpose of OOP is to provide guidelines for the structuring of large software projects written by teams of people who do not need to know the internal details of each other's work in order to be productive. The purpose of C# is to be a useful programming language that enables our customers to be productive on our platforms. Clearly OOP is both useful and popular, and we've therefore tried to make it easy to program in an OOP style in C#. But the purpose of C# is not "to be an OOP language". We evaluate features based on whether they are useful to our customers, not based on whether they conform strictly to some abstract academic ideal of what makes a language object-oriented. We'll happily take ideas from oo, functional, procedural, imperative, declarative, whatever, so long as we can make a consistent, useful product that benefits our customers.

    • > Why are EXTENSION METHODS allowed but not EXTENSION PROPERTIES?

      There are a bunch of obvious potential problems with extension properties.

      For one, properties usually (though not always) imply state; so, to be most useful, extension properties would also require "extension fields" (or some other sort of state that can be associated with the object from "outside"). This has serious performance implications, and it is pretty hard to come up with a single solution that is universally good enough for a wide range of scenarios. And for properties that only have a getter, a no-argument extension method is just as good.

      Another thing is that many useful extension methods are generic. Properties themselves cannot be generic. Of course, if an extension property isn't seen as a property on syntax level, it's another story - see below.

      Syntax would be a hazy area, too. An extension property cannot be a property (since it needs the explicit "this" argument... well ok, it could be an indexer property on CLR level, I guess). Try to come up with a syntax that would make sense under these constraints, and remember that it must also be possible to desugar extension property access (so that you can deal with those cases where you want an extension property, but the object shadows it with its member property). Of course, it's always possible to just fall back to plain methods decorated with some attribute, but that's not neat, and potentially confusing with extension methods.

      Some other things to consider:

      - Should it be possible to combine "extension getters" and "extension setters" in a single property? What if they have different types?

      - If a property already has a getter, is it legal to add an "extension setter", and vice versa? What if they have different types?

      - If a property has both getter and setter, but setter is private, should it still be possible to define an "extension setter", and use it without ambiguity from outside the class? What if the member setter is protected?

      - What if there are several "extension setters" of different types defined? Should this always be ambiguous at call site, or should it use overload resolution rules for the setters? What if there's also a member setter, of yet another different type - does it shadow all extension setters, or does it participate in overload resolution alongside them?

      I'm sure there are quite a few more there that I haven't thought of. As usual in language design (and design in general), the existence of those problems  seems extremely obvious once you know about them, but doesn't actually surface until you start considering all use cases for the feature in depth.

    • A good example of a language which tries "to be an OOP language" is Java - remember all the "delegates are evil, because they're not object oriented" talk in that camp back in the days of J++? - and look where they are now; still no first-class functions in the language nor any plans for them in the upcoming release, despite a heavy debate on the issue in the last 2 years or so.

      I'll take C#'s pragmatic hybrid approach over that every day.

    • @Pavel Minaev - Some excellent points on extension properties vs. methods. But I believe some to be incomplete (possibly because of brevity in a posting). Methods MAY imply additional state. And from a syntactical (but not metadata) standpoint a property named X of type T is equivalent to either (or both) methods of "void SetX(T value)" and "T GetX()". Since both methods could be created as extension methods, there should not be any fundamental (neglecting metadata) differences between declaring either (or both) of the extension methods and having an "extension property".

      If my analysis is correct, then a potential approach would be to allow an extension construct where there was an explicit association of a property with a pair of methods. If either half of the pair is "null" or missing, then it would be a read-only or write-only scenario.

      Please feel free to contact me directly if you (or anyone else) wishes to discuss this matter....

      David V. Corbin [MVP]

      david (dot) corbin (at) dynconcepts (dot) com

    • @DrBlaise:

        public MyAnimal()
        {
            SetMammalAndMilk(out animal);
        }

        Animal animal;

        void SetMammalAndMilk(out Mammal mammal)
        {
           .........
           .........
        }

      Not every animal is a mammal. Is the above a valid derivation relationship?
