Finishing the Hat (^)

Once it became clear that support for .NET within C++ represented a distinct programming paradigm, it followed that the language needed to be extended to provide both a first-class coding experience for the user and an elegant design integration with the ISO C++ standard, in order to respect the sensibilities of the larger C++ community and engage their commitment and assistance. It also followed that the diminutive name of the original language design, the Managed Extensions for C++, had to be replaced as well.

    The flagship feature of .NET is the reference type, and its integration within the existing C++ language represented a proof of concept.  What were the general criteria? We needed a way to represent the .NET reference type that both set it apart and yet felt analogous to the existing type system. This would allow people to recognize the general category of form as familiar while also noting its unique features. The analogy is the introduction of the reference type by Stroustrup in the original invention of C++. So the general form becomes

 

      Type TypeModToken Id [ = init ];

 

where TypeModToken would be one of the recognized tokens of the language reused in a new context (again, similar to the introduction of the reference).

    This was surprisingly controversial at first, and still remains a sore point with some users – which, recall, is the motivation for this initial set of blog entries. The two most common initial responses I recall are (a) I can handle that with a typedef, wink, wink, and (b) it’s really not so bad. [The latter reminds me of my response to the use of the left and right shift operators for input and output in the iostream library.]

    The necessary behavioral characteristic is that it exhibit object semantics when operators are applied to it, something the original syntax was unable to support. I liked to call it a flexible reference, thinking in terms of its differences with the existing C++ reference [yes, the double use of the term reference here – one referring to the .NET reference type and the other referring to the “it’s not a pointer, wink, wink” native C++ type – is unfortunate, much like the reuse of template in the Gang of Four Patterns book for one of my favorite design strategies.]:

 

1.      It would have to be able to refer to no object. The native reference, of course, cannot do that directly although people are always showing me a reference being initialized to a reinterpret-cast of a 0. [The conventional way to have a reference refer to no-object is to provide an explicit singleton representing by convention a null object which often serves as a default argument to a function parameter.]

2.      It would not require an initial value, but could begin life as referring to no object.

3.      It would be able to be reassigned to refer to another object.

4.      The assignment or initialization of one instance with another would exhibit shallow copy by default.

 

    As a number of folks made clear to me, I was thinking of this puppy backwards. That is, I was referring to it by the qualities that distinguished it from the native reference, not by the qualities that distinguished it as a handle to a managed .NET reference type. We want to call the type a handle rather than a pointer or reference because both of these terms carry baggage from the native side. A handle is the preferred name because it is a pattern of encapsulation – someone named John Carolan first introduced me to this design under the lovely name of the Cheshire Cat since the substance of the object being manipulated can disappear out from under you without your knowledge. In this case, the disappearing act results from the potential relocation of reference types during a sweep of the garbage collector. What happens is that this relocation is transparently tracked by the runtime, and the handle is updated to correctly point to the new location. [This is actually a complicated functionality to provide in a static language like C++. Of course, on the other hand, it is expensive, and can be disrupted by an unconstrained poking into the managed heap. This is why the pointer concept doesn’t strongly translate into the .NET object model – at least at the user level.]

    So, the new reference type in the revised language design is referred to as a tracking handle, and exhibits the four qualities listed above. In the following three tracking handle declarations at global scope,

 

      Object^ obj; // a declaration of a tracking handle to a .NET Object

      Object^ poly = gcnew FooBar;

      Object^ obj2 = poly;

 

obj is a tracking handle of type Object that refers to no Object, and is by default set to null. In local scope, the equivalent declaration looks as follows:

 

            Object^ obj = nullptr; // local objects are not default initialized

 

poly is a tracking handle of type Object that is initialized to a FooBar object allocated on the managed heap. Because the language now supports two dynamic heap memories – the native heap, which is not garbage collected, and the managed heap, which is – a separate new expression is used to allocate memory from each. In the revised language design, a new keyword, gcnew, is added to allocate an object from the .NET managed heap. For example, here is an old and new allocation of a System::Windows::Forms::Button object:

 

Button __gc *button1 = __gc new Button(); // original syntax, using the explicit form

Button^ button1 = gcnew Button; // revised syntax

 

    This is admittedly not a compelling example for introducing a new keyword distinguishing where the memory allocation takes place. In this example, it is clear from the context that the allocation is of a .NET reference type and that it should take place on the managed heap. But the language designers have a deeper vision of type unification between the native and managed parts, and the introduction of gcnew facilitates that. You’ll have to trust me on this for now. [Note, by the way, that the tracking handle (^) modifier is not required on the type named in the gcnew expression.]

    Finally, the initialization of the obj2 tracking handle with poly does not result in a member-wise copy, as it would under the ISO C++ Object Model, but results in a shallow copy so that both obj2 and poly refer to the same FooBar object.

    The need for an explicit entity to indicate that a tracking handle refers to no object is a side-effect of the change in type representation. The initialization or assignment of 0 no longer indicates a null address. For example,

 

            obj = 0; // causes the implicit boxing of 0, not the assignment of obj to address no object

 

This raises a subtle issue with the porting of existing Managed Extensions for C++ code into the revised language design. For example, consider the following value class declaration:

 

// the original language syntax
__value struct Holder
{
      Holder(Continuation* c, Sexpr* v)
      {
            cont = c;
            value = v;
            args = 0;
            env = 0;
      }

private:
      Continuation* cont;
      Sexpr* value;
      Environment* env;
      Sexpr* args __gc [];
};

 

Because both args and env are managed reference types, their initialization to 0 in the constructor cannot remain unchanged in the transition to the new syntax, but must be changed to nullptr [note that this translation is automated in a tool currently under development]:

 

// the revised language syntax
value struct Holder
{
      Holder( Continuation^ c, Sexpr^ v )
      {
            cont = c;
            value = v;
            args = nullptr;
            env = nullptr;
      }

private:
      Continuation^ cont;
      Sexpr^ value;
      Environment^ env;
      array<Sexpr^>^ args;
};

 

Similarly, tests that compare those members to zero must be changed to compare them to nullptr. Here is the original syntax,

 

// the original language syntax
Sexpr* Loop (Sexpr* input)
{
      value = 0;
      Holder holder = Interpret(this, input, env);
      while (holder.cont != 0)
      {
            if (holder.env != 0)
            {
                  holder = Interpret(holder.cont, holder.value, holder.env);
            }
            else if (holder.args != 0)
            {
                  holder = holder.value->closure()->apply(holder.cont, holder.args);
            }
      }
      return value;
}

 

And here is the translation into the new syntax, again generated automatically by a translation tool under development within our group:

 

// the revised language syntax
Sexpr^ Loop ( Sexpr^ input )
{
      value = nullptr;
      Holder holder = Interpret( this, input, env );
      while ( holder.cont != nullptr )
      {
            if ( holder.env != nullptr )
            {
                  holder = Interpret( holder.cont, holder.value, holder.env );
            }
            else if ( holder.args != nullptr )
            {
                  holder = holder.value->closure()->apply( holder.cont, holder.args );
            }
      }

      return value;
}

 

    So, the final item I wish to mention about the new tracking handle syntax is the member selection operator. To me, it seemed like a no-brainer to use the object syntax (.). Others felt the pointer syntax (->) was equally obvious, and we argued our position from different facets of a tracking handle’s usage:

 

// the pointer no-brainer

T^ p = gcnew T;

 

// the object no-brainer

T^ c = a + b;

 

    So, as with light in physics, a tracking handle behaves in certain program contexts like an object and in other situations like a pointer. The member selection operator that was chosen is the arrow (->), as in the original language design.

    In the next series of entries, I will walk through the changes in the language design, contrasting the original and revised language support for the various .NET features.

 

 

disclaimer: This posting is provided "AS IS" with no warranties, and confers no rights.