Probably the most conspicuous and eyebrow-raising change between the original and revised designs of the .NET support within C++ is the change in the declaration of a .NET reference type:


// original language

Object * obj = 0;


// revised language

Object ^ obj = nullptr;


There is actually a great deal of significance to that change - not enough to justify a book, certainly, but more than enough to warrant a first entry in a blog. So, here we go.


There are two primary questions that get asked when people see this: Why the hat (as the ^ is affectionately called along the corridors here within Microsoft)? But, more fundamentally, why any new syntax at all? Why couldn't the original language design have been cleaned up less invasively, rather than with the admittedly in-your-face strangeness of the revised language design, various aspects of which will be the topic of this blog?


So, the original design challenge of supporting .NET within C++ is that the object models of the two are very, very, well, really very different. Really. It is kind of like comparing Kansas with Oz. I suppose I need to support that, even in something as informal as a blog. Ok, let's see what I can do here.


C++ is built upon a machine-oriented systems view. Although it supports a high-level type system, there is always an escape mechanism, and those mechanisms always lead down into the bowels of the machine - you cast away type, you suppress the virtual mechanism, you turn names into relatively fixed machine addresses. Think Neo and the Matrix, if you are not into Dorothy and find Kansas a bit too, well, native. C++ supports a static (that is, compile-time) physics - even when we push the envelope and poke into the run-time, it is all strictly set up at compile-time: not just the virtual mechanism, but the exception-handling model as well. When push comes to shove, and the user is hard-pressed to pull a rabbit out of the hat, she tunnels under the program abstractions, picking apart types into addresses and offsets. (Think of Luke turning the computer-aided target system off in the original Star Wars and trusting to the force - that's your essential C programmer, and those that don't believe, can't believe you're really doing that. Don't you realize the Universe is at stake?)
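To make the "escape mechanism" point concrete, here is a minimal sketch (the class names `B` and `D` are mine, purely for illustration) of one such escape hatch: the scope operator suppressing the virtual mechanism, binding the call at compile time rather than run time.

```cpp
#include <string>

// B and D are hypothetical illustration types, not from any real library.
struct B {
    virtual std::string who() const { return "B"; }
    virtual ~B() = default;
};
struct D : B {
    std::string who() const override { return "D"; }
};

// The scope operator tunnels under the abstraction: the call is
// statically bound to B::who, suppressing the virtual mechanism.
inline std::string call_suppressing_virtual(const D& d) {
    return d.B::who();
}

// The ordinary call goes through the virtual mechanism.
inline std::string call_virtually(const B& b) {
    return b.who();
}
```

Everything here is decided at compile time, even the virtual dispatch table that `call_virtually` consults: strictly static physics.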


.NET is built atop a software component layer, in this case called the Common Language Runtime (CLR). This is a fundamental design solution for really difficult problems. Back when I began programming, the best advice I received - this was back at Bell Labs - was, if you don't know how to solve it, add a pointer - that is, add a level of indirection, and that will give you enough slack to do something smart. Well, that was ... gosh, that was a long time ago. (It was about the same time that making overheads for a talk meant manually attaching blank acetate to a printed page of text and hand-feeding it through a toaster that burned the text onto the acetate, which you caught before it fell to the ground, separated, then began the process over again. Ok. I digress.)


The C++ equivalent of adding a pointer is defining a class - that is, adding a layer of abstraction. This is what the STL does with iterators in order to transparently support walking across contiguous vectors and non-contiguous lists. The trade-off is two-fold: an additional layer of complexity and a possible loss of efficiency are the usual suspects in terms of downside. The upside is increased flexibility and, at times, elegance bordering on beauty. This is what .NET is about, but in this case, the level of abstraction is a runtime software level. It supports a dynamic, run-time physics. When push comes to shove, the user reflects upon the execution environment, querying, coding, and creating objects literally out of thin air. Instead of tunneling, one soars, but the experience can be unsettling to those used to having both feet on the ground.
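The iterator point can be shown in a few lines: one generic algorithm walks both a contiguous vector and a non-contiguous list, with the iterator abstraction hiding the difference (the function name `sum` is mine, for illustration only).

```cpp
#include <list>
#include <vector>

// One generic algorithm serves both containers; the iterator layer
// of abstraction hides contiguous vs. non-contiguous storage.
template <typename Iter>
int sum(Iter first, Iter last) {
    int total = 0;
    for (; first != last; ++first)
        total += *first;
    return total;
}
```

A call such as `sum(v.begin(), v.end())` compiles down to pointer arithmetic for a vector and to link-chasing for a list, yet the source is identical.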


If we give the C++ static object model the code-name Kansas, and the .NET dynamic object model the code-name Oz, then the work of the language design is to bridge these two worlds both within the C++ language and within the C++ community at large. Neither of these was an easy task.


Let me engage in a bit of poetic license for a moment [I'm permitted that; I have an MFA in writing packed in a box somewhere in a garage or basement here in the Northwest, or back in Los Angeles, or even further back in New York City - this is called a biographical aside [I've been told by Martyn Lovell that these are permitted in blogs [but only with three levels of parenthetical nesting!]!]!]. So, I'm claiming, in a fit of poetical musing, that C++ and .NET represent, respectively, the physics of Kansas and Oz. For example, you can get from here to there under the Oz semantics by clasping one's hands together, squeezing one's eyes tightly shut (no peeking, please, all addresses are encapsulated), and just wishing. And poof. You are transported, and all references to your location are updated without you having to do a thing. In Kansas, however, either you use public transportation, your own vehicle, or you cast all that aside and you walk. If you don't physically update everyone with your new location, you will be lost to them. That's how it works down here in Kansas.


  • So, the question becomes, how do we get our Dorothy to become a successful resident of Oz while not offending her Kansas sensibilities?


  • Or, alternatively, how can we allow her to move between the two worlds, rather than feeling each to be the shadowy dream world of the other?


[End of poetic aside. You can start reading again.]


Let's consider this from a more prosaic viewpoint. [After all, an MFA, while a mighty fine achievement, is a terminal degree. Trust me on that.] So, what does it mean when we write the following?


            T t;


Well, in ISO C++, regardless of the nature of T, we are certain of the following characteristics: (1) there is a compile-time memory commitment of bytes associated with t equal to sizeof(T); (2) the memory associated with t is independent of all other objects within the program during the extent of t; (3) the memory directly holds the state/values associated with t; and (4) this memory and state persist for the extent of t.
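A minimal sketch of characteristics (2) and (3) - the struct `T` and helper `copy_then_mutate` below are mine, for illustration only:

```cpp
// A value type: `T t;` commits sizeof(T) bytes that directly hold
// t's state, independent of every other object in the program.
struct T { int value; };

inline int copy_then_mutate(const T& original) {
    T copy = original;   // an independent, sizeof(T)-byte copy
    copy.value = 7;      // mutating the copy leaves the original alone
    return copy.value;
}
```

After the call, the original object still holds its own state - no sharing, no aliasing, no surprises.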


What are some of the consequences of these characteristics? Item (1) tells us that t cannot be polymorphic. That is, a polymorphic type cannot have a compile-time memory commitment except in the trivial case in which derived instances do not impose additional memory requirements. This is true regardless of whether T is a primitive type or serves as a base class to a complex hierarchy. Polymorphism in C++ is only possible when the type is qualified with a token modifying its meaning - either T* representing a pointer, or T& representing a reference - such that the object of the declaration only indirectly refers to an object of T. (Those hostile to C++ find this a huge laughing point, gleefully pointing out the naïve slicing errors that occur when an object (that is, an unqualified T) is used as the polymorphic target of an assignment or initialization.) [This observation is actually not an aside, but a set-up for motivating the introduction of the hat (^) notation.]
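The slicing error and its pointer-based cure can be shown side by side - the `Base`/`Derived` hierarchy here is a hypothetical illustration:

```cpp
#include <string>

// Hypothetical hierarchy, purely for illustration.
struct Base {
    virtual std::string name() const { return "Base"; }
    virtual ~Base() = default;
};
struct Derived : Base {
    std::string name() const override { return "Derived"; }
};

// An unqualified Base object slices: only the Base subobject
// of d is copied, and the Derived part is lost.
inline std::string sliced_name() {
    Derived d;
    Base b = d;        // the naïve slicing initialization
    return b.name();   // statically Base; no polymorphism
}

// Qualify the type with * (or &) and the dynamic type survives.
inline std::string polymorphic_name() {
    Derived d;
    const Base* pb = &d;
    return pb->name();
}
```

The unqualified object answers as a Base; only the pointer (or reference) answers as a Derived.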


This separation of value and reference within a single notational type system was a deliberate design decision by Bjarne Stroustrup in the late 1970s, based on his doctoral experience at Cambridge using Simula-67, in which all objects are allocated on the runtime heap and all object access is indirect through a transparent handle. At the time, the Simula object model proved prohibitively expensive for the then-current machines and resource availability. [By prohibitively expensive I mean that the required work could not be carried out in the time allotted.]


[A brief digression on pointers and references that will later prove to be relevant to the introduction of the hat (^) syntax for .NET reference types]


To delay resource commitment until run-time, two forms of indirection are explicitly supported in C++:


  1. Pointers: T *pt = 0;
  2. References: T &rt = *pt; // oh, well ...


Neither form is well-behaved under the model supported by traditional OO languages (again, to their followers' great amusement).


Pointers conform to the C++ Object Model. In


            T *pt = 0;


pt directly holds an address - a value of fixed size and extent. Lexical cues are used to toggle between the direct use of the pointer and the indirect use of the object addressed. It can be unclear at times which mode applies to what and when or how (and this is a form of celebrated obscurantism, etched within code as a kind of tattoo proving mental toughness): *pt++;
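Unpacked, that one-liner mixes both modes in a single expression - a small sketch, with an illustrative helper name of my own choosing:

```cpp
// *pt++ toggles modes: dereference pt (indirect use of the object
// addressed), then increment the pointer itself (direct use).
inline int deref_then_advance(const int* pt) {
    int first = *pt++;   // reads pt[0], then pt moves to pt[1]
    return first + *pt;  // reads pt[1]
}
```

The postfix increment binds tighter than the dereference, so the object read is the one at the *old* address - exactly the kind of detail one must carry in one's head.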


References provide syntactic relief from the lexical complexity of pointers while retaining their efficiency:


Matrix operator+( const Matrix&, const Matrix& );

      Matrix m3 = m1 + m2;


References do not toggle between a direct and an indirect mode; rather they phase-shift between the two: (a) at initialization, they are directly manipulated, but (b) on all subsequent uses, they are transparent.


In a sense, a reference represents a quantum anomaly in the physics of the C++ Object Model: (a) they take up space but, except for temporary objects, they are immaterial, (b) they exhibit deep copy on assignment and shallow copy on initialization, and (c) unlike const, they really are immutable. While they are not all that useful within ISO C++, except as function parameters, they turn out to be the inspirational pivot upon which the language revision turns.
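The phase-shift can be demonstrated in a few lines - the function `phase_shift` is a hypothetical name of my own:

```cpp
#include <utility>

// A reference is manipulated directly only at initialization; every
// later use is transparent, operating on the referent. Assignment
// writes through to x and never rebinds r.
inline std::pair<int, int> phase_shift() {
    int x = 1, y = 2;
    int& r = x;      // (a) direct: r is bound to x, once and forever
    r = y;           // (b) transparent: copies y's value (2) into x
    r = 5;           // writes 5 into x; y is untouched
    return {x, y};   // x == 5, y == 2
}
```

There is no syntax at all for rebinding r to y after initialization - that immutability is the point of item (c) above.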



The C++.NET Design Challenge


Literally, for every aspect of the C++ extensions to support .NET, the question always reduces to "How do we integrate this (or that) aspect of the Common Language Runtime (CLR) into C++ so that it (a) feels natural to the C++ programmer, and (b) is easy to use in its own right under .NET?" I like to call this the Janus face dilemma. (Janus is a two-faced Roman deity, one face turned towards what has just been, the other towards what is to be.)



The Reader Language Design Challenge


So, to give you a flavor of the process, here is the challenge: How should we declare and use a .NET reference type? It differs significantly from the C++ Object Model: a different memory model (garbage collected), different copy semantics (shallow copy), and a different inheritance model (monolithic, rooted in Object, supporting single inheritance only, with additional support for interfaces).
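As a rough ISO C++ analogy - emphatically *not* how C++/CLI implements the hat - a .NET reference type behaves like a shared handle to heap storage: copying the handle is shallow, and lifetime is managed for you (here by reference counting; in .NET, by the garbage collector). The `Person` type and demo function below are hypothetical, for illustration only.

```cpp
#include <memory>
#include <string>

// Hypothetical type, standing in for a .NET reference type.
struct Person { std::string name; };

inline std::string shallow_copy_demo() {
    auto a = std::make_shared<Person>(Person{"Dorothy"});
    auto b = a;        // shallow copy: b and a refer to one object
    b->name = "Oz";    // the change is visible through a as well
    return a->name;
}
```

Contrast this with the value semantics of `T t;` earlier: there, copying produced an independent object; here, copying produces another name for the same one.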


So, in the traditional cliff-hanger, whiz-bang, multi-part-installment fashion, I leave it to you, until Part 2 [after the Turkey roosts], to think about just how we will integrate support for the .NET reference type within ISO C++. [Hint: Version 1 chose to represent it as a pointer. The general consensus is that this doesn't offer a first-class programming experience.]



disclaimer: This posting is provided "AS IS" with no warranties, and confers no rights.