Stan Lippman's BLog

C++/CLI

  • The Name Return Value Optimization

    I received the following mail the other day in response to my writing that the Visual C++ compiler has [finally!] implemented the named return value optimization (NRVO) in the Visual Studio 2003 release:

     

    Sender: James Slaughter

     

    I was in the process of catching up with your blog entries … when I spotted your claim that VC 7.1 implements the named return value optimization. I've so far failed to write code that takes advantage of NRVO, with any optimisation settings that I know of. Are you mistaken, or have I overlooked something? I realise that this could be a breaking change for badly written code, so is there an un(der)-documented compiler switch?

     

    The fact that James has to ask -- that there is no source level way to be certain whether the optimization is or is not being applied -- underscores what I believe is a fundamental mistake we made in its provision, and why I would urge the new ISO C++ committee to revisit the whole issue. Let me give some background.

     

    Given a function that returns a class object by value, such as the Matrix addition operator:

    Matrix operator+( const Matrix &m1, const Matrix &m2 )
    {
        Matrix sum;
        // default Matrix constructor applied

        // do the math

        return sum;
    }

    the function is generally transformed internally into

    // Pseudo C++ Code:
    // internal transformation of function

    void operator+( Matrix &_retvalue,
                    const Matrix &m1, const Matrix &m2 )
    {
        Matrix sum;

        // invoke default constructor
        sum.Matrix::Matrix();

        // do the math

        // copy construct sum into return value
        _retvalue.Matrix::Matrix( sum );
    }

    The complaint with this code is apparent on inspection: the final copy construction of the local Matrix object is unnecessary. It would be more efficient to compute the result directly into the internally generated return value.

     

    This inability of the programmer to return a class object efficiently was considered a weakness of the C++ language. One proposed solution, by Michael Tiemann, was a language extension that names the class object returned by the function. For example,

    Matrix
    operator+( const Matrix& m1, const Matrix& m2 )
    name result
    {
        // no longer write: Matrix result;
        //...
        // no longer write: return result;
    }

    The compiler would then rewrite the function internally to take a third reference parameter:

    // internally rewritten function
    // under proposed language extension

    void
    operator+( Matrix &result,
               const Matrix& m1, const Matrix& m2 )
    {
        // directly compute into result
    }

    and transform all uses of the function to compute the result directly in the target class object. For example,

    Matrix c = a + b;

    becomes transformed internally into

    Matrix c;

    operator+( c, a, b );

    In his presentation of this language extension, Michael claimed, without support, that the compiler could not consistently carry out this optimization without this explicit syntactic cookie. Others in the community took exception to this and claimed that the optimization could be provided without extending the language. [This was discussed at some length in the commentary of Section 12 of Ellis and Stroustrup’s Annotated C++ Reference Manual.] Walter Bright was the first to implement the NRVO, in his Zortech C++ compiler, without Michael’s extension. Subsequently, Nancy Wilkinson implemented it in a later release of cfront, and these were taken as demonstrations that a language extension was not needed.

     

    This name return value extension never became part of the language, but the optimization did. It was realized that a compiler could recognize the return of the class object and provide the return value transformation without requiring an explicit language extension, provided that all the exit points of a function return the same named object. For example, given a function of the general form

    classType
    functionName( paramList )
    {
        classType localResult;
        // do the work ...
        return localResult;
    }

    the compiler transforms both the function and all its uses internally to the form

    void functionName( classType &namedResult, paramList )
    {
        // still must construct result object
        // replacing the call on the local object …
        namedResult.classType::classType();

        // directly compute into namedResult
    }

    eliminating both the return by value of the class object and the need to invoke the class copy constructor. To trigger it, the class object returned must be the same named object at each return point within the function. To give a sense of the performance gain, here is a measurement I made quite some time ago on a piece of code that returned objects heavily [timing is based on the UNIX timex command]:

       Not Applied      Applied  Applied + -O

            1:48.52        46.73          46.05

     

    As you can see, what’s right with the optimization is the performance gain with little or no additional work. What’s wrong with the optimization is that you don’t know if it will or will not be triggered, and this is a real problem. For example, under cfront, it was only triggered at the top level of the local function. If you nested it within a local block, it was not generated; this surprised people. What surprised people even more was that, up until recently, the two most widely used compilers, Visual C++ and GNU C++, did not provide it. I have been told that Visual C++ 7.1 does implement it, but James cannot verify that by an inspection of the source-level language or of the compiler itself, apparently.

    There are other side-effects as to whether this optimization is applied or not: you cannot inspect your source-level code and know whether there is a balance between the copy constructor and destructor of your class; therefore, any programming that hand-shakes between the two may or may not be semantically valid depending on whether the optimization is or is not turned on – this, I presume, is what James meant when he said,

    I realise that this could be a breaking change for badly written code

    What happened, I suspect, is that everyone got caught up with Michael’s assertion that the extension was required in order for the compiler to do its work. This seems to have been overstated and, once it was demonstrated not to be the case, the question of whether the extension was necessary was dropped. And yet it seems to me that the extension is necessary – but not for the reason Michael put forward. Rather, it is necessary in order to know at the source level whether or not the optimization is being applied.

    Michael’s solution is probably not sufficient, however, because there are cases in which one wishes to elide the copy constructor but to which one cannot pin a name, such as

                Matrix mat = Matrix( 4, 4 );

    In this case, no one wants a temporary 4x4 Matrix constructed and then copy-constructed into mat, but Michael’s syntax doesn’t necessarily cover this case. One possibility is to have the class designer place it on the copy constructor itself.

    This, by the way, is the reason that an initialization statement is able to eliminate a temporary object while an assignment cannot. For example, when we write,

    Matrix composite = m1 + m2;

    the compiler must internally transform this to match the transformation of the Matrix addition operator. One possible transformation is the following:

    // One possible internal transformation
    // Pseudo C++ Code

    Matrix temp;
    operator+( temp, m1, m2 );
    Matrix composite( temp );
    temp.Matrix::~Matrix();


    In order for this to work, of course, the definition of

    Matrix temp; // no associated constructor call

    must not be accompanied by an invocation of the default constructor. This is necessary, remember, because temp is constructed within the addition operator and we cannot initialize an object twice. temp, that is, must represent raw storage. While this transformation works, it can be considerably improved when the compiler recognizes that composite also represents raw storage. Rather than generating a temporary to copy construct into composite, it can pass composite directly into the addition operator:

    // A more aggressive internal transformation
    // Pseudo C++ Code

    Matrix composite;
    operator+( composite, m1, m2 );

    Again, this is possible because composite, at this point, represents raw storage. (The internal definition of composite, above, only allocates storage; the compiler suppresses an invocation of the default Matrix constructor, the same as it did when defining temp.) The assignment is a totally different puppy:

    composite = m1 + m2;

    composite, at this point, represents initialized storage. It cannot be directly passed into the addition operator, which would result in a second initialization. Rather, the compiler must generate raw storage, then copy assign the result to composite:

    // internal transformation of assignment
    // Pseudo C++ Code

    Matrix temp;
    operator+( temp, m1, m2 );
    composite.Matrix::operator=( temp );
    temp.Matrix::~Matrix();

    Because the C declaration syntax obliges the C programmer always to write

    Matrix composite;

    composite = m1 + m2;

     

    and typically carries that habit into C++, the C programmer tends to write code that is inexplicably inefficient while on the surface perfectly correct.

    So, how can we help James discover whether Visual C++, or any C++ compiler that he uses, implements the NRVO? Well, you need to write a piece of code in which the copy constructor is either present [indicating the absence of NRVO] or it is elided [indicating the NRVO having been applied]. One can either do this by executing the instrumented copy constructor [here I am, or let its silence be deafening], or by examining the generated assembly. For example,

    #include <iostream>
    using namespace std;

    class foo {
    public:
        foo() { cout << "foo::foo()\n"; }
        foo( const foo& ) { cout << "foo::foo( const foo& )\n"; }
        ~foo() { cout << "foo::~foo()\n"; }
    };

    foo bar() { foo local_foo; return local_foo; }

    int main()
    {
        foo f = bar();
    }

     

    When I execute this, without any special settings, under either Debug or Release, using the Visual Studio IDE, the output is

     

    foo::foo()
    foo::foo( const foo& )
    foo::~foo()
    foo::~foo()

     

    and this indicates that the NRVO is not being turned on by default, which is the only behavior that I would expect as a programmer. So, this helps underscore my belief that the ISO committee must revisit this decision.

  • Comment Response: Finding a Voice for a Blog

    A reader commented on my first blog entry, and my first writing about C++/CLI in over a year, stating the following

     

    RE: The Revised C++ Language Design Supporting .NET -- Part 1 12/2/2003 4:15 PM Tim Sweeney

    You poor, confused soul.

    This is actually a variation of Buzz Lightyear’s response to Woody in the opening of Pixar’s original feature film Toy Story. Woody has just tried to explain to Buzz that Buzz is really a toy and not really the star fleet commander whom he believes himself to be. Of course, for Woody, Buzz was deluded rather than simply confused.

     

    Shortly upon my joining Disney Feature Animation, a number of us flew up to Pixar to meet with them. I had been asked to develop a language along the lines of Pixar’s ML (Modeling Language), which they had described in a recent [or not so recent] Siggraph paper. When Disney Feature Animation first made use of CGI (Computer Generated Imagery) in The Great Mouse Detective back in the 1980s, Scott Johnson and others there at Disney invented a language they called EEK! [This is the sound people tend to make on first seeing a mouse, irrespective of whether it is wearing white gloves and a blue hat with stars.] This language was having difficulty scaling up to the success it paved the way for – that is, for the amount of CGI that was then being integrated within each cartoon [the idea was that in a 2D animation, the CGI elements should not call attention to themselves, and so anything too spectacular was always muted down until it ceased to call attention to itself]. One of the cool things up at Pixar was a screening of about 7 minutes of composited film – this was the toy soldier sequence where the soldiers are deployed to spy out the opening of the presents. A purely digital film was completely uncharted territory back then, and no one was quite sure if a main street audience would sit through 80 minutes of specular plastic however artfully rendered.

     

    The purpose of the initial blog was three-fold. On the surface, it was intended to motivate our redesign of what came to be called the Managed Extensions for C++ released with Visual Studio.NET. That is what the content is driving towards. Secondly, it was to put myself on the public blog because my name is recognized and would draw attention to our efforts with the revised language. That is, irrespective of the content, folks recognizing my name and curious would tune in. This is good. So, those are the corporate reasons for me to write a blog – part of why I’m on salary. Next comes the personal part. I have a good idea what I want to say, and a general sense of having a right to say it without being apologetic. So, the only degree of freedom left me is how it should be said. One might call that finding one’s voice. I stole shtick from the novelist William H. Gass’ experimental novella Willie Master’s Lonesome Wife [I am sure I damaged the title [much like I damaged Peter Jackson’s name [calling him Peter Jacobson [I was thinking of Ivar Jacobson, the methodologist [although I knew his first name was Peter [and my off-by-one error with Simula-67]]]]]] and the silly nesting of parentheses from John Barth from his text, Chimera [John allowed me as an undergraduate to attend his graduate fiction writing seminar [and so doing this sort of nonsense is an inside joke that only I appreciate]]. Certainly, Mr. Tim Sweeney did not appreciate it. And it didn’t feel right anyway; it was intended just to be an old man’s sloshing around with language.

     

    If you write books for a reputable publisher, they will be peer reviewed during the process of composition. Unfortunately, those peers are not always thoughtful or particularly gentle in their responses. I remember one reviewer of the first edition of my C++ Primer cursing me and darning the book to heck, and my only consolation was that he then complained that Kernighan and Ritchie were guilty of the same sins. And I thought, great; those sins I could live with. However, every response, however negative, is useful in the review process, because it is another data point. Mr. Sweeney reinforced my sense that if I really want to play literary games, I should write some literature, and in general keep things separate – this is a lesson Huck Finn’s aunt fails to teach him in the opening of what Hemingway called the first American novel, but I think I’ve taken it to heart, finally. I think it marred my third edition of the Primer and my C# Primer. The problem is that I would rather write fiction but felt I could not afford to, and so it seeped into inappropriate texts.

     

    I used to have a public web site on which I kept old articles, alternative versions of chapters of my books, editorials from the C++ Report, source code, and other miscellaneous data and interfaces. I added a guest book at one point to allow users to sign in. I was aghast one day to discover that a pornographic site had decided to use my guest book as an advertising strategy and, startled, I just nuked the whole thing. In some unfair way, I feel the same about Mr. Sweeney.

     

     

  • Making a Virtual Table Context-Sensitive

    In part 1 of this discussion [the January 28th Blog entry], I pointed out a different behavior between the C++ and CLR Object Models with regard to the identity of a derived polymorphic object during the construction of its base class sub-objects. Under the CLR object model, the embryonic derived object is always treated as an instance of its class within each base class constructor; therefore, any virtual function invocations within the call chain initiated within the base class constructor resolve to the derived class instance. Under ISO C++, the embryonic derived object is in turn treated as an instance of its base class sub-object currently under construction; therefore, each virtual function invocation within the call chain initiated within the base class constructor resolves to the instance active for an object of that base class and not the instance active for the derived class. I illustrated this as follows:

     

    creating a native foobar : bar : foo {} instance ...

    inside nat_foo ctor: i am talking as a nat_foo ...

    inside nat_bar ctor: i am talking as a nat_bar ...

    inside nat_foobar ctor: i am talking as a nat_foobar ...

     

    creating a managed foobar : bar : foo {} instance ...

    inside foo ctor: i am talking as a managed foobar ...

    inside bar ctor: i am talking as a managed foobar ...

    inside foobar ctor: i am talking as a managed foobar ...

     

    and pointed out that were the derived instance of the virtual function to access a state member or resource of the derived class object, the results would be undefined since the derived class object’s state members and resources have yet to be constructed.

     

    There is run-time overhead in the maintenance of the C++ Object Model semantics, and I thought I would explore that in this follow-up entry. There are two problem cases to solve: (1) the direct invocation of a virtual function within the constructor [easy], and (2) the indirect invocation of a virtual function through an arbitrary call chain [hard]. The first case is easy because we can always resolve it statically: that is, within the nat_foo constructor, always resolve an invocation of the virtual talk_to_me() instance to that defined within the nat_foo class. So it is the second case only that warrants further discussion.

     

    Consider the following class hierarchy:

     

    class Base {
    public:
        virtual void bar();
        void foo() { bar(); }

        Base() { foo(); } // 1
    };

    class Derived : public Base {
    public:
        virtual void bar();  // overrides Base::bar()

        Derived() { foo(); } // 2
        // …
    };

    int main()
    {
        Base *pbase = new Derived; // invokes (1) and then (2)
        pbase->foo(); // 3
    }

     

    The behavior that the ISO C++ compiler has to support is context sensitive. The invocation of foo() is always resolved to Base::foo(), a non-virtual inline method that invokes the virtual bar() defined in both Base and Derived. The actual instance of bar() invoked within foo() is determined by the implicit this pointer, so you can imagine the call being expanded to look something like this:

     

    this->bar();

     

    where the actual type of the this pointer at each call point determines which instance of bar() is invoked. Base::foo() is invoked three times, marked in the program by // 1, // 2, and // 3, and these invocations result in the following instances of bar() being called:

     

    1. The Base constructor is invoked to initialize the sub-object of the Derived class object allocated in the first line of main(). It invokes foo() which in turn resolves to the Base::bar() virtual method being invoked.

    2. The Derived constructor is invoked subsequent to the completion of the Base constructor, and completes the initialization of the Derived class object allocated in the first line of main(). It invokes foo() as well which in turn resolves to the Derived::bar() virtual method.

    3. The invocation of foo() directly within the second line of main() through the polymorphic pbase object results in the invocation of Derived::bar().

     

    So, the implementation difficulty is that within foo() we only want to suppress the virtual mechanism if foo() is invoked during the execution of a constructor. In this special case, we always invoke the instance of bar() associated with the class of the executing constructor and not for the class of the type under construction. [That is, when foo() is invoked within the Base constructor, the Base::bar instance should always be invoked. Similarly, when foo() is invoked within the Derived constructor, the Derived::bar instance should always be invoked.] Otherwise, the regular virtual function call resolution should kick in. [That is, within main(), the instance of bar() being invoked is determined by the type of object addressed by pbase.]

     

    So, how might we do that? One possible solution is to introduce a global entity which either points to the class whose constructor is being executed or is set to 0. foo() is then rewritten by the compiler to test the entity and, if non-zero, to invoke the instance associated with the pointed to class. If the entity is 0, the call would then go through the normal virtual mechanism. This would have a rather significant impact on the program and, in fact, is not a preferred solution.

     

    So, what can we do? Well, rather than trying to sensitize each and every call point, why not try to sensitize the virtual mechanism itself to be aware of whether or not the call is originating from within a constructor? How might we do that?

    Consider for a moment what actually determines the virtual function set active for a class:  the virtual table.  How is that virtual table accessed, and therefore the active set of virtual functions determined?  By the address to which the virtual table pointer [vptr] within the polymorphic object is set. To control the set of active functions for a class, therefore, the compilation system need simply control the initialization and setting of the vptr.  (It is the compiler’s responsibility to set the vptr, of course, not something the programmer need or should worry about.) 

    The solution is to set the vptr within each class constructor after invocation of its base class constructors, but prior to the execution of user-provided code or the expansion of data members initialized within the member initialization list. In this way, within each base class constructor throughout the class hierarchy, the derived class object under construction literally becomes an object of that base class for the duration of the base class constructor. This is how the Derived class object becomes a Base class object within the Base class constructor and back to a Derived class object within its own constructor. Within each base class constructor, it is indistinguishable from a complete object of the constructor’s class. For derived class objects, as I suggested in part 1, ontogeny recapitulates phylogeny.

    The general algorithm of constructor execution is as follows:

    1. Within the derived class constructor, all virtual base class constructors and then all immediate base class constructors are invoked.

    2. That done, the object’s vptr(s) are initialized to address the associated virtual table(s).

    3. The member initialization list, if present, is expanded within the body of the constructor. This must be done after the vptr is set in case a virtual member function is called.

    4. The explicit user-supplied code is executed.

    The vptr always has to be set within the constructor of the actual class of the object being initialized, so that in itself is a necessary overhead of the virtual mechanism. This is why bitwise copy semantics or the use of memset/memcpy is not permissible for classes in which a vptr is present. The additional runtime overhead in support of this type evolution within sub-object construction is the resetting of the vptr within each base class constructor as well. Again, the purpose of this is to reassign the active virtual table for virtual method resolution within the execution of each base class constructor. This is the implementation variation within the two object models under discussion.

  • Smalltalk and C++ -- a Reader Comment and Response

    A reader, Chris Hanson, writes:

     

    re: A Fundamental Difference in Class Behavior between the Native and Managed Object Model

     

    In other words, Managed C++ actually behaves like an object-oriented language, such as Smalltalk or Objective-C.  An instance of a class is an instance of that class, no matter where in the hierarchy a particular method is located.

     

    Hmm. While there is very little content to respond to in this comment and a great deal of attitude, it does reflect an interesting sentiment in the ‘pure’ OO community that has never been terribly pleased with the hybrid C++ language. Let’s see what if anything of interest I can make out in response to this comment. [I make no guarantees.]

     

    Is Smalltalk the first object-oriented language and the measure of all things OO? No. Before Smalltalk, there was Simula-67, which Bjarne Stroustrup used as a doctoral student at Cambridge back in the 1970s, and which as a student in the Columbia University computer science department I used in Programming Language Concepts, which also covered Algol, Algol68, Lisp, and Prolog – Ada was still a ROYGBIV candidate competition, and C was mentioned simply to contrast its parameter passing semantics with that of Algol. Smalltalk is a branch in a much larger tree.

     

    Stroustrup’s use of Simula at Cambridge was a mixed bag – I don’t mean to speak for Stroustrup, but use his experience as I remember hearing of it to score a contrast between the Smalltalk school of OO as championed by our Mr. Hanson, and the school represented by C++. While Simula modeled the distributed OS in most excellent fashion, its execution was dismal. Stroustrup exhausted his budgeted cpu-time before his program had generated sufficient data. Casting about for some alternative computing resource, he made use of a corner machine using the BCPL corner language. [Basic Combined Programming Language, the predecessor to Thompson’s B and Ritchie’s C languages. The arcane question we posed to new hires at Bell Labs, before C++, of course, was: given B, and then C, what should the successor language to C be called? Of course, to Walter Bright’s chagrin, it should be called P, not D.] This required hand-translating all the nifty class abstractions to the level of bytes and words, but … well, you know the rest.

     

    Smalltalk is a dynamic language making extensive use of the run-time, and its performance reflects that. However, it provides considerably more flexibility than C++. Stroustrup constrained the C++ Object Model by essentially curbing its reliance on run-time services. What he sacrificed in flexibility he made up for with extraordinary gains in performance. C++ is used in real-time virtual reality simulations [I’m familiar with the brilliant Aladdin ride developed within Disney Imagineering, for example], in computer graphics imagery, and in aviation.

     

    So, Smalltalk and C++ are two very different languages attempting to solve very different problems. Smalltalk, while visionary, was not widely used, and what use it had has been displaced by the C++-influenced Java. C++ use is pervasive: the C# compiler released under Visual Studio, for example, is implemented in C++. The three major operating systems, Windows, Linux and the Macintosh X, are all implemented in a mixture of C and C++. The major modeling and animation software that fuels the film industry and brings us wonders such as The Lord of the Rings is all powered by C/C++. That tickles me no end. C++ has brought class to C, and it underpins our entire sophisticated software infrastructure.

     

    So C++ has been wildly successful both in its usage and, interestingly, in its influence: Java and C# are refinements of C++ that attempt to put back some of the dynamic nature of the Smalltalk vision. Those of us involved in the definition of C++/CLI believe we have done a better job of it than either of those two languages, but what else would you expect us to say? Better to say, rather, that it is our intent to have C++/CLI go C# one better. There’s nothing negative in that. Hopefully that represents a constructive competition that will benefit both languages and their user communities.

     

    I have in the past joked that .NET is Smalltalk’s revenge, and Mr. Hanson’s comment, I believe, bears some small witness to the truth in that. To that degree, Visual C++ is the only language that integrates the performant hybrid OO of ISO C++ with the dynamic OO purity of C++/CLI. My point in the blog was to bring attention to the difference and mildly suggest the non-intuitive nature of the pure OO model, because before an object is initialized, contrary to Mr. Hanson’s assertion, it does not exhibit the state or behavior of an entity of that class. The sub-object partial construction of a class entity is a singularity that falls outside Mr. Hanson’s claim that “an instance of a class is an instance of that class”.

     

     

  • A Fundamental Difference in Class Behavior between the Native and Managed Object Model

    I have used two primary metaphors in my discussions of bringing C++ to the .NET platform. In terms of the different physics of the two object models, I have distinguished them as Kansas and Oz, and claimed with little apparent success that the physics of the two are very different. But the biology of the two object models is also different. In native C++ [hold on to any loose items now – this ride may be bumpy!] ontogeny recapitulates phylogeny.

     

    Say what?

     

    Let me present the definitions necessary, and then the proof.

     

    ·         Ontogeny: The origin and development of an individual organism from embryo to adult. In our case, the construction of an individual object of a derived class.

    ·         Phylogeny: The evolutionary development and history of a species or higher taxonomic grouping of organisms. In our case, the inheritance hierarchy of the class.

    ·         Ontogeny recapitulates phylogeny: The development of the individual organism repeats the adult stage of its ancestors.

    In our case, the development of the individual object repeats the adult stage of each of its ancestor classes. I know, you’re saying, Lippman, is this really necessary? In a strange way, I am going to claim that it is. Bear with me.

    Consider the following native C++ class hierarchy:

    class nat_foo {};

    class nat_bar : public nat_foo {};

    class nat_foobar : public nat_bar {};

     

    What do we know about the order of constructor calls when we create either a nat_bar or nat_foobar object? The rule is that the base class constructor is invoked prior to the invocation of the derived class constructor. Thus, for a nat_foobar object, the order of constructor execution is: nat_foo, nat_bar, and nat_foobar. How is this achieved? The compiler inserts a call to the base class constructor prior to the execution of the program statements of the derived class constructor. So, for example, if nat_foobar’s constructor looked as follows:

     

          nat_foobar::nat_foobar() { cerr << "inside nat_foobar ctor: "; talk_to_me(); }

     

    the revised internal body of the constructor would generally look as follows:

     

    // Pseudo C++ code: internal augmentation of the constructor

    {

    // inserted call to the base class constructor

    this->nat_bar::nat_bar();

     

    // user-supplied program code within the constructor …

    cerr << "inside nat_foobar ctor: ";

    this->talk_to_me();

    }

     

    Now, let’s make things interesting. talk_to_me() is a virtual function. Here is what it looks like for our nat_foobar class:

     

                virtual void talk_to_me() { cerr << "i am talking as a nat_foobar ... \n"; }

     

    So, all I’ve done is thread up a class hierarchy in which each constructor announces its invocation, calls its virtual talk_to_me() method, and exits. The talk_to_me() method simply announces to which class it belongs. The three classes of the hierarchy each provide their own talk_to_me() instance. This allows us to trace the invocation order of constructors and to discover which virtual function is called within each constructor invocation.

     

    I would claim that when the nat_foo() base class constructor is invoked in the creation of a nat_foobar object, one of the following scenarios must occur:

     

    1. the talk_to_me() instance invoked within the nat_foo() constructor is that of the nat_foobar object being constructed.
    2. the talk_to_me() instance invoked within the nat_foo() constructor is that of the nat_bar object whose construction began but then became suspended in order to invoke the nat_foo() constructor.
    3. the talk_to_me() instance invoked within the nat_foo() constructor is that of the nat_foo() class itself.

    Which one is it? Well, if ontogeny recapitulates phylogeny [say that quickly 10 times in polite society], then (3) must be the behavior – that is, in the creation of a derived class object, the object becomes an instance of each of its base class ancestors in turn. The nat_foobar object becomes, in turn, a nat_foo object invoking the nat_foo instance of talk_to_me(), then a nat_bar object invoking the nat_bar instance of talk_to_me(), and finally it meets its destiny as a nat_foobar object invoking the nat_foobar instance. If we doubt this, we can hit the lights and run the demo:

     

    creating a native bar : foo {} instance ...

    inside nat_foo ctor: i am talking as a nat_foo ...

    inside nat_bar ctor: i am talking as a nat_bar ...

     

    creating a native foobar : bar : foo {} instance ...

    inside nat_foo ctor: i am talking as a nat_foo ...

    inside nat_bar ctor: i am talking as a nat_bar ...

    inside nat_foobar ctor: i am talking as a nat_foobar ...
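
    For readers who want to run the demo themselves, here is a minimal sketch of what the native hierarchy might look like. It is a reconstruction under assumptions, not the original demo source: trace_log and trace_construction() are hypothetical helpers that collect the output in a string rather than writing to cerr.

```cpp
#include <string>

// Hypothetical trace buffer; the demo in the text writes to cerr instead.
static std::string trace_log;

class nat_foo {
public:
    nat_foo() { trace_log += "inside nat_foo ctor: "; talk_to_me(); }
    virtual ~nat_foo() = default;
    virtual void talk_to_me() { trace_log += "i am talking as a nat_foo ...\n"; }
};

class nat_bar : public nat_foo {
public:
    nat_bar() { trace_log += "inside nat_bar ctor: "; talk_to_me(); }
    void talk_to_me() override { trace_log += "i am talking as a nat_bar ...\n"; }
};

class nat_foobar : public nat_bar {
public:
    nat_foobar() { trace_log += "inside nat_foobar ctor: "; talk_to_me(); }
    void talk_to_me() override { trace_log += "i am talking as a nat_foobar ...\n"; }
};

// Hypothetical helper: builds one object of type T and returns the trace
// that its chain of constructors left behind.
template <typename T>
std::string trace_construction() {
    trace_log.clear();
    T object;
    (void)object;  // only the side effects of construction matter here
    return trace_log;
}
```

    Calling trace_construction<nat_foobar>() from main and printing the result should reproduce the second trace above: within each constructor, the virtual call resolves to the class whose constructor is currently running.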

     

    This was not the original behavior of cfront E, the first release of C++ outside of Bell Laboratories, nor was it the behavior of cfront 1.0, the first product release of C++ by the language product group. So, Stroustrup changed the behavior of the language at some point, and the question is why?

     

    One possibility, of course, is that Bjarne read the 1874 paper by the German biologist Ernst Haeckel on ontogeny and found it as compelling to his vision of the C++ language as Charles Darwin found Malthus’ views on population to his theory of speciation. To be honest, I have never asked Bjarne whether this is the case.

     

    Alternatively, one could probably independently deduce the reasoning behind the change by considering the following:

     

    (a)    A polymorphic class object is constructed from the inside out, or rather top-down in the hierarchical class order. In our example, the nat_foo sub-object of the nat_foobar object is first constructed [that is, initialized with state and resources], then the nat_bar sub-object [which contains the nat_foo subobject], and then finally the state and resources of the most derived object instance is constructed [that is said a bit sloppily but that is the nature of a blog …]

    (b)   A virtual method type-specific to a class is likely to refer to the state and resources of that class rather than those of an ancestral base class; that is the nature of a type-dependent function. [This is clearly the axiom, if you will, that is most vulnerable to argument, since I have no quantitative data as to the state and resources most typically referred to within a class virtual method.]

    (c)    Within a base class constructor that represents a sub-object of a derived class object, the state and resources of the derived class object itself are unconstructed and the resultant behavior of any reference to them is therefore undefined and outside the control of the programmer.

    (d)   It is therefore highly probable that, were the base class constructor to invoke the derived class virtual method, that method would make reference to undefined state members and resources, and thus result in anomalous behavior that would be difficult to localize and correct.

     

    And so this is why the resolution of a virtual function within the constructor of a class invokes the instance of that function active for that class and not for the most derived class under construction. The implementation challenges to this behavior are actually quite interesting, but the details of that will be postponed to part II of this entry.

     

    The question I want to ask is, if we turn the native class hierarchy into a managed class hierarchy, does the same behavior hold? Does ontogeny recapitulate phylogeny under .NET? Well, if the answer were anything but no, this blog would not be quite so pertinent. Let’s dim the lights and run the reel:

     

                creating a managed bar : foo {} instance ...

    inside foo ctor: i am talking as a managed bar ...

    inside bar ctor: i am talking as a managed bar ...

     

    creating a managed foobar : bar : foo {} instance ...

    inside foo ctor: i am talking as a managed foobar ...

    inside bar ctor: i am talking as a managed foobar ...

    inside foobar ctor: i am talking as a managed foobar ...

     

    Oh, wow. Isn’t that interesting. Under .NET, the constructors are invoked in the same evaluation order, but the semantics of virtual function resolution reflects choice (1) of the three choices presented above: it is always the instance that is associated with the class of the object under construction.

     

    There are a number of things to notice about this: (a) the behavior is less undefined than under native because the creation of an object on the managed heap is accompanied by a zeroing out of the actual memory; however, the actual state values and resources are not as yet constructed, and (b) you now have the unintended consequence of multiple invocations of the same method during the construction of the various sub-objects, and these will vary depending on the type of the object being constructed.

     

    This difference of behavior, I would claim, is just about comprehended by the Oz/Kansas metaphor. This is one of the fundamental differences between the two object models and why designing one source base to be conditionally compiled as either native or managed is non-trivial. Of course, this example itself is trivial because the virtual function is directly invoked within the constructor and so is open to analysis during compilation. This gets less trivial when there is a call-chain within the constructor such that the presence of a virtual function is not apparent to the compiler. In that case, one must either generate code robust enough to handle the non-trivial case, or else just ignore it completely.   

  • Some Performance Notes on Virtual Function in ISO C++

    The elimination of the virtual mechanism in the invocation of a virtual function is in most cases trivial when measured against the elimination of any call invocation at all – that is, when the call is expanded inline. An inline expansion not only saves the overhead of the function call, but exposes a wider sequence of code to the aggressive peephole optimizer. This is because the virtual mechanism in C++ is not an operation of the executing environment but simply [in the simple case, anyway] the indirect execution of the function address stored in the associated virtual table. That is, one goes through the pointer to the virtual table within the pointer or reference to the polymorphic class object in order to execute the function addressed at a fixed slot within the virtual table [got that?].

     

    In C++, a programmer can suppress a virtual call in two ways: directly use an object of the class, in which case the polymorphism of the object is eliminated except in the trivial case in which the subtype hierarchy is the same size as the class of the object being directly manipulated. [The analogy under .NET, although it is not supported, would be toggling a reference type into a value type for some small program extent, eliminating the overhead of the managed heap and the virtual mechanism of the interface.] Obviously, this is a very special use of the polymorphic object, and is as likely to be an error on the programmer’s part as to be his intention. However, the ability to design first class value types – think of them as Abstract Data Types – and value type inheritance is something that I sorely miss under .NET, where complex value types are in my experience somewhat gimped. The second and more prevalent mechanism to suppress a virtual call is to invoke a class method through the fully qualified class scope operator. For example,

     

                void WidgetExtension::display() { Widget::display(); /* now our specialized display */ }

     

    This pattern of localization within a call chain of a type-dependent method relies on the ability of the user to limit the number of methods invoked to the initial virtual instance, which can occur anywhere within the inheritance chain. The subsequent chain of base class calls is then inline expanded. Without explicit language support, the habit of programmers concerned with performance [I don’t have any hard data, so this is anecdotal] is to duplicate the base class code within the derived instance to achieve the same result. This of course tightly couples the implementation of the method with that of the base hierarchy and a single change in the state members can cause the whole thing to derail. [The state of OO optimization is not currently far enough along to guarantee the elimination of these calls although that is, of course, feasible in theory.]
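
    As a small illustration of the qualified-call pattern, here is a sketch under assumptions: the Widget and WidgetExtension bodies are made up, display() returns a string rather than drawing anything so that the call chain is visible, and show() and show_base_only() are hypothetical helpers.

```cpp
#include <string>

class Widget {
public:
    virtual ~Widget() = default;
    virtual std::string display() const { return "Widget"; }
};

class WidgetExtension : public Widget {
public:
    std::string display() const override {
        // Qualified call: bound at compile time to Widget::display(),
        // so no virtual table lookup takes place here.
        return Widget::display() + "+WidgetExtension";
    }
};

// Ordinary virtual call: resolved through the virtual table at run time.
std::string show(const Widget& w) { return w.display(); }

// Qualified call through a reference: the virtual mechanism is suppressed
// and Widget's instance runs whatever the dynamic type of w happens to be.
std::string show_base_only(const Widget& w) { return w.Widget::display(); }
```

    Given a WidgetExtension object, show() runs the derived instance, which in turn runs the base instance through the qualified call; show_base_only() runs only the base instance. Because the target of a qualified call is known statically, the compiler is free to expand it inline.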

     

    In order to maximize the effect of suppressing a virtual function, it is still necessary that the function be declared as inline. [Although the trend seems to be towards aggressive inlining of any method, it is not clear to me that that is yet guaranteed.] Declaring a virtual function as inline strikes some people, and some of these people speak and write publicly, as a gross mistake. They argue as follows: because the address of a virtual function [at least in current instances of all C++ implementations I know] is always placed within the virtual table of the associated class, declaring the function as inline forces the compiler to generate an out-of-line instance for which an address can be taken. In order to avoid this, therefore, a virtual function, by its very nature, should never be declared as inline.

     

    Sounds reasonable, but. It is only reasonable when the alternative to a non-inline declaration is an explosion of generated out-of-line function definitions. We know there has to be one in either situation – it is either explicit in the case of a non-inline virtual function, or implicit when declared inline. The question, then, is: how many more need to be generated? If the answer is none, then the objection to specifying the inline attribute is moot. As it happens, in the general case [making elbow room for the obligatory pathological case], it is possible for the compiler to limit the generation of both the virtual table and the out-of-line definition of the inline virtual functions to one definition in the single shared file in which the virtual table is generated. [At least this is what we did within cfront back in the late 1980s and so I presume it is a generally solved problem. I believe it was Andy Koenig who first proposed the solution; although it could have easily been Bjarne himself.]

     

     Object-Oriented programming is paradoxical in a way that raises passions beyond all reason. On the one hand, it promotes elegance and simplicity [the two are not synonyms although some people do not make that distinction] that is unrivaled. For example, a generic compositing engine that was in use at DreamWorks [the code is proprietary and so I have sufficiently dumbed it down out of all recognition] supported more than a dozen image formats with a generic routine such as the following (where move represents just one of many operations):

     

    void GenericEngine::move()

    {

         // Pseudo Loop code ...

         for clippedHeight and clippedWidth do

             source_engine->getNextPixel(&pixel);

             target_engine->setNextPixel(&pixel);

     

             source_engine->stepToNextRow();

             target_engine->stepToNextRow();

    }

     

    If you have ever looked at compositing code, this is actually sublime. The problem is that its performance was a dog. It took far too long to actually do the compositing. Fetching each pixel through a virtual function turns out to be non-viable. Consider: each frame of a film is composited at 2K resolution of an RGBA image format at 8 or 16 bits per channel. [I may have completely zoned out on this detail.] A frame on average is composed of, say, 8 levels. [I just picked that out of a hat – Mickey’s Magician’s hat.] A film runs at 24 frames per second, and cartoons run about 88 minutes.

    This is the kind of design that talks well but can’t survive in a production environment. It is just not usable. In the heat of production [we’re talking over $100 million], the solution was to hack in an isA type member, switch on that member, and invoke the RGBA methods explicitly. Wow. That will not get you a conference paper, but it will keep your job at the studio.  [Listen, all the philosophy in the world about what is good programming means nothing if the project is outsourced to a third world country.]

    When I got my chance at being the technical lead at something called ToonShooter, our engineered compositing engine employed a template Strategy that allowed us to put back the elegance but eliminate the polymorphism:

    template<typename Engine>

    void Compositor<Engine>::move()

    {

         for clippedHeight and clippedWidth do

                source_engine->getNextPixel(&pixel);

                target_engine->setNextPixel(&pixel);

           source_engine->stepToNextRow();

           target_engine->stepToNextRow();

    }
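
    To make the template version concrete, here is a minimal sketch under assumptions. It is not the ToonShooter code: Pixel, RgbaEngine, its at() accessor, and the Compositor constructor are all hypothetical, and the pseudo loop is spelled out as two nested for loops.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct Pixel {
    std::uint8_t r, g, b, a;
};

// Hypothetical engine for one image format. Note: no virtual functions.
class RgbaEngine {
public:
    RgbaEngine(std::size_t width, std::size_t height)
        : width_(width), row_(0), col_(0), pixels_(width * height) {}

    void getNextPixel(Pixel* out) { *out = pixels_[row_ * width_ + col_++]; }
    void setNextPixel(const Pixel* in) { pixels_[row_ * width_ + col_++] = *in; }
    void stepToNextRow() { ++row_; col_ = 0; }

    // Random access, used here only to fill and inspect the images.
    Pixel& at(std::size_t x, std::size_t y) { return pixels_[y * width_ + x]; }

private:
    std::size_t width_, row_, col_;
    std::vector<Pixel> pixels_;  // value-initialized, so every channel starts at 0
};

template <typename Engine>
class Compositor {
public:
    Compositor(Engine* source, Engine* target,
               std::size_t clippedWidth, std::size_t clippedHeight)
        : source_engine(source), target_engine(target),
          clippedWidth_(clippedWidth), clippedHeight_(clippedHeight) {}

    void move();

private:
    Engine* source_engine;
    Engine* target_engine;
    std::size_t clippedWidth_, clippedHeight_;
};

template <typename Engine>
void Compositor<Engine>::move()
{
    Pixel pixel;
    for (std::size_t y = 0; y < clippedHeight_; ++y) {
        for (std::size_t x = 0; x < clippedWidth_; ++x) {
            // Engine is known at compile time, so these calls bind statically
            // and are candidates for inline expansion.
            source_engine->getNextPixel(&pixel);
            target_engine->setNextPixel(&pixel);
        }
        source_engine->stepToNextRow();
        target_engine->stepToNextRow();
    }
}
```

    The body of move() reads the same as the polymorphic version. The difference is that supporting another image format means instantiating Compositor with another engine type rather than deriving from a common base class.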

     

    Interestingly enough, the hardest part of implementing the solution was getting programmers to think about it. Inheritance has become an almost universal solution. Programmers seem to shy away from using templates: they feel they are too complicated and difficult to manage. There is a sense that templates result in code bloat, while OO programming not only produces a smaller code footprint but also is fundamentally a better programming style. As it happens, OO code in general tends to generate a greater amount of run-time code. This becomes especially crippling in applications in which lots of objects are created, destroyed, and copied.


     

     

     

     

     

  • A Question of Const

    A reader, J. Daniel Smith, inquires:

     

     

    You wrote “The absence of const support in the base class library… “;

     

    Can you elaborate any on what the plans are for “const” as they relate to C++/CLI and C#?

     

    I know “const” can open up a whole can of worms about all the various semantics of “const”.  And some uses of “const” such as “const std::string&” to avoid making a copy aren't needed with managed code.

     

    There are a number of issues, of course, with the const modifier, as you point out. It is used in at least three ways: (a) to indicate that an object is read-only, (b) to indicate that a parameter is being passed by reference only for the sake of efficiency and not because it will be modified as a side-effect of the method call – this is your const example in your third sentence, and (c) to indicate that a class method does not, as a side-effect of its invocation, modify any of the non-static class members.
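
    For those who prefer to see the three uses side by side, here is a small native C++ sketch. The names (max_name_length, Label, fits) are made up for illustration.

```cpp
#include <cstddef>
#include <string>

// (a) a read-only object
const std::size_t max_name_length = 8;

class Label {
public:
    // (b) passed by reference for efficiency only; text is not modified
    explicit Label(const std::string& text) : text_(text) {}

    // (c) a const member function: it may not modify non-static members
    std::size_t length() const { return text_.size(); }

    // a non-const member function: it is allowed to modify the object
    void truncate() {
        if (text_.size() > max_name_length) text_.resize(max_name_length);
    }

private:
    std::string text_;
};

// (b) again: a const reference parameter. Only const member functions such
// as length() can be called on label; a call to truncate() would not compile.
bool fits(const Label& label) { return label.length() <= max_name_length; }
```

    The compiler enforces (b) and (c) together: inside fits(), the call chain stays const all the way down. That is the transitive closure at issue once a parameter is handed on to a method that declares no const at all.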

     

    There is no entry for constant in the index of Jim Miller’s Common Language Infrastructure Annotated Standard, and there are no const modifiers on the Base Class Library API. This presents a problem for C++ if it wishes to present support for (b) and/or (c) within managed code. The choice made, which I argued against with the effectiveness of Don Quixote doing battle with his windmills, is to provide full support for (b), so that one can specify a const modifier to any parameter of a method.

     

    Designs in many cases are a series of accidents, and reflect non-conscious biases of their participants. I believe this design choice, for example, reflects the cyclopean vision of native C++ rather than a balanced Janus vision looking out at both the native and managed paradigms. Here are the problems that I see from the point of view of a .NET programmer.  

     

    (1)   it is not enforceable on any parameter subsequently passed to a Base Class Library method. Therefore it is not possible to ensure the transitive closure on the constness of the call chain for each parameter. This is both the benefit and the `can of worms’ in using const parameters – the compiler can detect a break in the transitive closure of the call chain about which we ourselves are for the most part clueless. [I say this is a `can of worms’ because it is really really difficult to backpatch const parameters once the application is over some threshold of complexity. Either you build it into your application from the start or else it is a grim imposition and, like dealing with template instantiation, is an example of where the language becomes an impediment rather than aid.] As with most things, there are two levels of discourse about const – the what you should do in principle level and what you do in practice. My gut feeling is that a majority of C++ applications do not use const well, but I have no real data and base this on informal conversations with students when I used to teach the language.

     

    (2)   in order to pass a const parameter to a Base Class Library method, one must cast away the constness of that parameter at each point of call, or create a local non-const entity. If you don’t, the compiler [I’m told] will flag the call as illegal since the Base Class Library method does not of course provide a const parameter. This is problematic for a number of reasons:

     

    (a)   const_cast is not permitted on all reference types, I believe, although I could be wrong, and a safe_cast of a reference type results in a run-time check that is expensive – static_cast is permitted but I am not sure whether it allows the casting away of const. The problem here is that this is very complicated and is not the level a serious programmer wishes to bend his or her attention to. It is error-prone and grim, imo.

     

    (b)   Casts are morally indefensible to many coding standards and project norms. They are often characterized as second-class citizens resorted to by programmers of dubious character, and can easily result in the greatest of managed sins: unverifiable code. And yet in order to make use of the Base Class Library, all const parameters must [the programmer has no choice] be cast away.

     

    (c)   The only real solution to this is to not use the const mechanism for parameters, which solves (a) and (b) and (1), but puts in question the value of our support for the feature to begin with.

     

    While these criticisms of the C++/CLI design choice regarding const are plausible, they may prove as insignificant as their reception. Their warrant rests on two axioms: (1) use of the Base Class Library is fundamental to .NET programming, and (2) use of C++/CLI will shift from interoperability between the two paradigms, which is apparently the primary use of the current language design, to the dynamic paradigm itself. The question then is, if and when that happens, will the use of C++/CLI be a first class experience? We don’t know the answer to that because as yet we have no practicable experience with the language and, I would argue, no real empathy as yet with the paradigm.

     

    Historically, the design and implementation of C++ in its infancy addressed many of its concerns to the needs of the C community. The placement of the virtual table pointer (vptr) is a trivial but interesting example of this. In the original implementation, it was placed at the end of the class object in order to accommodate the pattern of inheriting from a struct. At the point in time when Microsoft entered into the C++ arena, C-coddling was no longer the focus, and so we placed the vptr at the beginning of the class object, which is more efficient for pure OO designs. At the language level, the language originally supported old-style function prototypes, took out nested types, and permitted implicit calls of methods signifying a return value of int and a variable argument list. One by one, these original compromises were taken from their home and never seen again. I believe it is fair to say that the design focus of C++/CLI is also in its infancy.

     

     

  • A Question about Multiple Inheritance Efficiency

    A reader, Johan Ericsson, comments

     

    re: Further Discussion of MI and general C++/CLI design issues

     

    I appreciate your discussion of the multiple inheritance issue.

     

    Is the performance problem of MI only inherent in virtual base classes?

     

    I've used MI in a lot of my code, but I've never found the need for virtual base classes. I often use MI as a refinement of an interface.

     

    Consider an interface class that will be implemented in a number of concrete classes. Most of those classes have the same implementation of many of the member functions of the interface. It has been convenient to provide a concrete implementation of the interface that can be used when the default behavior is desired. It's just a shame that I can't use the same techniques when writing .NET code. Instead, all members of an interface have to be explicitly implemented for each class that derives from the interface.

     

    Oh well, I know that I'm complaining about something that is not the biggest deal.

     

    I guess that with a common base class "Object", it is not possible to avoid the need for virtual inheritance in MI. Otherwise, I would just be really happy with non-virtual MI capability in the CLR.

     

    Just to recapitulate. The overhead of virtual inheritance lies in the fact that the shared virtual base class sub-object is ephemeral in its position within each subsequently derived class, but only if it contains state members. Access to those members cannot be statically fixed [that is, a constant offset added to the beginning address of the object] but must be calculated at run-time. This fact led to a funny reversal of judgment by Stroustrup during his implementation of multiple inheritance.

     

    The definition of a class object at global scope requires that its associated constructor be invoked prior to the first statement of main. This is actually a difficult if uninteresting problem [that is, it is an engineering problem rather than a problem within computer science] – or at least it was back in the early 1980s, particularly when one had to be portable across all flavors of Unix, and, potentially, run on both the Macintosh and DOS [Windows was just a twinkle in Alan Kay’s eyes at the time]. The implementation solution Stroustrup invented within the language was to generate an __sti function and an __std function [static initialization and static destruction] within each separately compiled program text file [think .C or .cpp or .cc] that contained global class objects.

     

    [As a challenge to the reader, can you come up with a strategy for naming these functions such that one is guaranteed to generate unique names? Using the file name itself does not work, because in a complex project, files may have the same name but be positioned within independent directories. However, in the C++ language, there is one aspect of the language that is guaranteed to be unique within each program text file. That’s enough of a hint. [People don’t give Bjarne credit enough for just how smart he is because his genius is not abstract the way, say, Bertrand Meyer’s is; that is a prejudice that is pernicious and widespread in our culture and goes back to the Greeks.]]

     

    In any case, the second problem, once the unique sti and std functions are littered across the .o binaries, is to (a) recognize their presence, (b) thread them together somehow, and (c) have the set of them executed in order prior to the beginning of main. We came up with three solutions. The first was an ascii solution for when we had to be portable across all Unix systems – we used nm, which dumps out the symbol table, and grep to find any symbols that began with __sti and __std [I told you this was not an active area of computer science – whenever I was bored, I would add an identifier which began as __sti_ha! and watch the system die an ugly death … ] and generated an array of pointers to these functions within a named pair of tables which were then iterated across within a _main() function we inserted as the first call within main. A colleague of mine back at Bell Labs named Robert Murray invented this `portable’ solution and called it munch, if memory serves. For the Vax and 3B20 processors [enough said], Rob provided an alternative solution which used the ld library to actually inspect the a.out’s symbol table in memory and thread the set of calls. Over time, folks out in the emergent C++ community sent us versions for other processors, such as Sun, Cray, etc. Finally, the underlying object format was extended to support initialization sections, and that is where the sti and std functions were placed.

     

    So, with that background, we can get back to Bjarne’s funny reversal, and with that back to virtual inheritance which I wanted to briefly review before I answer Johan. What should be clear is that the sti/std functions provided a general solution for what is called static initialization – that is, initialization before the beginning of main. A number of folks had agitated Bjarne for an extension to C++ that would lift the C language restriction of global initialization to constant expressions. That is, to permit things such as

     

                // illegal in C and in C++ prior to Release 2.0

                int *pi = new int( 1024 );

                int j = *pi;

     

    Up until a certain point, the response had always been, no, which caused grumbling in certain circles. Then, one day, the grumbling stopped. The rule was relaxed. I believe, although I never explicitly asked Bjarne, that this is because the assignment of a derived class pointer or reference to a virtual base class pointer requires a run-time evaluation, and so it became more problematic to resist the more general case.

     

    The problem with the whole static initialization issue is that there is no support within the language to deterministically specify the order of initialization across files. This can cause difficult-to-trace segmentation faults since they occur prior to the beginning of main. The main programming solution is something called a Schwarz counter, named for Jerry Schwarz, who invented it to guarantee the initialization of cout, cin, cerr, and clog in his Release 2.0 implementation of iostreams. Because of metadata and the CLR abstraction layer, this is not a problem with .NET programming.

     

    OK. That’s preliminary to answering Johan’s question which asks about the overheads intrinsic to multiple inheritance. The overheads all revolve around the second and subsequent base classes, and this is because of (a) the necessary `this’ pointer adjustment to address the beginning of that base class subobject, and (b) the need for an additional virtual function table and virtual pointer within the derived class object [the arithmetic back then was that there are n-1 additional virtual tables, and the associated pointers to those tables, where n is the number of base classes]. In single inheritance, there are no additional tables.
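
    The this pointer adjustment in (a) can be observed directly. Here is a minimal sketch, assuming a mainstream implementation that lays out base class subobjects in declaration order; Base1, Base2, Derived, base2_offset() and call_through_base2() are all hypothetical names.

```cpp
#include <cstddef>

struct Base1 {
    virtual ~Base1() = default;
    virtual int id1() const { return 1; }
    int state1 = 10;
};

struct Base2 {
    virtual ~Base2() = default;
    virtual int id2() const { return 2; }
    int state2 = 20;
};

struct Derived : Base1, Base2 {
    int id1() const override { return 11; }
    int id2() const override { return 22; }
};

// Byte distance from the start of the complete object to its Base2
// subobject. Converting Derived* to Base2* adds this offset.
std::ptrdiff_t base2_offset(const Derived& d) {
    const char* whole = reinterpret_cast<const char*>(&d);
    const char* second =
        reinterpret_cast<const char*>(static_cast<const Base2*>(&d));
    return second - whole;
}

// A virtual call through the second base: the callee is Derived::id2(),
// which expects `this` to address the whole Derived object, so the pointer
// must be adjusted back on the way into the call.
int call_through_base2(const Base2& b) { return b.id2(); }
```

    On the compilers I would expect most readers to use, base2_offset() is non-zero, since the Base1 subobject with its own virtual pointer and state comes first. The exact value depends on the implementation's object layout, so treat it as an observation rather than a guarantee of the language.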

     

    The this pointer adjustment is a pernicious overhead with regard to virtual function calls – the two general solutions are either a thunk  [I believe this is the Visual C++ solution; it was at one point] or multiple entry points within a function in which the this-pointer adjustment takes place in one particular entry point [the quite fine IBM compiler on which Josee Lajoie worked chose this solution]. When people ask why Visual C++ did not provide the covariant return feature that was the first change voted on by the ANSI C++ committee, it is because of the difficulty of adjusting the this pointer under multiple inheritance on the returned pointer or reference to the class object. [It was finally implemented in Visual Studio 7.0, I believe. The named return value optimization was finally implemented in the recent Visual Studio 2003, along with impressive conformance work, particularly on templates, that makes Visual C++ now one of the best C++ compilers.] The multiple virtual tables significantly complicate the implementation of pointers to member functions -- some people might say they cripple them.

     

    If you are interested in actual numbers and a detailed analysis of the implementation issues surrounding the C++ Object Model, both are available in my Inside the C++ Object Model, which was published in 1994 and represents all I knew [for better or worse] about C++ implementations at that time. I began it because Bjarne graciously invited me to join his Foundation/Grail project to implement the Object Model phase of their compilation system, and I discovered that, apart from his Annotated Reference Manual, this was an undocumented area of the language, and writing the book provided a way for me to systematize what I was learning. [I know the SGI C++ compiler team found it helpful.]

     

    At the time, all of us were making up implementations as we went along, and there has been an evolution in the virtual technologies, so to speak. For example, Visual C++ uses a virtual base class table analogous to the virtual table [and had even patented that]. Bjarne inserted virtual base class pointers within the class object. This turned out to be a mistake since, as the hierarchy deepens, access becomes more costly, which is an effect of the implementation and not of the language feature itself. Some compilers that copied our implementation provide(d) a switch which allowed the user to choose between space and time in this regard – what they would do, when time was selected, was promote the virtual base class pointers to the derived class – but of course they could not elide the other instances and so you had a space hit of duplicated pointers. When I implemented the C++ object model in Bjarne’s Grail/Foundation compiler, he pointed out a superior implementation invented by Michael Tiemann [g++, Cygnus] which places the location of the virtual base class offsets within the virtual table. Michael Ball [Oregon Software/Sun] refined that to have the two grow in two directions, one holding the virtual base class offsets, the other holding the pointers to the virtual functions. [I’m sure the technology has evolved some since then; that was one of the beneficial side-effects of the standardization – having all the compiler people in one room – but in 1994 when Peter Weinberger on becoming head of Area 11 of Bell Labs canceled the Foundation project, I chose Disney Animation over IBM Yorktown and moved away from the C++ compiler community.]

     

    There have been a number of attempts to standardize the binary implementation of the object model, from the triviality of internal name mangling to the virtual mechanisms, but none has ever succeeded, and that has been imo a serious miscalculation by the community since it prevents interoperability. One of the innovations of .NET is a common infrastructure across languages, particularly when they conform to the CLS.

     

    Remarkably, this last comment brings me to a second question, this one from Daniel O’Connell:

     

     

    re: A Question about Copy Constructors in C++/CLI

     

    Hrmm, I can see the use and thanks for the explination of purpose, I am still concerned about the usage across other langauges, however like many language specific things, one will have to be careful when the intent is to provide a CLS compliant library. I brought the issue up primarily because of a recent newsgroup discussion[1]. Throughout this discussion I argued against copy constructors in the framework as a matter of technical issue, in a fairly similar(tho less efficent) manner as you have here.

     

    I will hunt other blogs and such sources to see if I can find any other information, hopefully somewhere I can find out if the feature will require ClsCompliant(falst) or any other specifics.

     

    The public specification, unfortunately, does not have any details with regard to the behavior of the copy constructor [or at least none that I could find.] I have read certain internal documents on the implementation for review, but they are neither fresh in my mind nor internalized, and so I would only badly damage their meaning if I attempted to parrot them. The key blogs to attend to with regard to the new language design are those of Herb Sutter, who leads the design team and has worked magic, and Brandon Bray. Mark Hall and Jonathan Caves are the key compiler folks connected with the design and are ultimately responsible for the implementation. And Jeff Peil also contributed significantly to the design, but I don’t believe the latter three are publicly talking, although they all have much worth saying.

     

  • A Question about Copy Constructors in C++/CLI

    Daniel O'Connell writes,

               

    re: Followup: The Absence of MI

     

    Hmm, Copy constructors...One would hope that the implementation and documentation is done *very* carefully(IE, marked as non-CLS compliant or implemented to work with ICloneable). I don't personally feel the pattern works well with non-C++ code(how many VB programmers are going to be instantly aware of it?), nor do I believe it works at all with interface based programming. Is there any other information on the implemntation or the advisories that will go along with copy constructors?

     

                re: Followup: The Absence of MI

     

    My last comment wasn't well written, I mean how will copy constructors deal with classes written in C# inherited in C++ then inherited in VB? how will it work as opposed to basic aggregation(Stream(Stream stream), for example)?

     

    His question deals with what I refer to as one of the intrinsics. The intrinsics are fundamental incompatibilities between the C++ and CLR object models. These are not random, but reflect a philosophical difference between the two models. Multiple Inheritance versus Single Inheritance + Interfaces is one such intrinsic. As that thread illustrated, the absence of direct support within the CLR object model for what within C++ is considered a natural mode of expression results in bereavement – that is, a sense of loss and inability to work efficiently.

     

    In the C++ object model, an object is always a chunk of memory that is the size of its state members + alignment constraints of the host processor + the size of any internally generated implementation pods [such as a virtual table pointer in all C++ compilers that I am aware of, or the virtual base class table pointer that is specific to Visual C++]. And this object is not polymorphic: if you assign a derived class to a base class object, the derived portion is sliced off and discarded. The base class sub-objects are either bitwise or member-wise copied depending on the member types. To exhibit polymorphic behavior, one uses either pointers or references of the object’s type.
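The slicing behavior described above can be seen in a few lines. This is a minimal sketch; `Shape` and `Circle` are hypothetical types introduced only for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical two-level hierarchy, used only to illustrate slicing.
struct Shape {
    virtual ~Shape() = default;
    virtual std::string name() const { return "Shape"; }
    int id = 0;
};

struct Circle : Shape {
    std::string name() const override { return "Circle"; }
    double radius = 1.0;
};

// Assigning a derived object to a base object copies only the Shape
// sub-object; the Circle portion is sliced off and discarded.
std::string name_after_slicing(const Circle& c) {
    Shape s;
    s = c;             // memberwise copy of the Shape members only
    return s.name();   // s really is a Shape, so this yields "Shape"
}

// Through a reference (or pointer) the dynamic type is preserved.
std::string name_through_reference(const Circle& c) {
    const Shape& r = c;
    assert(&r == static_cast<const Shape*>(&c));  // r aliases c's base part
    return r.name();   // virtual dispatch reaches Circle::name
}
```

Calling `name_after_slicing` on a `Circle` reports the base type, while `name_through_reference` reports the derived type.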

     

    A reference, recall, is an ephemeral object. It binds to an object during initialization, then phase-shifts into an alias for that object. Assigning one reference to another is a deep copy – that is, either bitwise or member-wise copied. [Remember that initialization and assignment are distinct activities although they share a common operator, which can lead to confusion among programmers.] So it exhibits shallow copy during initialization, but deep-copy during assignment.
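A short sketch of the distinction between initializing a reference and assigning through one. `Point` is a hypothetical type chosen for the example.

```cpp
#include <cassert>

struct Point { int x; int y; };  // hypothetical aggregate for illustration

// Returns true when the reference behaves as described in the text.
bool reference_demo() {
    Point a{1, 2};
    Point b{3, 4};

    Point& ref = a;      // initialization: ref binds to a, nothing is copied
    assert(&ref == &a);  // ref is now an alias for a

    ref = b;             // assignment: b's members are copied into a
    bool still_aliases_a = (&ref == &a);          // ref was not rebound
    bool a_overwritten = (a.x == 3 && a.y == 4);  // a received b's values

    b.x = 99;            // a holds its own copy, so it is unaffected
    bool independent = (a.x == 3);

    return still_aliases_a && a_overwritten && independent;
}
```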

     

    A pointer is a machine abstraction inherited from the C-language. On modern architectures, all pointers are (I believe) simply an address. A pointer to void, a pointer to int, a pointer to Widget, and a pointer to Fred all look the same in terms of the size and value domain. The type associated with the pointer object is an instruction to the compiler regarding the size and interpretation of the bits beginning at that address. A cast, in general, simply changes the interpretation of the size and set of bits. [I say generally because sometimes it changes the address as well.] We toggle between acting on the pointer and acting on the object addressed by that pointer syntactically by using the dereference operator – and this is a common source of confusion to those unused to the idiom. A pointer exhibits shallow-copy – that is, only the address is copied either in the initialization or assignment of one pointer with another. Under assignment, we can dereference the pointers so that the object addressed by one is deep-copied into the object addressed by the other, leaving the values of the pointers themselves unchanged.
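The two kinds of copy available through a pointer can be sketched as follows. `Widget` here is a hypothetical one-member struct, not a type from any library.

```cpp
#include <cassert>

struct Widget { int value; };  // hypothetical type for illustration

// Returns true when pointer copy is shallow and copy through
// dereferenced pointers is deep.
bool pointer_copy_demo() {
    Widget w1{1};
    Widget w2{2};
    Widget* p = &w1;
    Widget* q = &w2;

    Widget* alias = p;    // shallow: only the address is copied
    alias->value = 10;    // so the change is visible through p as well
    bool shared = (p->value == 10);

    *p = *q;              // deep: the object q addresses is copied into w1
    assert(p == &w1 && q == &w2);   // the pointers themselves are unchanged
    bool copied = (w1.value == 2);

    q->value = 50;        // w1 has its own copy and does not follow w2
    return shared && copied && w1.value == 2;
}
```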

     

    In the CLR, a managed reference type is a handle; that is, it is intrinsically polymorphic. The actual object is always on the managed heap as an unnamed entity allocated dynamically. Its removal from the heap is managed automatically through garbage collection and other than in exceptional instances is of no concern to the programmer. [This is the infamous non-deterministic finalization.] This is a staggering simplification if you are a C++ programmer and are used to sweating the details of managing heap memory. [I suspect that programmers learning on a garbage collected system could not comfortably adapt to C++. It is like going from an automatic to a shift in driving a car. The migration path is really one way: going from the more complex to the simpler.] It exhibits shallow copy, and effecting a deep copy requires by convention the implementation of the Clone() member of the ICloneable interface which must be explicitly invoked by the user.
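As a rough analogy in ISO C++ (not C++/CLI), `std::shared_ptr` can stand in for a handle: copying the handle shares the object, and a deep copy must be requested explicitly. This is only a sketch under that assumption. `shared_ptr` is reference counted rather than garbage collected, and `Document` and its `clone()` member are hypothetical names playing the role the text assigns to `ICloneable::Clone()`.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical heap-allocated type reached only through handles.
class Document {
public:
    explicit Document(std::vector<int> pages) : pages_(std::move(pages)) {}

    // Explicit deep copy, analogous in spirit to Clone().
    std::shared_ptr<Document> clone() const {
        return std::make_shared<Document>(*this);
    }

    std::vector<int>& pages() { return pages_; }

private:
    std::vector<int> pages_;
};

// Returns true when handle copy is shallow and clone() is deep.
bool handle_copy_demo() {
    auto original = std::make_shared<Document>(std::vector<int>{1, 2, 3});
    auto alias = original;            // shallow: both handles name one object
    auto copy = original->clone();    // deep: a second, independent object
    assert(alias.get() == original.get());

    alias->pages()[0] = 42;           // visible through original, not copy
    return original->pages()[0] == 42 && copy->pages()[0] == 1;
}
```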

     

    So, the default copy behavior of the two object models is one of the intrinsics. It is a fundamental difference in the physics between the native Kansas and the managed Oz. And its absence in the original language design was felt by many as a bereavement.

     

    The copy constructor mechanism in C++ is a way for the programmer to gain control of the copying of one object to another when the default behavior is not appropriate. Let’s consider three questions:

     

    1. What is the default behavior and why was that chosen?
    2. Why is it necessary to sometimes manually override the default behavior?
    3. Why were the copy constructor and copy assignment operator chosen as the mechanism?

     

    In C, or in later versions of C, the meaning of copying one aggregate object to another was defined as bitwise copy. In the original implementation of C++, this was the default behavior as well – one has to be able to have a C program behave and execute with equivalence. One reason it is necessary sometimes to override the default behavior is because pointers exhibit shallow copy, which can cause a number of problems that need to be explicitly programmed against [for example, unconditionally applying the delete operator to the pointer member within the destructor]. The copy constructor and copy assignment operator were the chosen mechanisms for in a sense trapping and handling the inappropriate default behavior because their invocation could be injected transparently by the compiler. Once this mechanism was introduced, bitwise copy was no longer appropriate when member class objects were complex types with their own copy operators that needed to be invoked. Thus, in the mid-1980s, the default mechanism shifted from bitwise- to memberwise copy. However, as a quality of implementation issue, an aggregate type exhibiting bitwise copy semantics is still bitwise copied.
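A minimal sketch of the situation described above: a class with a pointer member whose destructor unconditionally deletes it, and which therefore needs a copy constructor and a copy assignment operator to replace the default shallow copy. `Buffer` is a hypothetical class written for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical owner of a heap array. With the default memberwise copy,
// two Buffers would share data_ and the destructor would delete it twice.
class Buffer {
public:
    explicit Buffer(std::size_t n) : size_(n), data_(new int[n]()) {}

    // Copy constructor: allocate fresh storage, then copy the elements.
    Buffer(const Buffer& other)
        : size_(other.size_), data_(new int[other.size_]) {
        std::copy(other.data_, other.data_ + size_, data_);
    }

    // Copy assignment: build the new array first, then release the old one.
    Buffer& operator=(const Buffer& other) {
        if (this != &other) {
            int* fresh = new int[other.size_];
            std::copy(other.data_, other.data_ + other.size_, fresh);
            delete[] data_;
            data_ = fresh;
            size_ = other.size_;
        }
        return *this;
    }

    ~Buffer() { delete[] data_; }   // unconditional delete of the member

    int& operator[](std::size_t i) {
        assert(i < size_);
        return data_[i];
    }
    std::size_t size() const { return size_; }

private:
    std::size_t size_;
    int* data_;
};

// Returns true when copies do not alias the original's storage.
bool copies_are_independent() {
    Buffer a(3);
    a[0] = 7;
    Buffer b = a;   // initialization: copy constructor
    Buffer c(1);
    c = a;          // assignment: copy assignment operator
    a[0] = 99;      // must not be visible through b or c
    return b[0] == 7 && c[0] == 7 && c.size() == 3;
}
```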

     

    In many circumstances, copy constructors are a royal pain. The classic example is when we return an aggregate object from within a function, as in

     

                Matrix operator+( const Matrix&, const Matrix& )

                {

                            Matrix result;

                            // do the math

                            return result;

                }

     

    The last thing one actually wants here is a copy construction of the local result into the return value, but there is no way to express that directly in the language. There is a language extension in the g++ variant of C++ but that was actively opposed for adoption into the language. A huge brouhaha erupted within the ISO/ANSI C++ committee at one point regarding the legality of optimizing away the copy constructor when safe to do so, as in

     

                Matrix mat = a + b;

     

    but not in an assignment such as

     

                Matrix mat;

                mat  = a + b;

     

    which may or may not prove considerably expensive, but is generally always less efficient [I have to qualify it in order not to be called out for some corner case that may exist]. The problem with this optimization is that it becomes impossible to examine a piece of code and know its behavior – did the compiler optimize away the copy constructor? What happens if the destructor presumes that the copy constructor was invoked for some sort of increment/decrement handshake? Etc. etc. It has proved so confusing that Scott Meyers in More Effective C++ incorrectly described it [and as far as I know has never corrected it].
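One way to find out what a particular compiler did is to count the calls. The sketch below uses a hypothetical instrumented `Matrix` (the counters and the `cell` member are assumptions made for the example). The number of copy constructions in the initialization case depends on the compiler and the language version, so the code reports the counts rather than promising a specific value.

```cpp
#include <cassert>

// Hypothetical instrumented Matrix: counts copy operations.
struct Matrix {
    static int copy_constructions;
    static int copy_assignments;
    double cell = 0.0;

    Matrix() = default;
    Matrix(const Matrix& rhs) : cell(rhs.cell) { ++copy_constructions; }
    Matrix& operator=(const Matrix& rhs) {
        cell = rhs.cell;
        ++copy_assignments;
        return *this;
    }
};
int Matrix::copy_constructions = 0;
int Matrix::copy_assignments = 0;

Matrix operator+(const Matrix& a, const Matrix& b) {
    Matrix result;
    result.cell = a.cell + b.cell;
    return result;   // the compiler may build result directly in the caller
}

struct Counts { int constructions; int assignments; };

// Matrix mat = a + b;  initialization, where the copy may be elided.
Counts count_initialization() {
    Matrix a, b;
    Matrix::copy_constructions = Matrix::copy_assignments = 0;
    Matrix mat = a + b;
    (void)mat;
    return {Matrix::copy_constructions, Matrix::copy_assignments};
}

// mat = a + b;  assignment, where operator= always runs exactly once.
Counts count_assignment() {
    Matrix a, b;
    Matrix mat;
    Matrix::copy_constructions = Matrix::copy_assignments = 0;
    mat = a + b;
    assert(Matrix::copy_assignments == 1);
    return {Matrix::copy_constructions, Matrix::copy_assignments};
}
```

With the optimization applied, `count_initialization()` reports zero copy constructions; without it, one or two. In the assignment form the copy assignment operator is always invoked, whatever happens to the copy construction inside `operator+`.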

     

    The CLR reflects a different programming philosophy and tradition – think of Smalltalk as a point of origin, although that is not deeply thought out. However, in the presence of garbage collection and intrinsic polymorphism of managed reference types [with default shallow copy], the entire copy constructor/copy operator mechanism is not really necessary – at least in the opinion of the inventors of the CLR, and I tend to agree. However, it is felt as a bereavement by native C++ programmers, and it is a pattern of usage should one wish to chase the grail of transparent code that compiles to either native or managed. It falls between the intrinsics of multiple inheritance and that of deterministic finalization in my opinion as to its importance in being simulated.

     

    Providing copy constructor support was not something I would have added to the C++/CLI language, and was not part of my prototype design [however, the urgency of my design was a proof of concept that the original language design was broken and that something better must replace it] and I had no part in its design or the thinking that led to its inclusion. I referred to it only because it is one of the intrinsics, and I was attempting to put multiple inheritance in a priority-based context.        

  • Further Discussion of MI and general C++/CLI design issues

    That same reader from the previous two postings responds as follows, with some editing. I am responding only because it gives me an excuse to speak of technical implementation issues, which I enjoy, and which readers often find interesting. With regard to my example of

     

                base *pb = pd;

     

    as one case in which the behavior of virtual base classes can be a significant bottleneck, the reader writes:

     

    So your example has indeed an impact, but other examples do not. Besides that, it also depends on the implementation of the MI below the surface how performance really is affected.

     

    Well, I would like to respond to this for two reasons. One is to illustrate the difficulty of answering someone who doesn’t wish a dialog, but wishes to push his or her own point. He began this discussion with the following thread:

     

    [ SBL ] “There are some significant implementation and performance problems with multiple inheritance - particularly virtual base classes which contain data members”

     

    [ Reader ]  That's up to the user to decide to opt for this 'performance hit' (which is not as significant as you make it).

     

    The reader generally pooh-poohed my suggestion that there are `some’ significant performance problems by suggesting that I was making these problems up. I gave an example, and he then said, well, ok, but that’s just one. Other examples do not. I could give others, but he can simply dismiss each one by saying, ok, but, as if he has a series of examples in his mind that he has not shared with us that have no performance overhead. So, I cannot win by providing a second or third example by itself. Rather, I will address the implementation issue that makes virtual base classes so unwieldy – for a more complete discussion [with some hard performance numbers] please see my text, Inside the C++ Object Model.

     

    The challenge of a virtual base class [or, rather, one that contains state] is that its position within the derived class(es) fluctuates with each non-trivial derivation, and therefore in a polymorphic use of the hierarchy containing a virtual base class, all access of the state members must be done at run-time. For example,

     

                class ios{ … };

                class istream : public virtual ios{ … };

                class ostream : public virtual ios { … };

     

    Ok. So both istream and ostream have access to the members of ios – what would happen if they were implemented, as `normal’ state members are, with fixed offsets? Well, consider this subsequent derivation:

     

                class iostream : public istream, public ostream {};

     

    At least one of the intermediate classes can no longer maintain the fixed offsets of the ios state members. In the general case, it is not practicable for fixed offsets to be maintained in the polymorphic case, and so all state access is a run-time rather than compile-time [that is, constant-expression] access. That is the nature of the beast.
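A small sketch makes the moving offset observable. `Ios`, `IStream`, `OStream` and `IOStream` are hypothetical stand-ins for the classes in the text, each given one data member so that the sub-objects occupy storage. The helper measures the byte distance from an intermediate sub-object to the shared virtual base.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for ios / istream / ostream / iostream.
struct Ios { int state = 0; };
struct IStream : virtual Ios { int in_state = 0; };
struct OStream : virtual Ios { int out_state = 0; };
struct IOStream : IStream, OStream {};

// Byte distance between two addresses inside the same complete object.
std::ptrdiff_t byte_distance(const void* from, const void* to) {
    return static_cast<const char*>(to) - static_cast<const char*>(from);
}

// Where does the virtual Ios base live relative to this IStream part?
std::ptrdiff_t ios_offset_in(const IStream& in) {
    const Ios& base = in;   // derived-to-virtual-base conversion
    return byte_distance(&in, &base);
}

// Same question for an OStream part.
std::ptrdiff_t ios_offset_in(const OStream& out) {
    const Ios& base = out;
    return byte_distance(&out, &base);
}

// Both intermediate parts of an IOStream reach one and the same Ios.
bool one_shared_ios(const IOStream& io) {
    const IStream& in = io;
    const OStream& out = io;
    const Ios& via_in = in;
    const Ios& via_out = out;
    assert(via_in.state == via_out.state);
    return &via_in == &via_out;
}
```

Because the two intermediate parts of an `IOStream` sit at different addresses but share a single `Ios`, the two offsets cannot both be the same. On the layouts I am aware of, at least one of them also differs from the offset measured in a standalone `IStream` or `OStream`, though the exact numbers are implementation-dependent.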

     

    Similarly, the initialization of a virtual base class with state members is carried out by the most derived class. In the istream derivation, this means that istream initializes the ios state members. In the case of the ostream derivation, this means that ostream initializes the ios state members. However, when iostream is defined, it is responsible for initializing the ios state members, and the invocation of the ios constructor within the istream and ostream constructors must be suppressed.
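The initialization rule can be checked directly. In this sketch the hypothetical `Ios` records which class initialized it (1 for `IStream`, 2 for `OStream`, 3 for `IOStream`; the tags and the counter are assumptions made for the example).

```cpp
#include <cassert>

// Hypothetical virtual base that records who initialized it.
struct Ios {
    static int constructions;
    int initialized_by;
    explicit Ios(int who) : initialized_by(who) { ++constructions; }
};
int Ios::constructions = 0;

struct IStream : virtual Ios {
    IStream() : Ios(1) {}   // used only when IStream is the most derived class
};

struct OStream : virtual Ios {
    OStream() : Ios(2) {}   // used only when OStream is the most derived class
};

struct IOStream : IStream, OStream {
    // The most derived class initializes the virtual base; the Ios(1) and
    // Ios(2) initializers in the intermediate constructors are suppressed.
    IOStream() : Ios(3), IStream(), OStream() {}
};

// Which class initialized the Ios part of a complete IOStream?
int who_initialized_iostream() {
    Ios::constructions = 0;
    IOStream io;
    assert(Ios::constructions == 1);   // the shared base is built only once
    return io.initialized_by;
}
```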

     

    There are two levels of complexity with this design [it was not the original design, by the way, but was added to the language based on user feedback]. The first level of complexity is that of the class design itself: the most derived class must be cognizant of the initialization strategy of classes farther up the hierarchy than those of its immediate base classes; this can get quite complicated. The second level of complexity is in the implementation of the constructor suppression; while there are a number of strategies, these do add overhead either in time or space.

     

    The reader claims, with a throwaway remark similar to the one made earlier – which is not as significant as you make it – that besides, it also depends on the implementation of the MI below the surface, as if he knows of some optimal strategies that I am unaware of. I am relatively aware of the implementation strategies of compilers within the industry – EDG, Microsoft, Sun, AT&T, etc. [I spent 6 years in computer animation and drifted away from C++ compilers, so maybe he knows particular implementations that I am unaware of – so I will leave that open for a further demonstration on his part]. But to my knowledge, the object model of a virtual base class with state requires overhead that can in many cases be significant.

     

    The one case in which the compiler can optimize out the overhead is when the virtual base class does not contain state members. And this is the case interfaces support.

     

    The other argument the reader makes is that “it should be up to the user to decide to opt for this ‘performance hit’” – and this is a reasonable statement, imo. This is why I always support the presence of multiple inheritance in ISO C++. I also then quote Grady Booch as regards his parachute quote – see the previous blog entry for that.

     

    The one caveat is that users often do not understand the overhead of virtual base classes with state members, and so do not make informed decisions. One demonstration of that is the following: when I used to teach C++ within industry, at least one developer would ask, if there is a possibility that a hierarchy may need at some undisclosed future time to multiply derive from a base class, shouldn’t I, to be safe, make all inheritance virtual?

     

    He then states,

     

    Furthermore: the choice should be there. Now I don't have a choice: it's the SI way or the highway. :). This is not right, although I know I can program 99% of the MI constructions using SI constructions.

               

    This is a fine concession from the reader. In his previous mail, he wrote

     

                [ Reader ]  Therefore, C++ is severily crippled.

     

    So, now it is simply a parachute issue with regard to 1% of the designs. And that gets back to what I said initially: we are a smallish group doing a large task – although that may not seem likely seen from the outside. The truth is, we do not have unlimited resources. As another example of this, from Release 1.1 through Release 2.0, C++ was just myself and Bjarne, and he was the smart one. I did everything else. But from the outside, adversaries were complaining about the 800 lb AT&T gorilla.

     

    There are two major design efforts in bringing C++ to .NET:

     

    (a)   To map the CLR object model to a syntax that is reasonably intuitive for a C++ programmer and a .NET programmer [this is the Janus dilemma described in an early blog]. Sometimes this is very difficult. If you don’t think so, look at the original language design.

     

    (b)   To extend the CLR object model at the language level for aspects of native C++ that we feel are critical to the C++ programmer. Static templates, for example, were extended to support managed types in addition to the dynamic generic templates of the CLR. This was considered critical. Similarly, we felt the absence of deterministic finalization was a serious problem with CLR programming, and so we chose to provide that as well. Both of these are non-trivial implementations. [The reader will no doubt protest at this point…] To us, these were clearly more important than multiple inheritance. Another issue that often comes up is default arguments. The CLR does not support them. We have chosen not to hide that absence from the user. Some people are very upset about that, but this is how you make calls in the real-world, and then you take your lumps. The important thing is to have thought it through and made a reasoned decision and gained consensus. [For example, I am saddened that the Hubble space telescope is going to be allowed to decay.]

     

    There is a third category of problems that have no good solution. For example, in my opinion, the CLR gets the calling resolution of virtual functions within constructors wrong. That is, in the base class sub-object constructor of a derived class constructor, the virtual instance invoked is that of the derived class, even though the derived class object itself is not initialized – although it is zeroed out. Why did they do this? My guess is because doing otherwise is hard. [I have a talk that explains why it is hard and how we solved it at Bell Labs. I’ll transcribe that at some point.] We haven’t solved that problem. The absence of const support in the base class library is also a problem with no obvious answer.
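For contrast with the CLR behavior just described, here is a sketch of what ISO C++ does: while the base class constructor runs, a virtual call resolves to the base class instance, so the derived override never sees its uninitialized members. `Base`, `Derived` and `who()` are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <string>

struct Base {
    std::string seen_during_construction;

    // During Base's constructor the object is treated as a Base, so the
    // virtual call below resolves to Base::who, never to an override.
    Base() { seen_during_construction = who(); }
    virtual ~Base() = default;
    virtual std::string who() const { return "Base"; }
};

struct Derived : Base {
    std::string label = "Derived";   // not yet constructed while Base() runs
    std::string who() const override { return label; }
};

// Returns what the Base constructor observed when building a Derived.
std::string observed_in_base_constructor() {
    Derived d;
    assert(d.who() == "Derived");    // after construction, normal dispatch
    return d.seen_during_construction;
}
```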

     

    That said, it is time for a confession: I misremembered Peter Jackson’s name and misspoke the reference to the director of the Lord of the Rings. Thanks to a [different] reader for pointing that out.

     

     

  • Followup: The Absence of MI

    That same reader from the previous posting now writes the following, where he quotes me and then makes his comment [this is great, by the way. this is much more engaging than the technical writing]:

     

    “There are some significant implementation and performance problems with multiple inheritance - particularly virtual base classes which contain data members”

     

     That's up to the user to decide to opt for this 'performance hit' (which is not as significant as you make it). There are also serious performance issues related to   datasets or to O/R mapping. You _don't have to_ use MI to write software, however in a SI world, things get complicated in several situations. I wrote a plea for MI in .NET here: http://weblogs.asp.net/fbouma/archive/2004/01/04/47476.aspx.

     

    Well, actually, the performance hit is quite significant. I base that on (a) supporting Stroustrup’s initial implementation under cfront for the 2.1 and 3.0 Releases of the compiler, (b) implementing it on my own in an experimental compiler within the Grail project within Bell Laboratories headed up by Stroustrup, and (c) doing a study of the implementations and costs in my book, Inside the C++ Object Model. For example, if the user writes

     

                base *pb = pd;

     

    if base is a virtual base class of the object pd addresses, the conversion has to be carried out at run-time. If pb is a global object, this requires the generation of a static initialization method that has to be executed before the beginning of main(). If this form occurs in enough modules, this can cause significant page faults at start-up, which is a significant blow to the application. The worst thing, however, is that the user can’t see the overhead in the statement since it looks trivial.
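A sketch of why the statement cannot be resolved at compile time: the function below receives only a `Left*`, yet the `Base` part it must locate sits in a different place depending on whether that `Left` is a complete object or part of a larger one. `Base`, `Left`, `Right` and `Bottom` are hypothetical names for this illustration.

```cpp
#include <cassert>

struct Base { int state = 7; };
struct Left : virtual Base { int left = 1; };
struct Right : virtual Base { int right = 2; };
struct Bottom : Left, Right {};

// Looks like a trivial pointer copy. With a virtual base, the compiler has
// to consult the object addressed by pd to find its Base part, and it has
// to keep a null pd null.
Base* to_base(Left* pd) {
    Base* pb = pd;
    return pb;
}

// The Left part and the Right part of one Bottom lead to the same Base.
bool same_base_through_both_paths(Bottom& b) {
    Base* via_left = to_base(&b);
    Base* via_right = static_cast<Right*>(&b);
    assert(via_left != nullptr);
    return via_left == via_right;
}
```

Calling `to_base` on a standalone `Left` and on the `Left` part of a `Bottom` runs the same source statement, but the address adjustment applied is in general different, which is why it is looked up at run-time.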

     

    It is always alarming when a person counters a statement by saying “it is not as significant as you make it” since it now becomes a he said, she said kind of dialog, which of course cannot be resolved. So, I will leave it at that.

     

    Besides that, C++ as a standard supports MI. You don't support MI on .NET's version of C++. Therefore C++ is severily crippled. You can give 10,000 excuses for that decision, that's not the point. The point is: it's not C++ anymore because you removed a CORNERSTONE of C++ from the language: if you want to use that feature, you have to go unmanaged. Now, isn't the future going towards a managed world? So I don't have a choice anymore?

     

    Well, the first part of this assertion is true. We don’t support MI on C++/CLI. [The you here is a bit inflated. My influence on the language is mostly by reputation.] It will have its own standard separate from that of ISO C++, and there are a number of differences between the two languages.

     

    The reader then says,

     

                Therefore, C++ is severily crippled.

     

    That therefore is not earned, although it has the pattern of a logical imperative. I have never used multiple inheritance, although, as I stated earlier, I have supported one implementation, and implemented a second. Tom Cargill has a long history of writing against the need for MI. Grady Booch has called it a parachute – something that might save one’s life in an emergency, but not something one uses every day. When I quoted that at an MI workshop at ECOOP, the developer of the Beta language shouted Hack, and everyone broke out laughing. I think that is my response to this statement of the reader.

     

    Multiple inheritance is not a cornerstone of the C++ language. It was introduced into the language in Release 2.0, in part because Brad Cox, the inventor of Objective C, said it couldn’t be done. Stroustrup has subsequently said that he wished he had done templates before MI. If we are going to simply make assertions, then my assertion is, Multiple Inheritance is a corner of C++, but hardly a stone’s throw from a dark corner at that. [This is called rhetoric :-)]

     

    In any case, clearly the reader is angry. It reminds me of an interview I read with Jacobson, the director of the Lord of the Rings trilogy. The woman asks, and what about this character? I really missed him. And he shrugs and asks, so you have been angry for two years? And then goes on and explains his reasoning. It also reminds me of a letter James Joyce responded to from a reader who complained at the lack of traditional characters in Ulysses. He responded that every book has its own inner logic that dictates its form, and that if you come to the book bringing your logic, there will be noise and frustration. Actually he didn’t quite say it that way. [And I mean noise here in the physics sense.]

     

    The reader, having squashed me on the one quote, then goes on to quote me again – it is an interesting aspect of a Blog that the quote is so fresh in my mind. Usually, I get a 2 to 10 year quote thunked across my puzzled brow.

     

    “Interfaces strike me as a potentially superior design; however, I don't have actual experience with their use. I'd like to see people gain some experience with interfaces before they claim a superiority with MI”

     

    You don't understand Interfaces clearly. You have two types of MI: multiple type inheritance and multiple implementation inheritance. .NET only supports the former, by offering the feature of multiple inheritance via interfaces. (Interfaces do support MI in .NET). This however is a 'hack', because although you have multiple type inheritance, you still can't inherit a given implementation of the interface from a given base class: you have to RE-implement the interface, due to the lack of multiple implementation inheritance.

     

    I've described in my blog about MI an abstract example and you can apply that abstract example to a lot of classes in .NET's API.

     

    Well, this is interesting. In my ECOOP workshop, the Europeans stereotyped Americans by complaining that they misused inheritance for implementation rather than type inheritance, and that implementation inheritance is, at best, a hack, if not downright immoral. The classic example of implementation inheritance is the use of an array to implement a stack. To a certain kind of person, because the Liskov substitution principle does not hold, and the public interface of the array has to be suppressed from the user, this design is evil. It is an old debate, and one that has been lost in the CLR, at least in terms of MI.
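The classic example can be sketched in ISO C++ with private inheritance, which reuses the implementation while suppressing the base class interface. Here `std::vector<int>` stands in for the array, and `Stack` is a hypothetical class written for this illustration.

```cpp
#include <cassert>
#include <vector>

// Implementation inheritance: Stack reuses vector's code but is not
// substitutable for a vector, because the base is private.
class Stack : private std::vector<int> {
public:
    void push(int v) { push_back(v); }

    int pop() {
        assert(!empty());       // popping an empty stack is a caller error
        int top = back();
        pop_back();
        return top;
    }

    // Re-expose only the parts of the array interface that suit a stack.
    using std::vector<int>::empty;
    using std::vector<int>::size;
};
```

A `Stack` cannot be passed where a `std::vector<int>&` is expected, and members such as `operator[]` or `insert` are not reachable by users, which is the suppression of the public interface that the debate is about.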

     

    The bottom line is: the CLR has chosen a SI + interface model. C++ had chosen a different model. There is a gap between the two. And it is a design decision whether or not to attempt to bridge that gap. We have decided not to bridge this gap. We have, however, decided to bridge the gap between the CLR and ISO C++ in terms of deterministic finalization [destructors] and support for the copy constructor. These seem more fundamental to the language – real cornerstones.

     

     

     

     

     


  • What about Templates and Multiple Inheritance

    A reader writes the following from the initial posting on this blog:

    re: The Revised C++ Language Design Supporting .NET -- Part 1

    “Probably the most conspicuous and eyebrow-lifting change between the original and revised design of the dynamic programming support within C++ is the change in the declaration of a .NET reference type.”

    I thought that the most conspicious and eyebrow lifting change was the absense of multiple inheritance and decent templates.

    Of course, on the surface level, the reader has simply misunderstood what I wrote in order to vent his frustration. I spoke of the most conspicuous change between the original and revised design of C++/CLI. Since neither multiple inheritance nor support of managed templates were provided within the original language design, their continued absence in the revised language could not be conspicuous – except in the paradoxical sense of `your silence is deafening.’

     

    I made this statement in order to point out the philosophical shift in the design of the C++/CLI language – that is, viewing the dynamic programming support of .NET as an additional paradigm to be supported by the language – in this sense similar to OO and generic programming (templates). This is the key to understanding the wholesale revision of the language.

     

    Another significant part of making the language first class is to provide support not just for managed templates, but to support the CLR generic features as well. Not only can one declare a managed type as a template, or as a generic, but we will also provide an STL.NET library. I did not mention this previously because my blog is limited to speaking to the mapping of the original language to the C++/CLI revision. Of course, new features have no mapping.

     

    The second issue, which is the support of multiple inheritance, is problematic for a number of reasons. The first, of course, is that the CLR, like Java and Smalltalk, does not support it directly. Thus, any implementation will need to internally flatten it. The question then is, has a need for it been demonstrated to warrant the expense of personnel? Currently, the answer is no.

     

    There are some significant implementation and performance problems with multiple inheritance – particularly virtual base classes which contain data members. Interfaces strike me as a potentially superior design; however, I don’t have actual experience with their use. I’d like to see people gain some experience with interfaces before they claim a superiority with MI.

     

    That said, there is a model of MI that we are looking at, which supports combining native and managed classes in a very interesting way. But that is in the future, and outside the boundaries of this blog. Look to the blogs of Herb Sutter and Brandon Bray for more information as time passes.

  • C++/CLI: __try_cast<> becomes safe_cast<>

    What’s Different in the Revised Language Definition?

    __try_cast<> becomes safe_cast<>

     

     

     

    Modifying an existing structure is a much different and, in some sense, a more difficult experience than crafting the initial structure; there are fewer degrees of freedom, and the solution tends towards a compromise between an ideal restructuring and what is practicable given the existing structural dependencies. If you have ever typeset a book, for example, you know that making corrections to an existing page is constrained by the need to limit the reformatting to just that page: you cannot allow the text/code/figure/table to spill over into subsequent pages, and so you cannot add or cut too much (or too little), and it too often feels as if the meaning of the correction is compromised in favor of its fit on the page.

     

    Language modification is an obvious second example. Back in the early 1990s, for example, as Object-Oriented programming became an important paradigm, the need for a type-safe downcast facility in C++ became pressing. Downcasting is the user-explicit conversion of a base-class pointer or reference to a pointer or reference of a derived class. Downcasting requires an explicit cast because, if the base class pointer does not actually address a derived class object, the program is likely to, well, do really bad things. The problem is that the actual type of the object addressed by the base class pointer is an aspect of the runtime; it therefore cannot be checked by the compiler. Or, to rephrase that, a downcast facility, just like a virtual function call, requires some form of dynamic resolution. This raises two questions:

     

    1. Why should a downcast be necessary in the Object-Oriented paradigm? Isn’t the virtual function mechanism sufficient in all cases? That is, why can’t one claim that any need for a downcast (or a cast of any sort) is a design failure on the part of the programmer?

     

    2. Why should support of a downcast be a problem in C++? After all, it is not a problem in object-oriented languages such as Smalltalk (or, subsequently, Java and C#). What is it about C++ that makes supporting a downcast facility difficult?

     

    A virtual function represents a type-dependent algorithm common to a family of types (I am not considering Interfaces, which are not supported in ISO C++ but are available in C++/CLI and which represent an interesting design alternative). The design of that family is typically represented by a class hierarchy in which there is an abstract base class declaring the common interface (the virtual functions) and a set of concrete derived classes which represent the actual family types in the application domain. A Light hierarchy in a Computer Generated Imagery (CGI) application domain, for example, will have common attributes such as color, intensity, position, on, off, and so on. One can pepper one’s world space with a fistful of lights, and control them through the common interface without worrying whether a particular light is a spotlight, a directional light, a non-directional light (think of the sun), or perhaps a barn-door light. In this case, downcasting to a particular light-type in order to exercise its virtual interface is unnecessary and, all things being equal, ill-advised. In a production environment, however, things are not always equal; in many cases, what matters is speed. One might choose to downcast and explicitly invoke each method if by doing so an inline execution of the calls can be exercised in place of going through the virtual mechanism. So, one reason to downcast in C++ is to suppress the virtual mechanism in return for a significant gain in runtime performance. (Note that the automation of this manual optimization is an active area of research. However, it is more difficult to solve than replacing the explicit use of the register or inline keyword.)
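
    As a rough sketch of that manual optimization in ISO C++ (the Light and SpotLight classes and both helper functions are hypothetical names, not code from a real CGI library), the same method can be reached either through the virtual mechanism or through an unchecked static downcast followed by a qualified call:

```cpp
#include <cassert>

// Hypothetical Light hierarchy, for illustration only.
class Light {
public:
    virtual ~Light() = default;
    virtual double intensity() const { return 1.0; }
};

class SpotLight : public Light {
public:
    double intensity() const override { return 2.5; }
};

// Ordinary use: the call is resolved through the virtual table at run time.
double via_virtual(const Light& light) {
    return light.intensity();
}

// Manual optimization: the caller promises that `light` really is a SpotLight.
// static_cast performs no run-time check, and the qualified call names
// SpotLight::intensity directly, so the compiler is free to inline it.
double via_static_downcast(const Light& light) {
    const SpotLight& spot = static_cast<const SpotLight&>(light);
    return spot.SpotLight::intensity();
}
```

    Both functions return the same value for a SpotLight argument. Passing any other kind of Light to via_static_downcast is undefined behavior, which is exactly the risk the programmer accepts in exchange for the speed.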

     

    A second reason to downcast falls out of the dual nature of polymorphism. One way to think of polymorphism is as being divided into a passive and a dynamic form. A virtual invocation (and a downcast facility) represents dynamic uses of polymorphism: one is performing an action based on the actual type of the base class pointer at that particular instant in the execution of the program. Assigning a derived class object to its base class pointer, however, is a passive form of polymorphism; it is using the polymorphism as a transport mechanism. This is the main use of Object, for example, in the pre-generic CLR. When used passively, the base class pointer chosen for transport and storage typically offers an interface that is too abstract. Object, for example, provides roughly five methods through its interface; any more specific behavior requires an explicit downcast. For example, if we wish to adjust the angle of our spotlight or its rate of falloff, we would need to downcast explicitly. A virtual interface within a family of sub-types cannot practicably be a superset of all the possible methods of its many children, and so a downcast facility will always be needed within an object-oriented language.
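
    A minimal ISO C++ sketch of that passive use, assuming a hypothetical scene stored as base-class pointers (the class names, set_cone_angle, and narrow_spotlights are all invented for illustration): the container only transports Light objects, and a checked downcast is needed to reach the spotlight-specific method.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical types; the base class offers no spotlight-specific interface.
class Light {
public:
    virtual ~Light() = default;
};

class SpotLight : public Light {
public:
    void set_cone_angle(double degrees) { cone_angle_ = degrees; }
    double cone_angle() const { return cone_angle_; }
private:
    double cone_angle_ = 30.0;
};

class DirectionalLight : public Light {};

// The scene transports lights passively as Light pointers. Adjusting the
// cone angle requires recovering the SpotLight type with a checked downcast.
// Returns the number of spotlights that were adjusted.
int narrow_spotlights(std::vector<std::unique_ptr<Light>>& scene, double degrees) {
    int adjusted = 0;
    for (auto& light : scene) {
        if (SpotLight* spot = dynamic_cast<SpotLight*>(light.get())) {
            spot->set_cone_angle(degrees);
            ++adjusted;
        }
    }
    return adjusted;
}
```

    Lights that are not spotlights are simply skipped, because the pointer form of dynamic_cast yields a null pointer for them.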

     

    If a safe downcast facility is needed in an object-oriented language, then why did it take C++ so long to add one? The problem is in how to make the information as to the run-time type of the pointer available. In the case of a virtual function, as most people know by now, the run-time information is set up in two parts by the compiler: (a) the class object contains an additional virtual table pointer member (either at the beginning or end of the class object; that has an interesting history in itself) that addresses the appropriate virtual table – so, for example, a spotlight object addresses a spotlight virtual table, a directional light addresses a directional light virtual table, and so on; and (b) each virtual function has an associated fixed slot in the table, and the actual instance to invoke is represented by the address stored within the table. So, for example, the virtual Light destructor might be associated with slot 0, Color with slot 1, and so on. This is an efficient if inflexible strategy because it is set up at compile-time and represents a minimal overhead.
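
    The two-part arrangement can be modeled by hand in plain C++. This is only a sketch of the idea, under the assumption of one table per type and one fixed slot per function; real compilers choose their own layouts, and every name here (LightObj, LightVTable, call_intensity) is hypothetical.

```cpp
#include <cassert>

struct LightObj;  // forward declaration for the function-pointer types below

// Part (b): one fixed slot per virtual function, laid out at compile time.
struct LightVTable {
    void   (*destroy)(LightObj*);          // slot 0
    double (*intensity)(const LightObj*);  // slot 1
};

// Part (a): every object carries a pointer to the table for its actual type.
struct LightObj {
    const LightVTable* vptr;
    double base_intensity;
};

void   destroy_nothing(LightObj*) {}
double spot_intensity(const LightObj* self)        { return self->base_intensity * 2.0; }
double directional_intensity(const LightObj* self) { return self->base_intensity; }

// One table per concrete type.
const LightVTable spot_vtable        = { destroy_nothing, spot_intensity };
const LightVTable directional_vtable = { destroy_nothing, directional_intensity };

// Roughly what a call such as plight->intensity() is translated into:
// fetch the table through the object, then call through the fixed slot.
double call_intensity(const LightObj* plight) {
    return plight->vptr->intensity(plight);
}
```

    The caller never names the concrete type; the table address stored in the object selects the function, at the cost of one extra pointer per object and one indirection per call.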

     

    The problem, then, is how to make the type information available to the pointer without changing the size of C++ pointers, either by perhaps adding a second address or directly adding some sort of type encoding. This would not be acceptable to those programmers (and programs) that choose not to use the object-oriented paradigm – which was still the predominant user community. Another possibility was to introduce a special pointer for polymorphic class types, but this would be awfully confusing, and make it very difficult to intermix the two – particularly with issues of pointer arithmetic. Nor would it be acceptable to maintain a run-time table associating each pointer with its currently associated type, and dynamically updating it.

     

    The problem then is a pair of user-communities which have different but legitimate programming aspirations. The solution needs to be a compromise between the two communities, allowing each not only their aspiration but the ability to interoperate. This means that the solutions offered by either side are likely to be infeasible and the solution implemented finally to be less than perfect. The actual resolution revolves around the definition of a polymorphic class: a polymorphic class is one that contains a virtual function. A polymorphic class supports a dynamic type-safe downcast. This solves the maintain-the-pointer-as-address problem because all polymorphic classes contain that additional pointer member to their associated virtual table. The associated type information, therefore, can be stored in an expanded virtual table structure. The cost of the type-safe downcast is (almost) localized to users of the facility.
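
    In today's ISO C++ the distinction can be observed directly. The sketch below uses hypothetical Point, Light, and SpotLight types; std::is_polymorphic and typeid are standard facilities. A class with no virtual function carries no run-time type information, while a class with one can report the dynamic type of the object it refers to.

```cpp
#include <cassert>
#include <type_traits>
#include <typeinfo>

struct Point { int x; int y; };                  // no virtual function
struct Light { virtual ~Light() = default; };    // polymorphic: has a virtual function
struct SpotLight : Light {};

static_assert(!std::is_polymorphic<Point>::value,
              "Point has no virtual table and pays nothing for the facility");
static_assert(std::is_polymorphic<Light>::value,
              "Light has a virtual table that can also hold type information");

// typeid applied to a reference to a polymorphic class reports the dynamic type.
bool is_really_spotlight(const Light& light) {
    return typeid(light) == typeid(SpotLight);
}
```

    Applying dynamic_cast to a Point pointer as the source of a downcast would be rejected at compile time, which is how the cost stays (almost) localized to classes that already have a virtual table.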

     

    The next issue concerning the type-safe downcast was its syntax. Because it is a cast, the original proposal to the ISO/ANSI C++ committee used the unadorned cast syntax, so that one wrote, for example,

     

                spot = ( SpotLight* ) plight;

     

    but this was rejected by the committee because it did not allow the user to control the cost of the cast. If the dynamic type-safe downcast had the same syntax as the previously unsafe but static cast notation, then it becomes a substitution, and the user has no ability to suppress the run-time overhead in cases where it is unnecessary and perhaps too costly.

     

    In general, in C++, there is always a mechanism by which to suppress compiler-supported functionality. For example, we can turn off the virtual mechanism by either using the class scope operator (Box::rotate(angle)) or by invoking the virtual method through a class object (rather than a pointer or reference of that class). This latter suppression is not required by the language but is a quality-of-implementation issue, similar to the suppression of the construction of a temporary in a declaration of the form

     

                X x = X( 10 ); // compilers are free to optimize away the temporary …
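
    A small sketch of the suppression of the virtual mechanism, assuming a hypothetical Shape base class for the Box mentioned above (the rotate bodies are invented purely so that the two versions return different values):

```cpp
#include <cassert>

struct Shape {
    virtual ~Shape() = default;
    virtual int rotate(int angle) { return angle; }
};

struct Box : Shape {
    // Illustrative override: a box looks the same every 90 degrees.
    int rotate(int angle) override { return angle % 90; }
};

// Virtual mechanism in force: the dynamic type of `shape` selects the function.
int rotate_dynamic(Shape& shape, int angle) {
    return shape.rotate(angle);
}

// Class scope operator: Shape::rotate is called regardless of the dynamic type.
int rotate_as_shape(Shape& shape, int angle) {
    return shape.Shape::rotate(angle);
}

// Call through a class object: the type is known statically, so an
// implementation may (but need not) bypass the virtual table.
int rotate_object(Box box, int angle) {
    return box.rotate(angle);
}
```

    For a Box and an angle of 100, the dynamic and by-object calls yield 10, while the scope-qualified call yields 100.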

     

    So the proposal was taken back for further consideration, and a number of alternative notations were considered. The one brought back to the committee was of the form (?type), which indicated its undetermined – that is, dynamic – nature. This gave the user the ability to toggle between the two forms – static or dynamic – but no one was too pleased with it. The third and successful notation is the now standard dynamic_cast<type>, which was generalized to a set of four new-style cast notations.

     

    In ISO C++, dynamic_cast returns 0 when applied to an inappropriate pointer type, and throws a std::bad_cast exception when applied to an inappropriate reference type. In the original language design, applying dynamic_cast to a managed reference type (because of its pointer representation) always returned 0 on failure. __try_cast<type> was introduced as an analog to the exception-throwing variant of dynamic_cast, except that it throws System::InvalidCastException if the cast fails.
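
    The two ISO C++ failure modes can be seen side by side in a short sketch (the Light classes and the two helper functions are hypothetical; dynamic_cast and std::bad_cast are standard):

```cpp
#include <cassert>
#include <typeinfo>  // std::bad_cast

struct Light { virtual ~Light() = default; };
struct SpotLight : Light {};
struct DirectionalLight : Light {};

// Pointer form: a failed cast is reported as a null pointer.
bool pointer_cast_is_null(Light* plight) {
    return dynamic_cast<SpotLight*>(plight) == nullptr;
}

// Reference form: there is no null reference, so a failed cast throws.
bool reference_cast_throws(Light& light) {
    try {
        SpotLight& spot = dynamic_cast<SpotLight&>(light);
        (void)spot;  // the cast succeeded; nothing more to do in this sketch
        return false;
    } catch (const std::bad_cast&) {
        return true;
    }
}
```

    Given a DirectionalLight, the first function returns true and the second reports the exception; given a SpotLight, both return false.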

     

    In the revised language, __try_cast has been recast as safe_cast, and its definition is provided within the stdcli::language  namespace (it is not a keyword as are the other four cast notations). For example, here is a code fragment in the original language (with either some nifty or confusing look-ahead at changes to the declaration and use of a managed array),

     

    public __gc class ItemVerb;

    public __gc class ItemVerbCollection

    {

    public:

          ItemVerb *EnsureVerbArray() []

          {

                return __try_cast<ItemVerb *[]>(verbList->ToArray(__typeof(ItemVerb *)));

          }

    };

     

    Here is the same code fragment in the revised language,

     

    using namespace stdcli::language;

    public ref class ItemVerb;

    public ref class ItemVerbCollection

    {

    public:

          array<ItemVerb^>^ EnsureVerbArray()

          {

                return safe_cast<array<ItemVerb^>^>(verbList->ToArray( typeid<ItemVerb^> ));

          }

     

    };

     

    [Notice, too, that __typeof has been replaced by an additional form of typeid that returns a Type^ when passed a managed type, where the template notation distinguishes the managed form of the operator from the native form. This integrates the two analogous operations with an analogous syntax and avoids introducing a new keyword. In the introduction of the gcnew operator, you see an alternative design solution; that is, one moving from a transparent reuse of the new operator (with an optional __gc modifier in cases of ambiguity) in the original language design to the introduction of a separate keyword (to clearly indicate the separate heaps from which the allocation is being made).]

     

    To finish this entry, we need to return to that trade-off design space with which I opened this blog entry. It is the same one that led to the introduction of the new-style notation.

     

    On the one hand, in the managed world, it is important to allow for verifiable code by taming the ability of programmers to cast between types in ways that leave the code unverifiable. This is a critical aspect of the dynamic programming paradigm represented by C++/CLI. For this reason, instances of old-style casts are recast internally as run-time casts, so that, for example,

     

                // internally recast into the equivalent safe_cast expression above

    ( array<ItemVerb^>^ ) verbList->ToArray( typeid<ItemVerb^> );

     

    On the other hand, because polymorphism provides both an active and a passive mode, it is sometimes necessary to perform a downcast simply to gain access to the non-virtual API of a subtype. This can happen, for example, with the member(s) of a class that wish to address any type within the hierarchy [passive polymorphism as a transport mechanism] but for which the actual instance within a particular program context is known. In this case, the system programmer feels very strongly that having a run-time check of the cast is an unacceptable overhead. If C++/CLI is to serve as the system programming language of .NET, it must provide some means of allowing a compile-time [that is, static] downcast. This is provided in the revised language with the static_cast notation:

     

    // ok: cast performed at compile-time. No run-time check for type correctness

    static_cast< array<ItemVerb^>^>( verbList->ToArray( typeid<ItemVerb^> ));

     

    The problem, of course, is that there is no way to guarantee that the programmer doing the static_cast is correct and well-intentioned; that is, there is no way to force managed code to be verifiable. This is a more urgent concern under the dynamic programming paradigm than under native, but it is not sufficient reason, within a system programming language, to deny the user the ability to toggle between a static and a run-time cast.
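
    The toggle can be approximated in native ISO C++. The sketch below is only an analogy and is not C++/CLI: checked_cast and unchecked_cast are hypothetical helper templates, with std::runtime_error standing in for System::InvalidCastException.

```cpp
#include <cassert>
#include <stdexcept>

struct Light { virtual ~Light() = default; };
struct SpotLight : Light {};
struct DirectionalLight : Light {};

// Stand-in for the run-time form: verify the dynamic type, throw on mismatch.
template <typename Derived, typename Base>
Derived* checked_cast(Base* p) {
    Derived* d = dynamic_cast<Derived*>(p);
    if (p != nullptr && d == nullptr)
        throw std::runtime_error("invalid cast");
    return d;
}

// Stand-in for the static form: no run-time check, no overhead, and
// undefined behavior if the caller is wrong about the actual type.
template <typename Derived, typename Base>
Derived* unchecked_cast(Base* p) {
    return static_cast<Derived*>(p);
}
```

    When the pointer really addresses a SpotLight, both helpers return the same address; only the checked form notices when it does not.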

     

    What are the weaknesses of the design as it stands? Well, I see two, in ascending order of being problematic.

     

    1. The purpose of the new cast notation is to call attention to the severity of the cast being performed – static_cast means that it is well-defined, but still requires some manual intervention by the programmer. It is the most benign of casts. Morally, it is the closest to goodness. Then there is const_cast. This is less good. We are throwing away something that prevents modification. This is potentially very disruptive to the semantic integrity of the program, and so we label it as such. [An historical aside: during one of the betas of cfront Release 2.0, Stroustrup removed the ability of a user to cast away const in the old-style cast notation. We discovered that none of our beta users could compile their code-base without support for casting away const, and that none of them were so alarmed at the presence of their const-casting as to feel the need to recast their code. Thus, in a subsequent beta, support for const-casting was reintroduced.] And then of course there is that villainous no-count reinterpret_cast that casts a pall of perfidy on the programmer making use of it. If we are to believe that verifiability is an important aspect of .NET programming, then the use of static_cast as the violating notation is at best misleading. It fails to register what is going on. If the idea behind the use of safe_cast is to indicate a verifiable cast, then the static toggle for performance should be either unsafe_cast or down_cast. In this case, I suggest the design is being penny-wise, in saving the cost of an additional contextual keyword, and pound-foolish.

     

    2. The result of this design is a C++/CLI performance trap and pitfall. In native programming, there is no difference in performance between the old-style cast notation and the new-style static_cast notation. But in the new language design, the old-style cast notation is significantly more expensive than the use of the new-style static_cast notation since the compiler internally transforms the use of the old-style notation into a run-time check that throws an exception. Moreover, it also changes the execution profile of the code because it results in an uncaught exception bringing down the application – perhaps wisely, but the same error would not cause that exception if the static_cast notation were used. One might argue, well, this will help prod users into using the new-style notation. But only when it fails; otherwise, it will simply cause programs that use the old-style notation to run significantly slower with no visible indication of why, similar to the following C++ programmer pitfalls:

     

    // pitfall # 1: initialization can remove a temporary class object, assignment cannot

    Matrix m;    

    m = another_matrix;  

     

    // pitfall # 2: declaration of class objects far from their use

    Matrix m( 2000, 2000 ), n( 2000, 2000 );

    if ( ! mumble ) return;
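
    Both pitfalls can be made visible by counting the special member functions that run. This is a sketch with a hypothetical, empty Matrix class and a hypothetical global Tally of calls; a real Matrix would allocate and copy storage in each of these functions.

```cpp
#include <cassert>

// Hypothetical bookkeeping of which Matrix operations were executed.
struct Tally {
    int default_ctor = 0;
    int sized_ctor   = 0;
    int copy_ctor    = 0;
    int copy_assign  = 0;
};
Tally tally;

struct Matrix {
    Matrix()              { ++tally.default_ctor; }
    Matrix(int, int)      { ++tally.sized_ctor; }
    Matrix(const Matrix&) { ++tally.copy_ctor; }
    Matrix& operator=(const Matrix&) { ++tally.copy_assign; return *this; }
};

// Pitfall 1: default construction followed by assignment costs two operations.
void construct_then_assign(const Matrix& other) {
    Matrix m;
    m = other;
}

// The alternative: a single copy construction.
void initialize_directly(const Matrix& other) {
    Matrix m = other;
}

// Pitfall 2: both matrices are built even when the function returns at once.
void declare_early(bool mumble) {
    Matrix m(2000, 2000), n(2000, 2000);
    if (!mumble) return;
}

// The alternative: declare the objects after the early return.
void declare_late(bool mumble) {
    if (!mumble) return;
    Matrix m(2000, 2000), n(2000, 2000);
}
```

    Resetting tally before each call shows one default construction plus one assignment for the first form against a single copy construction for the second, and two sized constructions against none when mumble is false.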

     

     

    A New Tradition: Question of the Day

     

    It is the nature of science – and language design is a serious part of computer science – that participants disagree, and that debate follows in which one participant or the other may prove wrong. Einstein, for example, did not accept Bohr’s theory of quantum mechanics and periodically shot off challenges to its correctness, and Bohr successfully deflected each missive. While it is far too preposterous to suggest that these two points above have any relationship beyond the most trivial to the great physics debate of the past century, it does suggest a first question in a new blog tradition I would like to initiate: to ask an interesting question about historical scientific endeavors.

     

    Bohr’s original quantum model, which he devised to explain the behavior of the hydrogen atom, did not scale to atoms which contained multiple electrons. Heisenberg and Schrödinger provided competing theories explaining a generalized quantum mechanics. Each dismissed the other’s work publicly in rather harsh terms as hogwash. In a similar but more elegant manner, Einstein dismissed quantum theory, and misspent most of his life in America failing to disprove it (had he been in a company, he would no doubt have been dismissed or demoted as unproductive at his hiring level). All three were wrong in their criticisms. Why does Einstein’s incorrect criticism redound to his credit, whereas the criticisms of both Heisenberg and Schrödinger are mere personal failings all too familiar when competing scientists evaluate the work of others?

     

    disclaimer: This posting is provided "AS IS" with no warranties, and confers no rights.     

     

     

  • Conversion Operators

    What’s Different in the Revised Language Definition?

    Conversion Operators

     

     

     

    Speaking of that grotty feel to the language, having to write op_Implicit to specify a conversion just didn’t feel like C++ in the T1 language design. For example, here is a definition of MyDouble taken from the T1 language specification:

     

    __gc struct MyDouble

    {

       static MyDouble* op_Implicit( int i );

       static int op_Explicit( MyDouble* val );

       static String* op_Explicit( MyDouble* val );

    };

     

    This says that, given an integer, the algorithm for converting that integer into a MyDouble is provided by the op_Implicit operator. Moreover, that conversion will be carried out implicitly by the compiler. Similarly, given a MyDouble object, the two op_Explicit operators provide the respective algorithms for converting that object into either an integer or a managed String entity. However, the compiler will not carry out the conversion unless explicitly requested by the user.

     

    In C#, this would look as follows:

     

    class MyDouble

    {

       public static implicit operator MyDouble( int i );

       public static explicit operator int( MyDouble val );

       public static explicit operator string( MyDouble val );

    };

     

    And apart from the weirdness of the explicit public access label for each member, the C# code looks a lot more like C++ than the Managed Extensions to C++ does. When I was writing C# code, I could usually guess at the meaning of a construct without going to the book and that tickled my C++ basic instincts. When I write T1 code, I can never figure things out [it seems] without cracking open the book, or asking one of the experts here. Not only does that make me feel dumb, but it breaks my coding `zone’, and I don’t feel as good about what I’m doing [that is, my code], which is not good if I spend a good part of my day coding.

     

    So we had to fix that. But how should we do that?

     

    On one hand, C++ programmers are left slightly reeling by the absence of a single argument constructor being construed as a conversion operator. On the other hand, however, that design proved grotty enough to manage that the ISO committee introduced a keyword, explicit, just to rein in its unintended consequences – for example, an Array class which takes a single integer argument as a dimension will implicitly convert any integer into an Array object even when that is the very last thing one wants. Andy Koenig was the first person who brought that to my attention when he explained a design idiom of a dummy second argument to a constructor just to prevent such a bad thing from happening. So I don’t regret at all the absence of a single constructor implicit conversion semantic for C++/CLI.
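
    The Array problem and the effect of explicit can be sketched in ISO C++ as follows. LooseArray, StrictArray, and length_of are hypothetical names; std::is_convertible and std::is_constructible are the standard type traits.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Single-argument constructor without explicit: it doubles as a conversion.
class LooseArray {
public:
    LooseArray(std::size_t dimension) : size_(dimension) {}
    std::size_t size() const { return size_; }
private:
    std::size_t size_;
};

// The same constructor marked explicit: construction only on request.
class StrictArray {
public:
    explicit StrictArray(std::size_t dimension) : size_(dimension) {}
    std::size_t size() const { return size_; }
private:
    std::size_t size_;
};

std::size_t length_of(const LooseArray& a) { return a.size(); }

static_assert(std::is_convertible<int, LooseArray>::value,
              "any integer silently becomes a LooseArray");
static_assert(!std::is_convertible<int, StrictArray>::value,
              "explicit blocks the implicit conversion");
static_assert(std::is_constructible<StrictArray, int>::value,
              "direct construction of a StrictArray still works");
```

    A call such as length_of(7) compiles and quietly builds a seven-element LooseArray temporary, which is the unintended consequence described above; the equivalent function taking a StrictArray would reject the call.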

     

    At the same time, it is never a good idea to provide a conversion pair when designing a class type within C++. The best example of that is the standard string class. The implicit conversion is the single-argument constructor taking a C-style string. However, it does not provide the corresponding implicit conversion operator – that of converting a string object to a C-style string – but rather requires the user to explicitly invoke a named function – in this case, c_str().

     

    So, associating an implicit/explicit behavior on a conversion operator [as well as encapsulating the set of conversions to a single form of declaration] seems an improvement on the original C++ support for conversion operators, which has been a public cautionary tale since 1988 when Robert Murray gave a Usenix C++ talk  entitled Building Well-Behaved Type Relationships in C++ and which led to an explicit keyword. The revised T2 language support for conversion operators looks as follows:

     

    ref struct MyDouble

    {

    public:

          static operator MyDouble^ ( int i );

          static explicit operator int ( MyDouble^ val );

          static explicit operator String^ ( MyDouble^ val );

    };

     

    where the default behavior for the operator is to support an implicit application of the conversion algorithm, freeing us from having to introduce yet another contextual keyword (yack!).
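
    For comparison, ISO C++11 later arrived at a similar split for native classes: a conversion is implicit unless marked explicit. The sketch below is native C++, not C++/CLI syntax. It expresses the int-to-MyDouble direction as a converting constructor rather than a static operator, and the value member and the to_int and to_text helpers are assumptions added for illustration.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

struct MyDouble {
    double value;

    // Implicit by default: an int converts to MyDouble without a cast.
    MyDouble(int i) : value(i) {}

    // Explicit: these conversions happen only when the user asks for them.
    explicit operator int() const { return static_cast<int>(value); }
    explicit operator std::string() const { return std::to_string(value); }
};

static_assert(std::is_convertible<int, MyDouble>::value,
              "int to MyDouble is applied implicitly");
static_assert(!std::is_convertible<MyDouble, int>::value,
              "MyDouble to int requires an explicit cast");

int to_int(const MyDouble& d)          { return static_cast<int>(d); }
std::string to_text(const MyDouble& d) { return static_cast<std::string>(d); }
```

    Writing MyDouble d = 42; compiles as is, whereas int i = d; does not; the cast in to_int is what requests the explicit conversion.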

     

     

    disclaimer: This posting is provided "AS IS" with no warranties, and confers no rights.     

     

     

  • Declaring Delegates and Events

    What’s Different in the Revised Language Definition?

    Delegates and Events

     

     

     

    The only change to the declarations of a delegate and a trivial event is the removal of the double underscore, as in the following sample. As these things go, this has proved to be completely non-controversial :-). That is, there have been no advocates for the retention of the double underscore, which everyone now seems to agree gave the original language a somewhat grotty feel.

     

    // the original language (T1)

    __delegate void ClickEventHandler(int, double);

    __delegate void DblClickEventHandler(String*);

     

    __gc class EventSource {

                __event ClickEventHandler* OnClick; 

                __event DblClickEventHandler* OnDblClick; 

     

                // …

    };

     

    // the revised language (T2)

    delegate void ClickEventHandler( int, double );

    delegate void DblClickEventHandler( String^ );

     

    ref class EventSource

    {

          event ClickEventHandler^ OnClick;

          event DblClickEventHandler^ OnDblClick;

    // …

    };

     

    Events (and delegates) are reference types, which is more apparent in T2 due to the presence of the hat (^).  Events support an explicit declaration syntax as well as the trivial form. In the explicit form, the user specifies the add(), raise(), and remove() methods associated with the event. (Only the add() and remove() methods are required; the raise() method is optional.)

     

    Under the T1 design, if the user chooses to provide these methods, she must not provide an explicit event declaration, although she must decide on a name for the event that is not present. Each individual method is specified in the form add_EventName, raise_EventName, and remove_EventName, as in the following example taken from the T1 language specification:

     

    // under the original T1 language

    // explicit implementations of add, remove, raise …

     

    public __delegate void f(int);

    public __gc struct E {

       f* _E;

    public:

       E() { _E = 0; }

     

       __event void add_E1(f* d) { _E += d; }

     

       static void Go() {

          E* pE = new E;

          pE->E1 += new f(pE, &E::handler);

          pE->E1(17);

          pE->E1 -= new f(pE, &E::handler);

          pE->E1(17);

       }

     

    private:

       __event void raise_E1(int i) {

          if (_E)

             _E(i);

       }

     

    protected:

       __event void remove_E1(f* d) {

          _E -= d;

       }

    };

     

    The problems with this design are largely cognitive rather than functional. Although the design supports adding these methods, it is not immediately clear from looking at the above sample exactly what is going on. As with the T1 property and indexed property, the methods are shotgunned across the class declaration. Slightly more unnerving is the absence of the actual E1 event declaration. (Once again, the underlying details of the implementation penetrate up through the user-level syntax of the feature, adding to the apparent lexical complexity.) It simply labors too hard for something that is really not all that complex. The T2 design hugely simplifies the declaration, as the following translation demonstrates. An event specifies the two or three methods within a pair of curly braces following the declaration of the event and its associated delegate type, as follows:

     

    // the revised T2 language design

    delegate void f( int );

    public ref struct E

    {

    private:

          f^ _E; // yes, delegates are also reference types

     

    public:

          E()

          {

                _E = nullptr; // note the replacement of 0 with nullptr!

          }

     

          // the T2 aggregate syntax of an explicit event declaration

          event f^ E1

          {

          public:

                void add( f^ d )

                {

                      _E += d;

                }

     

          protected:

                void remove( f^ d )

                {

                      _E -= d;

                }

     

          private:

                void raise( int i )

                {

                      if ( _E )

                           _E( i );

                }

     

          }

          static void Go()

          {

                E^ pE = gcnew E;

                pE->E1 += gcnew f( pE, &E::handler );

                pE->E1( 17 );

                pE->E1 -= gcnew f( pE, &E::handler );

                pE->E1( 17 );

          }

     

    };
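
    For readers more at home in native code, the add/remove/raise triple can be approximated in ISO C++. This is only an analogy under stated assumptions: the Event class, the Handler alias, and the record function are hypothetical, plain function pointers stand in for delegates, and nothing here reflects how the CLR actually implements multicast delegates.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical native stand-in for an event whose delegate type is void(int).
class Event {
public:
    using Handler = void (*)(int);

    // Corresponds to +=
    void add(Handler h) { handlers_.push_back(h); }

    // Corresponds to -=; removes one matching handler if present.
    void remove(Handler h) {
        auto it = std::find(handlers_.begin(), handlers_.end(), h);
        if (it != handlers_.end())
            handlers_.erase(it);
    }

    // Corresponds to invoking the event; does nothing when no handler is attached.
    void raise(int i) const {
        for (Handler h : handlers_)
            h(i);
    }

private:
    std::vector<Handler> handlers_;
};

// A sample handler that records the last value it was given.
int last_seen = 0;
void record(int i) { last_seen = i; }
```

    Mirroring Go() above: after add(record), raise(17) sets last_seen to 17; after remove(record), a second raise leaves it unchanged.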

     

    Although people tend to discount syntax as non-glamorous and trivial in terms of language design, it actually has a significant if largely unconscious impact on the user’s cognitive experience of the language. A confusing or inelegant syntax increases the hazardousness of the development process in much the same way that a dirty or fogged windshield increases that of driving. In T2, we’ve tried to make the syntax as transparent as a highly polished, newly installed windshield.  

     

    disclaimer: This posting is provided "AS IS" with no warranties, and confers no rights.     

     

     


© 2009 Microsoft Corporation. All rights reserved. Terms of Use  |  Trademarks  |  Privacy Statement