|
|
C++/CLI
-
In the last entry, I celebrated what I felt was an elegant solution to the problem of the string literal in the context of overload function resolution. But it turns out there is another area in which the string literal proves problematic. Who would have thought such a foobar kind of entity could cause so much trouble? It's these kinds of ambushes that make the extension of a language so unpredictable.
So, here's the problem:

    throw "fritz";

What does the compiler do? It has to set everything up at compile-time; the throw then actually gets handled by a run-time library that makes use of a limited form of type reflection to figure out the kind of object that has been thrown. There's no context, really. We don't know if the person is going to catch x, y, or z, so we can't know what to throw in any particular instance. This is all set up at compile-time. Now, here is the rub:

    void f()
    {
        try { g(); }
        catch ( String^ s ) {}
        catch ( const char* pcs ) {}
    }

    void g() { throw "fritz"; }

What gets caught? What an impossible question. One could argue that if we are compiling for the CLR, then String^ should match, and if we are in native, then const char* should, and so on. But there's no reason to believe that just because we are compiling for the CLR, people are using String and not just const char*. So, a string literal is thrown as a type const char*, not as String^.
My first reaction was, gosh, that's terrible. Then, I thought, what am I thinking? It seems a very small matter. It's not even something I would run into since I don't throw anything but class objects. I just add it for completeness. I think if this bothers you, then you are bothered by our entire effort. Just as a sounding. That is, not everybody is happy with our work on C++/CLI. There are some folks out there who are concerned that we are somehow messing with their language, and that we shouldn't be allowed to do that. We aren't doing it that way at all. We have really wanted to be good C++ citizens while being excellent Microsoft employees.
I have never felt any pressure to compromise either while I have been here. We have had an extraordinary degree of freedom not simply in our design, but in our being able to reach out and work with the general C++ community. This language is a coalition. I think we have all wanted to put the best face on C++ in what we regard as an otherwise hostile environment for C++. We think this is a win-win situation for everyone. If you don't like something, you should let us know. We're not a hundred thousand leagues removed from our users. If you want to use the language, you have every right to tell us what you think about it; how you find it; what you want. If you make a good case for it, I think it could be done. That's just my opinion. But I think we have an opportunity, together as a C++ community, to put forth C++ in an otherwise hostile environment that for some unbelievable reason thinks C# or Java comes anywhere near being as good a language. I don't understand it. It's not a question of knocking those languages. They took so much from C++ -- it's not hard, after all the hard stuff was worked out, to go back and clean it up. Sure. C++ is messy. It was real hard to do. It was getting done while we were using it. We didn't know where it was going. Bjarne was implementing it, using it; it became the tool we all used to make our living. We wanted it to be the best possible language for programmers. At least I did. I don't have a philosophy. I don't make standards. I just program and write. And I do that best in C++. C# and Java mean nothing to me. Now I have my own language to use on .NET. That was my personal agenda in all this. I think you should check it out. It was a great adventure. I find C++/CLI is something interesting again about C++, for me anyway. I'm not a fan of standards. I've never finally been involved in one. It's making a language that interests me, not clean-up and all that necessary stuff for real-world production. I don't have the head for that.
So, I just wanted to make a personal statement. I have watched the team here for almost 3 years. I worked with them sometimes. I snarled at them sometimes. I was fed up with them once or twice. And yet they have produced something really wonderful in my opinion, and it is much better than I had dared imagine. I just love it. I can't wait to program with it – I'd like to drop everything and just start. This is exactly what I have been waiting for: to have a language to program .NET. We didn't have one before. C++/CLI is an entry visa into what I call 3-dimensional programming. The CLR gives you the run-time as an entity to query, to modify, to almost in a sense be alive in. Everything in a native program has to be done at compile-time. That is way too early for me to know exactly what needs to be done in many interesting environments. Can't I wait? I'm not so performance critical that I need the machine. My first analogy was going to be between the traditional Disney animation and the Pixar CGI, but I think a better analogy is between an instinct and a decision. That is, an instinct is the trigger of a built-in behavior that is insulated from and unaffected by the environment. A decision considers the environment (context) and memory (past behavior) in order to choose a best action. This is what is missing from native static programming. It really can't be smart enough for the kinds of problems we need to solve now, in my estimate. I can't prove that. I gave a talk at Disney Feature Animation a few months ago about C++/CLI, and a friend of mine, whom I love dearly, asked me, what do I need that for? He is in charge of plug-in 3D tools for Maya, a modeling toolkit for computer graphics. And he is right. But we need it, I believe, as a community going forward. And in that sense, we have opened that doorway for the C++ programmer. The worst case scenario was that the price of admission to this world was to have to renounce C++ -- I mean, mea culpa.
We're such savages – we worry about performance, we break the type system, we make use of the underlying machine, we don't follow enough rules. We're loose cannons. Can't have loose cannons. I personally didn't want to accept that as being a true statement. But you cannot argue something like that. I can't say, hypothetically, C++ can be as good a language under .NET as Java or C#. Nobody would hear that – they would just see Lippman being defensive, or delusional. You have to demonstrate it. That's the only way to really convince people. That's the only way I can ever convince myself, in fact. So, that is what C++/CLI means to me. I know Bjarne has his reservations and dislikes. Well, so do we all, if truth be told. But it's still just so wonderful overall. Who cares if it's not perfect? I don't – I haven't gotten anything perfect, trust me. I don't know if it is better than this or that or the next thing. But it's really pretty good and I'm going to use it. And we really did it as a team.
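Returning to the throw "fritz" question that opened this entry, the native half of it can be sketched in plain ISO C++. This is only an illustration under stated assumptions: String^ does not exist outside C++/CLI, so std::string stands in here as the "other" string type, and which_handler() and its return labels are hypothetical names that do not come from the entry.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// The throw expression from the entry: the literal's array type decays,
// so the exception object has type const char*.
void g() { throw "fritz"; }

// Hypothetical helper: reports which handler caught the exception.
std::string which_handler() {
    try {
        g();
    } catch (const std::string&) {
        // Never selected: handler matching applies no user-defined conversions.
        return "std::string";
    } catch (const char* pcs) {
        return std::strcmp(pcs, "fritz") == 0 ? "const char*" : "unexpected";
    }
    return "nothing thrown";
}
```

The handler is picked at run time from the type recorded at the throw site, which is why the compiler has to commit to one type for the literal before it knows who will catch it.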
|
-
In the course of these entries, I have twice addressed the issue of the type of a string literal under C++/CLI -- in particular when resolving an overloaded function call. The issue is illustrated in the following example,

    public ref class R {
    public:
        void foo( System::String^ ); // (1)
        void foo( std::string );     // (2)
        void foo( const char* );     // (3)
    };

    void bar( R^ r )
    {
        // which one?
        r->foo( "Pooh" );
    }

In the original Managed Extensions for C++, the invocation of foo() within bar() resolved to (3), exactly the same as it does under ISO-C++. That is,

    void bar( R^ r )
    {
        // under Managed Extensions for C++, resolved to
        // void foo( const char* );
        r->foo( "Pooh" );
    }

To briefly review: In ISO-C++, the type of "Pooh" is const char[5]. There is no exact match of "Pooh" to any of the three instances of foo(). However, the trivial conversion of const char[5] to const char* represents a best match, and this is why (3) is invoked. There was no built-in notion of a string literal having any relationship to System::String. And this was changed in the design of C++/CLI. Actually, it was changed twice, and that is the talking point of this entry – to explain why the initial change had to be further refined. The overall effect of the change is to extend dual citizenship to a string literal compiled for the CLI.
The initial change is described in my earlier entry entitled String Literals are now a Trivial Conversion to String. Here is a brief review of the issue. The first question is, what is the exact type of "Pooh" within C++/CLI? One answer is, well, obviously, it is of type const char[5] – otherwise, it could not be compatible with ISO-C++. We can't change that. The initial solution, therefore, was to introduce a new trivial conversion, that of a string literal to a System::String, that is of equal precedence with the trivial conversion of a string literal to const char*. This provides a somewhat elegant symmetry, but in practice results in a flurry of ambiguous calls.
For example, under this design, the invocation of foo() now fails,

    void bar( R^ r )
    {
        // under interim C++/CLI, flagged as ambiguous
        // the following two candidate functions are equally good …
        //     void foo( System::String^ );
        //     void foo( const char* );
        r->foo( "Pooh" );
    }

To disambiguate the call, the user would have to provide an explicit cast,

    void bar( R^ r )
    {
        // ok: void foo( System::String^ );
        r->foo( safe_cast<String^>( "Pooh" ));
    }

In practice, in nearly every case, the C++/CLI programmer wished to have the String instance invoked in preference to the C-style string instance. And so giving equal precedence to both conversions was both a step forward (in recognizing the special relationship of a string literal to System::String under the CLI) and two steps back: the presence of the const char* instance in effect neutralized that relationship (first step back) and required an explicit cast to resolve (second step back). So, we had to fix that. That is, under the CLI, we want a string literal to be more closely a kind of System::String than a const char*. The question was, how could that be achieved without breaking ISO-C++ compatibility? How might you resolve that?
The insight is to realize that the dual citizenship of a string literal applies to its fundamental type, not to its set of trivial conversions. In effect, under C++/CLI, the underlying type of a string literal such as "Pooh" is both const char[5] (its native inheritance) and System::String (its managed underlying unified type). Under C++/CLI, the string literal is an exact match to System::String and the trivial conversion to const char* is not considered. That is, under the revised C++/CLI language specification, the ambiguity has been resolved in favor of System::String,

    void bar( R^ r )
    {
        // ok: under current C++/CLI,
        // void foo( System::String^ );
        r->foo( "Pooh" );
    }

This reflects a fundamental difference between ISO-C++ and C++/CLI in their type systems.
In ISO-C++, types are independent except when explicitly part of the same class inheritance hierarchy. Thus, there is no implicit type relationship between a string literal and the std::string class type, even though they share a common abstraction domain. C++/CLI, on the other hand, supports a unified type system. Every type, including literal values, is implicitly a kind of Object. This is why we can call methods through a literal value or an object of the built-in types. The value 5 is of type Int32. The string literal is of type String. It just doesn't work to treat a string literal as either more like or equal to a C-style string. The integrated conversion hierarchy allows a working ISO-C++ program to continue to exhibit the same behavior when compiled for the CLI, while a new C++/CLI program exercising the CLI types reflects the new type priority of the string literal.
Most readers and programmers have little patience with this level of detail, and often point to discussions of C++ type conversions as evidence of its complexity. However, I don't think that is quite fair. The existence of these rules is necessary if one is to reach an intuitive language behavior and guarantee a uniform behavior across implementations. It is because the language generally behaves in a type-intuitive way that programmers can in fact ignore these details.
While the length of this discussion might seem disproportionate to the topic's importance, it strikes me as a canonical example of the extent to which we have had to work to integrate the CLI type system into the ISO-C++ semantic framework. This should also suggest certain good practices when a native class is being recast to a CLI class type. It is better, for example, to refashion the set of member functions accepting string literals rather than simply stirring in an additional String instance to our stew of overloaded functions, seasoning it to local taste.
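The ISO-C++ baseline that all of this builds on can be checked with a standard compiler. What follows is a minimal sketch, not the C++/CLI behavior itself: NativeR is a hypothetical native stand-in for the class R above, and overload (1) is left out because System::String^ has no ISO-C++ counterpart. The int return values are an illustrative device for reporting which overload ran.

```cpp
#include <cassert>
#include <string>

// Hypothetical native stand-in for R, carrying only overloads (2) and (3).
struct NativeR {
    int foo(std::string) { return 2; }   // (2) needs a user-defined conversion
    int foo(const char*) { return 3; }   // (3) needs only array-to-pointer
};

int bar(NativeR& r) {
    return r.foo("Pooh");   // the argument has type const char[5]
}

// Four letters plus the terminating null character.
static_assert(sizeof("Pooh") == 5, "string literal includes its terminator");
```

Under ISO-C++ rules the call selects (3), because constructing a std::string from the literal ranks below the array-to-pointer conversion. That is the behavior a native program keeps when it is recompiled for the CLI.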
|
-
A value type is typically not subject to garbage collection except in two cases: (a) when it is the subject of a box operation, either explicit or implicit, such as,

    void f( int ix )
    {
        // explicit placement on the CLI heap
        int^ result = gcnew int( -1 );

        // exercise result …

        // implicit boxing of the integer value ix …
        Console::WriteLine( "{0} :: {1}", ix, result );
    }

and (b) when it is contained within a reference type either as a member or as an element of a CLI array. For example,

    public enum class Color { white, red, orange, yellow, green, blue, indigo, violet };

    public value class Point {
        Color m_color;
        float m_x, m_y;
        // …
    };

    public ref class Rectangle {
        Point bottom_left;
        Point top_right;
        // …
    };

    void f( Point p )
    {
        Rectangle ^hit = gcnew Rectangle( -2, -2, 2, 2 );
        array<int> ^fib = gcnew array<int>(8){ 1,1,2,3,5,8,13,21 };
        // …
    }

In f(), we allocate two whole objects on the CLI heap, a Rectangle and an array of eight integer elements. The Rectangle object contains two interior value class Point members. These are located on the CLI heap at fixed offsets into the area allocated to the containing Rectangle. Within each Point are two floating point members and a Color member. These are also located on the CLI heap at fixed offsets into the two interior Point objects.
If the Rectangle object is relocated during a sweep of the garbage collector, all its interior members, of course, are relocated as well. The same is true with a CLI array. When the array is relocated, the addresses of each of its elements change as well. We cannot safely assign the address of any interior member or array element to a non-tracking pointer or reference. This isn't just a pitfall that we must learn to recognize and sidestep. Rather, the language disallows it. Our attempt results in a compile-time error.
For example,

    // error: cannot assign the address of an interior member
    // to a non-tracking pointer …
    Point *p_bl = &hit->bottom_left;

We need a form of tracking entity to hold the address of an interior member. What are some of the requirements on this entity? Well, one common pattern we felt necessary to support is that of an iterator – in particular when applied to the elements of a CLI array. For example,

    void f( array<int> ^fib )
    {
        SomeTrackingPointerNotation begin = &fib[0];
        SomeTrackingPointerNotation end = &fib[ fib->Length ];

        for ( ; begin != end; ++begin )
            // …
    }

Our SomeTrackingPointerNotation needs to support pointer arithmetic. That is, when we write ++begin, this does not increment the address by 1, but rather by the size of the element type. For example, since the array elements are of type int, each increment must add sizeof(int) to the current address value. Neither the tracking handle (^) nor its indirect cousin, the tracking reference (%), supports pointer arithmetic. Moreover, a tracking reference does not support pointer comparison, such as

    begin != end

Like a native reference, a tracking reference, once initialized, serves as an alias to the underlying object to which it refers. The comparison is not of the two tracking addresses but of the values stored at those addresses. This is not the iterator semantics we need.
In a native design, we decide whether to go with a pointer or a reference declaration based on two primary factors:
(1) If the object we wish to refer to is unavailable at the time of declaration, then we must declare a pointer and set it to null. A reference requires an initial object. So does a tracking reference.
(2) If we wish to refer to more than a single object during the lifetime of the declaration, then we must also declare a pointer. A reference cannot be reset to refer to a second or subsequent object. Neither can a tracking reference.
What this suggests is that we need an analogous choice when the constraints of a tracking reference make it ill-suited to our design. A third, more flexible form of tracking entity, the interior pointer (interior_ptr<>), is given over to this role. It can refer to no object (but only by setting it to nullptr, not 0). And it can be reset to refer to a second or subsequent object. Moreover, it supports both pointer arithmetic and pointer comparison. For example,

    int sum( array<int> ^arr )
    {
        if ( ! arr ) return 0;

        interior_ptr<int> begin = nullptr;
        interior_ptr<int> end = &arr[ arr->Length ];

        int sum = arr[0];
        for ( begin = &arr[1]; begin != end; ++begin )
            sum += *begin;
        return sum;
    }

The declaration of an interior pointer is limited to local objects, including function parameters and return types. If we don't provide an initial value, the compiler automatically inserts code to set it to nullptr – so the explicit initialization in the above example is not strictly necessary. The type specified within the template brackets identifies the kind of object addressed; we do not indicate a pointer within the brackets unless we intend two or more levels of indirection. For example,

    public ref class Matrix sealed {
        float *m_mat;
    public:
        property interior_ptr<float*> Mat {
            interior_ptr<float*> get(){ return &m_mat; }
        }
        // …
    };
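For readers without a C++/CLI compiler at hand, the iterator pattern in sum() above can be mirrored with ordinary native pointers. This is only an analog under stated assumptions: sum_native and its explicit length parameter are hypothetical, and a native pointer is not tracked by the garbage collector, which is precisely the property interior_ptr<> adds.

```cpp
#include <cassert>
#include <cstddef>

// Native analog of sum(): plain pointers stand in for interior_ptr<int>.
// The length is passed explicitly because a native array carries no Length.
int sum_native(const int* arr, std::size_t length) {
    if (arr == nullptr || length == 0)
        return 0;

    const int* begin = nullptr;       // may refer to no object
    const int* end = arr + length;    // one past the last element

    int total = arr[0];
    // begin is re-seated here; each ++begin advances by sizeof(int) bytes,
    // and begin != end compares addresses, not the values stored there.
    for (begin = arr + 1; begin != end; ++begin)
        total += *begin;
    return total;
}
```

The three properties asked of the tracking entity (a null state, re-seating, and address arithmetic with comparison) each appear once in the function body.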
|
-
C and C++ programmers are notorious for relying on pointer indirection, and it seems blog entries are not immune to this. A translation guide attempting to exhaustively detail the differences between the original Managed Extensions for C++ (released with Visual Studio.NET) and the revised C++ binding to the CLI scheduled for Visual Studio 2005 (and attempting to provide some motivation behind each change) has been posted on MSDN at the following URL: http://msdn.microsoft.com/visualc/default.aspx?pull=/library/en-us/dnvs05/html/TransGuide.asp
Although considerable effort has been made to correct all errors within the text, I am sadly aware of how imperfect these pieces of mine nevertheless turn out. So, on the one hand, I believe this guide will prove valuable to those needing this information. On the other hand, I also believe there are (a) areas that could have been made clearer, and (b) details that … well, that are in error. If you do use this guide and find either (a) or (b) (or some (c), (d), or (e) not itemized), please let me know, either by a comment to this entry or by a private email.
Speaking of which – comments, that is. Due to the open nature of the internet, the providers of this site have found it necessary to provide a form of budget firewall – that is, they have put in place a facility to moderate the comments. This basically means that I see each comment before it becomes public, and it requires a click of my mouse in order for it to be published. I have not – not as yet, anyway – declined to publish a comment; however, oddly enough, I can't prove that.
In any case, those of you who have been reading this from the beginning will notice that patchwork pieces of the translation guide have been posted here in one form or another. Often, comments have helped me recognize shortcomings and recast material in a hopefully more comprehensible manner.
For example, my treatment of deterministic finalization in the translation guide was considerably reworked to address some of the comments posted to the initial blog entry on that topic. As W. H. Auden said in a different context, a piece of writing is never finished, it is simply abandoned.
|
-
Part of the (reasonably pleasant) distractions from posting on this blog recently has been working up the first in a series of articles on STL.NET for our Visual C++ MSDN web site. The amount of work to get from an articulation of a topic to a formal publication of it is an amazingly labor-intensive 10% -- similar to the difference between prototyping a software solution and making it deployment ready. In any case, this relatively content-free entry is just to alert you of its going on-line at http://msdn.microsoft.com/visualc/?pull=/library/en-us/dnvs05/html/stl-netprimer.asp?frame=true
If for some reason this doesn't show up in the post as a clickable link, you can just visit the Visual C++ subportion of the MSDN site at http://msdn.microsoft.com/visualc and hopefully find a link to it there. In any case, the Visual C++ site, under the care and breeding of Brian Johnson and Ami Vora, has really been spiffed up with some very neat content and is worth a lookie-loo.
For the article, David Clark, who did a wonderful job editing the piece, asked me to come up with a summary limited to 200. I misread that as 200 words, and wrote the following. I then discovered to my chagrin that it referred to 200 characters, including white space. I thought, well, white space is without content, so if I remove that, I get perhaps another 75 characters to play with, but that doesn't actually work … In any case, here is the summary that had to be sliced mercilessly:
For the experienced programmer, the hardest part of moving to a new development platform such as .NET is often the absence of familiar tools through which she has honed her skills and on which she depends. For the experienced C++ programmer, one such essential toolkit is the Standard Template Library (STL), and its absence under .NET until now has been a significant disappointment. With Visual C++ 2005, we fix that by providing an STL.NET library.
This article, the first in a series, provides a general overview of the STL program model using STL.NET – it discusses sequential and associative containers, the generic algorithms, and the iterator abstraction that binds the two, using plenty of program examples to illustrate each point. It begins by briefly considering the alternative container models available to the .NET programmer using C++ -- the existing System::Collections library, the new System::Collections::Generic library, and, of course, STL.NET. To provide for the widest readership, this article does not require familiarity with the STL library; however, it does presume some experience with the C++ programming language.
This summary, when reduced to 200 characters, plus the white space I threw back in, ended as follows:
With Visual C++ 2005, the Standard Template Library (STL) has been re-engineered to work under the .NET Framework. This article, the first in a series, provides a general tour of STL.NET.
Talk about a poor relative. In any case, this article is culled from a text I am writing on the C++ binding to the CLI, so if you have any concerns or comments or a content wish-list, please drop me a line.
|
-
Recently, someone asked me why we support both class and typename within C++ to indicate a type parameter since the keywords do not hold any platform significance – for example, class is not meant to suggest a native type nor is typename meant to suggest a CLI type. Rather, both equivalently indicate that the name following represents a parameterized type placeholder that will be replaced by a user-specified actual type.
The reason for the two keywords is historical. In the original template specification, Stroustrup reused the existing class keyword to specify a type parameter rather than introduce a new keyword that might of course break existing programs. It wasn't that a new keyword wasn't considered -- just that it wasn't considered necessary given its potential disruption. And up until the ISO-C++ standard, this was the only way to declare a type parameter.
Reuse of existing keywords seems to always sow confusion. What we found is that beginners wondered whether the use of class constrained or limited the type arguments a user could specify to be class types rather than, say, a built-in or pointer type. So, there was some feeling that not having introduced a new keyword was a mistake.
During standardization, certain constructs were discovered within a template definition that resolved to expressions although they were meant to indicate declarations. For example,

    template <class T>
    class Demonstration {
    public:
        void method() {
            T::A *aObj; // oops …
            // …
        }
    };

While the statement containing aObj is intended by the programmer to be interpreted as the declaration of a pointer to a nested type A within the type parameter T, the language grammar interprets it as an arithmetic expression multiplying the static member A of type T with aObj and throwing away the result. Isn't that annoying!
(This sort of dilemma is not possible within generics – there is no way to safely verify that any T contains an A so that the runtime can safely construct an instance of the generic type.) The committee decided that a new keyword was just the ticket to get the compiler off its unfortunate obsession with expressions. The new keyword was the self-describing typename. When applied to a statement, such as,

    typename T::A* aObj; // declare pointer to T's A

it instructs the compiler to treat the subsequent statement as a declaration. Since the keyword was on the payroll, heck, why not fix the confusion caused by the original decision to reuse the class keyword? Of course, given the extensive body of existing code and books and articles and talks and postings using the class keyword, they chose to retain support for that use of the keyword as well. So that's why you have both.
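Both roles of typename can be tried out in a few lines of ISO C++. This is a minimal sketch: HasNestedA and read_nested are hypothetical names introduced for illustration, and method() returns an int here (unlike the void version above) only so that the result can be observed.

```cpp
#include <cassert>

// Hypothetical type argument that supplies the nested type A.
struct HasNestedA {
    struct A { int value = 42; };
};

template <class T>                      // class introduces the type parameter
class Demonstration {
public:
    int method() {
        typename T::A obj;              // typename marks T::A as a type
        typename T::A* aObj = &obj;     // so this parses as a declaration
        return aObj->value;
    }
};

template <typename T>                   // typename works equally well here
int read_nested() {
    return Demonstration<T>().method();
}
```

Removing typename from the two declarations inside method() brings back the problem described above: the compiler assumes T::A names a value rather than a type.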
|
-
I've been recently puzzling out a strategy for presenting the two mechanisms supporting parameterized types available to the C++/CLI programmer: she can use either the template mechanism adapted for use with CLI types, or the CLI generic mechanism. This is not unique to the support of parameterized types, of course, but it seems a lightning rod for persnickety questions:
(1) isn't the support for two mechanisms that are similar in intent but which differ in both gross and subtle semantic behavior confusing for a user?
(2) doesn't the dual nature of these constructs increase the likelihood of programmer error?
(3) isn't this a canonical illustration of the undisciplined nature of the C++ language design where everything but the kitchen sink seems to get thrown in? You guys just can't get your act together, can you?
Let's see what kind of answers I can offer. [Disclaimer: of course, these are my thoughts, and do not represent either the corporate views or policies of Microsoft.]
The C++ binding to the CLI which I refer to as C++/CLI represents an integration of two separate object models: the static object model of native C++ and the dynamic program model of the CLI. We've seen conflicts between these two models before – in particular between the native and CLI enum, the native and CLI array, and the native and CLI reference class.
Under the CLI object model, individual languages are – not to put too fine a point on it – somewhat diminished – much as how in a modern country, the individual states, while sovereign, are constrained by the laws of the central authority. For example, the CLI defines the underlying type system within which a language operates, as well as the inheritance model. As we saw in an earlier blog, the CLI does not, for example, support private inheritance, value inheritance (that is, the inheritance of implementation but not of type), or multiple inheritance (MI). While a language can choose to support these aspects of inheritance, that support requires a mapping onto the existing CLI object model because there is no direct support.
The Eiffel language under CLI, for example, chose to provide an MI mapping because it felt that (a) MI is a valuable inheritance model, and (b) its users would be diminished without its support under the CLI – that is, it would give its users a more diminished programming experience under the CLI than on native platforms, and this would likely deter them from migrating to the CLI itself – or at least from migrating to the CLI while continuing to exercise their Eiffel expertise and culture.
We did not feel the same imperative as Eiffel with regard to multiple inheritance. But we did feel that imperative towards deterministic finalization of reference types declared within a local block, and so we provided a mapping of a class destructor to an IDisposable::Dispose() method, which is the CLI pattern of reclaiming resources prior to garbage collection finalization. Similarly, we did feel the imperative towards automatic memberwise copy and initialization – as supported by a copy assignment operator and copy constructor – and so we provided a mapping. (But these mappings are constrained by the underlying CLI implementation. We could not map memberwise copy into a value class because we could not guarantee that it could be carried out in all circumstances – at least that is my understanding. I haven't myself verified that but taken it on faith.)
Again, we did this because without these mappings of essential aspects of the native C++ programming experience, we believe the C++ user would have a more diminished experience of programming under the CLI than on the native platform, and that this would deter them from migrating to the CLI – or at least from migrating to the CLI while continuing to exercise their C++ expertise and culture. The template mechanism is another of the essential aspects of modern C++ programming. We believe its absence would represent a significant hole in quality of programmer life when using C++/CLI. Personally, that is my deep belief.
So, with regard to parameterized types, it felt imperative that we provide some mechanism beyond what was offered by the System::Collections namespace. The first mechanism that naturally comes to mind to the C++ programmer, of course, is templates. But what about generics? Why couldn't C++/CLI use generics for containing the CLI types, and leave templates for the non-CLI types? Why map the template mechanism into C++/CLI to support the CLI reference class, value class, interface class, delegate, and function?
The honest answer is because we were left with no choice. One of the generic talking points in presentations, specifications, and hallway whispering is that generics, while borrowing from C++ templates, "does not suffer many of the complexities" of C++ templates.
What are considered superfluous complexities of C++ templates, and were therefore eliminated from the generic mechanism – partial template specialization, the ability to inherit from the type parameter, support for non-type and template type parameters, the ability to specialize either an entire template or selected members of it, and so on – are considered by professional C++ programmers and designers of the language as essential modern programming design patterns that are fundamental to existing production code and widely-used libraries, such as the STL, Loki, and Boost.
The real problem is that although the C++ community and language designers and implementors have deep experience with parameterized types, that experience was not tapped while the design of generics was underway.
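For readers who have not used them, three of the template features named above look roughly like this in ISO C++. This is a small sketch with hypothetical names (FixedBuffer, IsPointer, Counted, Base) chosen for illustration; none of them comes from the STL, Loki, or Boost.

```cpp
#include <cassert>
#include <cstddef>

// Non-type template parameter: the capacity N is part of the type itself.
template <typename T, std::size_t N>
struct FixedBuffer {
    T data[N];
    static constexpr std::size_t capacity() { return N; }
};

// Partial specialization: the primary template handles every type,
// the specialization is selected for any pointer type.
template <typename T>
struct IsPointer { static constexpr bool value = false; };

template <typename T>
struct IsPointer<T*> { static constexpr bool value = true; };

// Inheriting from the type parameter: Counted<T> layers a counter over T.
struct Base { int id() const { return 7; } };

template <typename T>
struct Counted : T {
    int count = 0;
    void touch() { ++count; }
};
```

Each of these is resolved entirely at compile time, which is part of why they map awkwardly onto a mechanism instantiated by the runtime.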
So, we did not have any choice but to provide support of templates for CLI types and to provide an STL.NET implementation. This is great stuff if you care about C++ and want to see it succeed under .NET. Except for performance issues, C++/CLI, in my biased opinion, is shaping up as the premier C++ experience available. Personally, I'm so keen on the new language that I'm planning to reimplement my mscfront translator in C++/CLI rather than in native code. I'll be reporting my progress on that in quite some detail in a series of blog entries once I get the C++/CLI text I'm working on in shape.
So, that's why we have templates. Why did we also provide support for generics? Generics are deeply integrated into the CLI, and for that reason solve a number of problems left unsolved in C++ – in particular, the instantiation model. Because there is no concept of a runtime within native C++, there is no native concept of how a template is instantiated – that is, when the binding of an actual parameter to the formal parameter occurs and to what extent. The work of the ISO committee in this area has not been stellar.
Generics provide a constraint mechanism, something whose absence is keenly felt in the template mechanism. A generic type is recognized by the CIL – the intermediate language; a template class is not, and so template classes are not cross-language and, it turns out, not cross-assembly either. That is to say, every serious CLI language has to provide generic support. And that is what we do. Perhaps if we had been participants in the design, the outcome could have been different. But that, to repeat my aunt's favorite refrain, is water under the bridge.
So, from my perspective, that is why we support both the template and generic solutions for parameterized types, and have tried to integrate them into an elegant symmetry.
|
-
Sorry it is taking me so long these days. I am in the throes of more formal writing – a book on our CLI binding for C++, and a series of articles for our Visual C++ MSDN website on STL.NET. And my translation tool is happily going through a formal test cycle – thank you Mitchell and Arjun – and so I've been fixing bugs and being a developer once again. So the blog gets bogged down.
I thought I would follow up on the issue of overload function resolution. And so this entry discusses how a candidate function list is built up. (I realize this is somewhat esoteric, but. Well, I hope someone finds it worth a spin around the page.)
The first question, of course, is: what the heck is a candidate function list? Literally, a candidate function list is the set of functions sharing the same name that are visible at a call point. As we'll see, the complexity, when present, has to do with determining the actual set of visible functions.
It's always good to begin with a simple example – hopefully, if we can master this one, our confidence will surge, and we can march on to the question of namespace, qualified type names in the signature of a function, and the using declaration in general. Here is our example,
void f(); // (1)
void f( String^ ); // (2)
void f( const string& ); // (3)
void f( String^, array<String^>^ ); // (4)
int main( array<String^> ^args )
{
    if ( args == nullptr )
        handle_invalid_command_line();
    for each ( String^ s in args )
        f( s );
}
There are four candidate functions to the call of f() within main(). By inspection, we can see that only two of them are viable – that is, only the (2) and (3) instances can match the actual invocation. (And (2) is the best viable function.)
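The same candidate-versus-viable distinction can be sketched in standard C++. This is only an analogy, with assumptions labeled: const char* stands in for String^, and each overload returns its own label so the chosen instance can be observed.

```cpp
#include <cassert>
#include <string>

// Four candidates named f; the return value is the label of the overload.
int f()                                { return 1; }  // (1) wrong argument count
int f(const char*)                     { return 2; }  // (2) exact match for a literal
int f(const std::string&)              { return 3; }  // (3) viable via a user-defined conversion
int f(const char*, const char* const*) { return 4; }  // (4) wrong argument count

int call_with_literal() {
    // All four are candidates; only (2) and (3) are viable; (2) is best.
    return f("fritz");
}
```

Here (1) and (4) drop out on argument count alone, before any type checking of the argument takes place.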
So that was simple. Things start getting somewhat more complicated if the type of a function argument is declared within a namespace. Let's first look at a fully qualified name. In this case, the functions within the namespace that have the same name as the called function are added to the set of candidate functions. For example:
namespace CLITypes
{
    public ref class C {…};
    void takeC( C^ );
}
// …
void f( CLITypes::C^ cobj )
{
    // ok: calls CLITypes::takeC( C^ )
    takeC( cobj );
}
There is no takeC() function declared within the global scope in which f() is defined. There is, as well, no using declaration making the namespace function visible. So, at first glance, this appears to be an illegal invocation: the candidate set appears to be empty.
However, because the argument is qualified to occur within the CLITypes namespace, the functions declared within that namespace are considered as well. That is, the full set of candidate functions under this circumstance represents the union of the functions visible at the point of call and the functions declared within the namespaces of the argument types.
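Standard C++ applies the same rule under the name argument-dependent lookup. A minimal sketch, assuming a plain struct in place of the ref class and a reference in place of the handle:

```cpp
#include <cassert>

namespace CLITypes {
    struct C { int value; };                    // stand-in for the ref class
    int takeC(const C& c) { return c.value; }
}

int f(const CLITypes::C& cobj) {
    // No takeC is visible in the global scope and there is no using
    // declaration, yet the call compiles: the namespace of the argument
    // type contributes CLITypes::takeC to the candidate set.
    return takeC(cobj);
}
```

If the parameter were an int rather than a CLITypes::C, the same unqualified call would fail to compile, since a fundamental type has no associated namespace.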
There are three general cases in which functions with the same name do not overload – at least currently. (There is some activity within the ECMA committee that hasn't jelled as yet one way or another – at least as far as I'm aware.)
1. A derived class function that reuses the name of a base class virtual function overrides rather than overloads the base class instance. Except when the new slot modifier is applied, the derived class instance must conform to the signature of the base class function. It substitutes for the base class instance within the derived class virtual table.
2. A derived class function that reuses the name of a non-virtual base class function hides rather than overloads the base class instance. The signatures of the base and derived class instance are not considered. This is because the overload candidate set for a function does not extend across scope boundaries. (This is the guy currently under siege, I believe.)
3. A function declared within a local block hides rather than overloads all named instances of that function within the enclosing scopes for the extent of that block. This is the most esoteric of the three cases, so let me provide a quick example,
String^ Marshall( int ); // (1)
String^ g()
{
    // these puppies hide global instance …
    String^ Marshall( double ); // (2)
    String^ Marshall( char* ); // (3)
    return Marshall( 1024 ); // resolves to (2)
}
In this example, the global instance of Marshall() is not visible within g(); the candidate functions are limited to the two declarations within g() itself. The char* instance is not a viable candidate function for an actual argument of 1024. This leaves us with instance (2), which matches the formal parameter of type double through a standard conversion, although the global instance, if considered, would represent an exact match.
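The hiding behavior is the same in standard C++, so it can be tried without the CLR. In this sketch the functions return an int label instead of a String^ (an assumption made purely so the selected overload is observable), and h() is a hypothetical second caller added for contrast.

```cpp
#include <cassert>

int marshal(int) { return 1; }     // (1) the only overload visible globally so far

int g() {
    // Block-scope declarations hide (1) for the rest of this block.
    int marshal(double);           // (2)
    int marshal(char*);            // (3)
    return marshal(1024);          // picks (2) through an int -> double conversion
}

// Definitions of the functions declared inside g().
int marshal(double) { return 2; }
int marshal(char*)  { return 3; }

int h() {
    // All three overloads are visible here, so (1) wins as an exact match.
    return marshal(1024);
}
```

The two callers pass the identical argument and still reach different functions, which is the whole hazard of a local function declaration.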
The candidate functions also depend on the visibility of using declarations at the call point. This is because a using declaration makes the named namespace functions visible within the scope in which it appears. For example,
namespace libs_R_us {
    int max( int, int );
    double max( double, double );
}
char max( char, char );
void func()
{
    // namespace functions not visible
    // the three calls resolve to global max( char, char )
    max( 4096, 8192 );
    max( 35.1, 35.9 );
    max( 'J', 'L' );
}
In this case, the only function visible is the function declared in global scope. It is therefore the only candidate function, and is the instance invoked by all three calls within func(). This results in a loss of precision in both arithmetic invocations. We have two choices for correcting this, both of which make use of a using declaration to make the namespace functions visible. The question is where we should place it.
One possibility is to place the using declaration in global scope. For example,
char max( char, char );
using libs_R_us::max; // using declaration
All three instances of max() are now visible within the global scope and are placed in the set of candidate functions. The three invocations are now each an exact match to a separate instance, as follows,
void func()
{
    max( 4096, 8192 ); // libs_R_us::max( int, int )
    max( 35.1, 35.9 ); // libs_R_us::max( double, double )
    max( 'J', 'L' ); // ::max( char, char )
}
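The effect of the global using declaration can be checked at compile time in standard C++ by inspecting the return type of each call, since the three overloads return int, double, and char respectively. This is a sketch under that assumption; bigger() is a hypothetical helper, and the function bodies are filled in only so the example links.

```cpp
#include <cassert>
#include <type_traits>

namespace libs_R_us {
    int    max(int a, int b)       { return a > b ? a : b; }
    double max(double a, double b) { return a > b ? a : b; }
}
char max(char a, char b) { return a > b ? a : b; }

// Before the using declaration only ::max(char, char) is a candidate,
// so even a call with two doubles resolves to it.
static_assert(std::is_same<decltype(max(35.1, 35.9)), char>::value,
              "only the global instance is visible here");

using libs_R_us::max;   // using declaration at global scope

// Now all three are candidates and each call finds an exact match.
static_assert(std::is_same<decltype(max(4096, 8192)), int>::value, "int instance");
static_assert(std::is_same<decltype(max(35.1, 35.9)), double>::value, "double instance");
static_assert(std::is_same<decltype(max('J', 'L')), char>::value, "char instance");

double bigger() { return max(35.1, 35.9); }
```

Note that the same expression, max(35.1, 35.9), changes meaning depending on whether it appears above or below the using declaration.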
Alternatively, we might choose to place the using declaration within the local scope of func(). Why would we do that? Primarily to limit the extent of the changes in our program due to the larger set of candidate functions. By adding to the candidate function set at global scope within an existing program, we are potentially changing the function invoked at each call point that does not involve an exact match. This may be a more invasive change than what we are ready to support. The alternative declaration looks as follows,
void func()
{
    // local using declaration
    using libs_R_us::max;
    // same function calls as above
}
Surprisingly, we get a different set of candidate functions now. This is because using declarations nest. With the using declaration in local scope, the global function is now hidden. The only visible functions at the call points are the two declared within the namespace, and so our character comparison resolves to the namespace instance max(int,int) through a promotion of the two character arguments.
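A standard C++ sketch of the nested case, again using the return type to reveal which instance is selected (the function bodies are assumptions added so the example links):

```cpp
#include <cassert>
#include <type_traits>

namespace libs_R_us {
    int    max(int a, int b)       { return a > b ? a : b; }
    double max(double a, double b) { return a > b ? a : b; }
}
char max(char a, char b) { return a > b ? a : b; }

int func() {
    // The local using declaration hides ::max(char, char) inside func().
    using libs_R_us::max;

    // Two char arguments are promoted to int; the char instance is not
    // even a candidate any more.
    static_assert(std::is_same<decltype(max('J', 'L')), int>::value,
                  "resolves to libs_R_us::max(int, int)");
    return max('J', 'L');
}
```

The result is still the larger character value, but it comes back as an int, which matters as soon as the result is printed or passed to another overloaded function.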
There are two possible solutions to getting the three functions into the candidate set. We originally chose a nested using declaration in order to localize the inclusion of the functions to just the call points within func(). One solution, of course, is to move it back to global scope. But this opens the entire assembly to potential change. Alternatively, we can add the global instance to our nested set of declarations,
void func()
{
    // now we have all three in the candidate set
    using libs_R_us::max;
    extern char max( char, char );
    // same function calls as above
}
The set of candidate functions is therefore the union of the functions visible at the point of the call (including the functions introduced by using declarations and using directives) and the functions declared in the namespaces associated with the types of the arguments. For example,
namespace basicLib {
    void print( String^ );
    void print( Object^ );
}
namespace matrixLib {
    public ref class Matrix { /* ... */ };
    void print( Matrix^ );
}
void display()
{
    using basicLib::print;
    matrixLib::Matrix ^mObj;
    print( mObj ); // matrixLib::print( Matrix^ )
    print( "literal" ); // basicLib::print( String^ )
    print( 1024 ); // basicLib::print( Object^ )
}
Which functions are the candidate functions for the call print(mObj)? The two basicLib functions introduced by the local using declaration are candidate functions because they are visible at the point of the call. Because the function call argument is of type matrixLib::Matrix^, the print() function declared within the namespace matrixLib is also a candidate function.
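The union of the two sources of candidates can be sketched in standard C++ as well. The parameter types here are stand-ins and therefore assumptions: std::string replaces String^, long replaces the catch-all Object^ (there is no boxing in native code), and each overload returns a label.

```cpp
#include <cassert>
#include <string>

namespace basicLib {
    int print(const std::string&) { return 1; }
    int print(long)               { return 2; }
}
namespace matrixLib {
    struct Matrix {};
    int print(const Matrix&)      { return 3; }
}

void display() {
    using basicLib::print;          // makes the two basicLib functions visible
    matrixLib::Matrix mObj;

    // matrixLib::print joins the candidate set because of the argument type.
    assert(print(mObj) == 3);
    // For these two calls only the basicLib functions are candidates.
    assert(print("literal") == 1);
    assert(print(1024) == 2);
}
```

A block-scope using declaration, unlike a block-scope function declaration, does not switch off the lookup in the argument's namespace, which is why all three calls resolve.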
Once the candidate functions are identified, the next step – which begins to involve type checking – is to determine the viable functions within the candidate set. That topic is a candidate for a subsequent blog.
|
-
A reader questions the nature of the value type when he writes,
Sender: Slawomir Lisznianski
=====================================
1) Lack of support for SMFs makes value classes unnatural to use. An example in the C++/CLI spec at page 33 is incorrect, as it uses constructors with value classes. In fact, quite a few value class examples in the specification contradict with paragraph 21.4.1.
SMF, for the uninitiated, means special member functions, and in this case refers to the constraint on a value class that it cannot declare a default constructor, copy constructor, copy assignment operator, or destructor.
Mechanically, the reason these special member functions are not supported, I am told, is that there exist conditions at run-time in which it is not possible for the compiler to insert the appropriate invocations, and thus it is not possible to guarantee the semantics associated with these member functions. And so their support has been withdrawn completely. I suspect that the examples were written before the withdrawal of the default constructor, and the authors of the spec simply overlooked removing them.
There are a number of negative responses one can have to this: Disbelief, disgust, savage anger are a few that come to mind.
Another way of looking at this is to consider why and when these special member functions are not required. We need neither a copy constructor nor a copy assignment operator when the aggregate type supports bitwise copy. Similarly, we do not need a destructor when the state of the aggregate type exhibits value semantics. Finally, if the runtime zeros out all state by default, then we do not require a default constructor. (In C++, primitive data types are not automatically zeroed out, and so most of our default constructor use – but granted, not all – serves to put the object in a known, initialized state.)
That is, a value type in the philosophy of the CLI unified type system is a blittable entity with no internal plumbing, so to speak. That is all it naturally supports.
You put a pointer in it, you got troubles – there are no special member functions to provide deep copy semantics or to free the resource addressed prior to the end of its lifetime. Let's not even consider attempting to declare complex member types. That's not what you do with value classes.
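Native C++ has a close analogue of such a type, and the language can confirm the properties at compile time. The following is only a sketch in standard C++, not a CLI value class; Point, bitwise_copy, and zeroed are hypothetical names.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// An "integer-like" aggregate: no pointers, no resources, no plumbing.
struct Point {
    int x;
    int y;
};

// The compiler agrees that no user-written copy or cleanup logic is needed.
static_assert(std::is_trivially_copyable<Point>::value, "bitwise copy is safe");
static_assert(std::is_trivially_destructible<Point>::value, "nothing to release");

// Copying the raw bytes is a complete and correct copy of the value.
Point bitwise_copy(const Point& src) {
    Point dst;
    std::memcpy(&dst, &src, sizeof dst);
    return dst;
}

// Value-initialization zeros every member, much as the runtime zeros a value type.
Point zeroed() { return Point(); }
```

Add a raw owning pointer to Point and the static assertions still pass while the copy silently becomes a shallow one, which is exactly the trouble described above.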
I will claim that they are not unnatural. What is unnatural, though understandable if you have a C++ background, are the sophisticated uses to which you think to put these rather unsophisticated types. When you think value class, think integer. Then things will begin to click for you.
I will address your second question in a subsequent blog: So what's the rationale for trackable references ... ?
|
-
A reader asks,
Sender: Jack
re: String Literals are now a Trivial Convers | |
|