-
On comp.lang.c++.moderated, Peter
Lundblad wrote:
-
I also don't see the need for gcnew. Why not use placement new, i.e.:
T^ h = new (CLI::gc) T(14);
This would require an extension that allows placement new overloads to return
something other than void*, plus a standard way to get at the raw storage in which to construct
the object. That extension, however, could be useful for other things as well, e.g.
a new that returns a smart pointer, so you don't have to have a public constructor
on the smart pointer taking a raw pointer, which is dangerous.
There are several reasons why we didn't use a form of placement new.
One reason is that we wanted to leave a door open in case in the future we wanted
to allow placement and class-specific forms of gcnew. Having a parallel gcnew expression
and operator best serves leaving that door open.
Another reason is that existing libraries, including GC libraries, already use placement
forms of new, and so many of the possible placement names are taken.
In particular, new (gc) X; is already taken by the Boehm collector.
Yes, I know you suggested CLI::gc instead of plain gc,
but in practice I'm still concerned that enough people are liable to frequently write using
namespace CLI; (actually stdcli) to make this problematic.
Still another is one you cite: It's easier to teach that "the type of a new-expression
(and operator new) is a *" today, and that "the type of a gcnew-expression
is a ^".
Finally, a minor reason is that gcnew is slightly less typing than new
(gc) or new (cli), and moderately less typing than new
(stdcli::gc).
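The point about the type of a new-expression can be checked in plain ISO C++. The sketch below uses a hypothetical Arena and Widget (neither comes from the original discussion) to define a user-written placement form of operator new. The allocation function must return void*, and the new-expression that uses it still has type Widget*, which is why getting a ^ out of new (something) T would need a language extension.

```cpp
#include <cstddef>
#include <new>
#include <type_traits>
#include <utility>

// Hypothetical arena type, for illustration only.
struct Arena {
    alignas(std::max_align_t) unsigned char buffer[256];
    std::size_t used = 0;
};

// A user-defined placement form: `new (arena) T(args)` calls this overload.
// ISO C++ requires every allocation function to return void*.
void* operator new(std::size_t size, Arena& arena) {
    const std::size_t align = alignof(std::max_align_t);
    const std::size_t start = (arena.used + align - 1) / align * align;  // round up
    if (start + size > sizeof(arena.buffer)) throw std::bad_alloc();
    arena.used = start + size;
    return arena.buffer + start;
}

// Matching placement delete; the language calls it only if the constructor throws.
void operator delete(void*, Arena&) noexcept {}

// Hypothetical payload type.
struct Widget {
    int value;
    explicit Widget(int v) : value(v) {}
};

// Whatever the placement arguments are, the new-expression has type Widget*.
static_assert(
    std::is_same<decltype(new (std::declval<Arena&>()) Widget(14)), Widget*>::value,
    "a placement new-expression yields a raw pointer");

// Constructs a Widget inside the arena, reads it back, and destroys it explicitly.
int make_in_arena(Arena& arena, int v) {
    Widget* p = new (arena) Widget(v);
    const int result = p->value;
    p->~Widget();  // arena storage is not released with delete
    return result;
}
```

Calling make_in_arena(arena, 14) on a fresh Arena returns 14 and advances arena.used. Because any library can claim a placement signature like this one, names such as new (gc) X can already be taken.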
|
-
Today the C++/CLI candidate base document was posted, and it's freely
available for download.
This is the spec that Microsoft is contributing to the newly-formed ECMA
TC39/TG5 standards committee for consideration for the C++/CLI standards process.
It covers all the main proposed features, and it gives a pretty thorough look at the
scope and shape of what's being contemplated. There are still places that need to
be filled in, though, as well as some technical decisions that TG5 will need to make
(in addition to any existing decisions that they may choose to review or change).
Note that this is the last version of the document that will bear a Microsoft copyright,
so we've taken this opportunity to make it publicly available while we still own it.
If ECMA TC39/TG5 adopts this as their base document, it will henceforth be an ECMA
document maintained by that ECMA group. That means it will be up to TG5 to decide
what changes to make and when to make future drafts publicly available. (From my informal
conversations, I wouldn't be surprised if interim drafts were published every three
months or so, but that's just my personal best guess right now. We'll have to wait
and see when the whole group feels the spec is in good enough shape to distribute
its own first updated snapshot.)
Whew! It's been a long year, a long month, and a long week. Enjoy! And please let
us know what you think of this. Comments are welcome, and those of us on the team
who are blogging (see my Links) will be answering as many as we can get to while
we spend our days continuing to work on the Whidbey product.
I'll probably blog fairly lightly over the next two weeks. Next week is a short week
of course, with the U.S. Thanksgiving holidays closing most offices on Thursday and
Friday. The following week, on Dec 4-5, is the first ECMA TC39/TG5 meeting already,
down in College Station, Texas -- it sure has come up fast. I'll have more to
report after that.
|
-
Nicola Musatti asked the following excellent question:
-
The hat symbol and gcnew could be replaced with a template like syntax, e.g.
cli::handle<R> r = cli::gcnew<R>();
I agree that those are alternatives. Everyone, including me, first pushes hard for
a library-only (or at least library-like) solution when they first start out on this
problem. I think an argument can be made for it, and at one time I did so too.
To me, the killer argument in favor of a new declarator with usage R^ instead
of a library-like cli::handle<R> is its pervasiveness: It will
be by far the most widely used part of all these extensions, as it's the common use
case the vast majority of the time for CLI types (as objects, as parameters, etc.).
This extremely wide use amplifies two particular negative consequences we'd like to
avoid: First, the long spelling (here "handle") could in practice effectively become
a reserved word just because people are liable to widely apply using to
avoid being forced to write the qualification every time (this is worse if the name
chosen is a common name likely to be used for other identifiers or even macros, and
"handle" is a very common name). Second, and worse, the long spelling would also make
the language several times more verbose in a very common case than even the Managed
Extensions syntax was, and that in turn was already verbose compared to other CLI
languages.
Compare five alternatives side by side:
cli::handle<R> r = cli::gcnew<R>();  // 1: above suggestion
handle<R> r = gcnew<R>();            // 2: ditto, with "using"s
R __gc* r = new R;                   // 3: original MC++ syntax
R^ r = gcnew R;                      // 4: C++/CLI syntax
R r = new R();                       // 5: C#/Java syntax
I think you could make a case for any one of these, depending on your tradeoffs. But
I think a tradeoff that favors usability will favor the last few options.
There are also other issues where having ^ and % declarators/operators
that roughly correspond to * and & enables a
more elegant type calculus. I (or someone on the team) will have to write those up
someday, but consider at some future time when we have full mixed types too: When
we can have a type that inherits from both native and CLR base classes/interfaces,
we will want to be able to pass a pointer to such an object to existing ISO C++ APIs
that take a Base1* and a handle to the same object to existing CLI
APIs that take a Base2^. Both will be common operations and therefore
both should be distinctly expressible with a terse syntax:
class NativeBase { };

// a mixed type
ref class R
  : public NativeBase
  , public System::Windows::Forms::Form
{ };

void NativeFunc( NativeBase* );
void CLIFunc( Object^ );

R r;                // object on the stack
NativeFunc( &r );   // "give me a *" is spelled "&" as usual
CLIFunc( %r );      // "give me a ^" is spelled "%"
In this way, % is to ^ pretty much just as & is
to *. If R^ were instead spelled using a templatelike
syntax, what would be the corresponding code to get at it?
Finally, consider the agnostic template case:
template<typename T>
void f( T t ) {
  SomeBase* b = &t;        // I have to have a way of saying "I want a *" without knowing the type of T
  SomeInterface^ i = %t;   // I have to have a way of saying "I want a ^" without knowing the type of T
}
I'll write more about the full pointer system in the future. For other design considerations
about handles, I'll point again to Brandon's Behind
the Design: Handles blog entry, and to my own entry earlier this week on why
pointers aren't enough by themselves.
|
-
On comp.lang.c++.moderated, Andrew Browne
wrote:
-
The goals of the C++/CLI proposal are good ones, I think, but I wonder if it would
be possible to achieve them without (most of) the new keywords and semantics?
For example instead of:
ref class R {/*...*/};        // CLR reference type
value class V {/*...*/};      // CLR value type
interface class I {/*...*/};  // CLR interface type
generic <typename T>
ref class G {/*...*/};        // CLR generic
// etc etc
couldn't we have
class R : public System::Object {/*...*/}; // CLR reference type
class V : public System::ValueType {/*...*/}; // CLR value type
class I : public System::Object
{/* pure virtuals only here*/ }; // CLR interface type
template <typename T>
class G : public System::Object {/*...*/}; // CLR generic
// etc etc?
That's one of the alternatives I attempted, and I wasn't the first. I think almost
everyone starts here, and I held on for a while before I became convinced I had to
let go because it wasn't leading to the right places. Let me share some of the problems
and objections that crop up when you work your way down this path:
1. (Minor) Verbose
The above alternative is a lot of typing compared to any of the alternatives (Managed
C++ syntax, proposed C++/CLI syntax, and other CLI languages).
There's a pretty easy solution for this one, using keyword shortcuts:
class R : ref {/*...*/}; // CLR reference type
class V : value {/*...*/}; // CLR value type
class I : interface
{/* pure virtuals only here*/ }; // CLR interface type
An inconvenience with this is that there could already be a class named ref,
and so the syntax would have to be embroidered somehow to disambiguate
the two; this is unfortunate but surmountable. But, more importantly, this shorthand
still doesn't address the other drawbacks, below, of this general approach.
2. Forward declarations
Consider:
class X;
Is this a ref class, value class, interface class, or native class? There are a few
cases where this needs to be known from the forward declaration.
3. Indirect: The header hunt
Consider:
class X : public Y { };
Is this a ref class, value class, interface class, or native class? Under the alternative,
the only way to know would be to inspect Y and all base classes until you can determine
whether any of them directly or indirectly inherit from Object or ValueType (or not).
There are shortcuts (e.g., it's simpler for value types because they're always sealed
and so the inheritance has to be direct), but the hunt remains.
That may not seem like a huge issue, except that the types really are behaviorally
different in small but important ways; for example, in one case a virtual call in
a ctor or dtor will be deep, in the other it will be shallow. What metadata will eventually
be emitted, if any?
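For readers who haven't run into the "shallow" case, here is a minimal ISO C++ sketch using hypothetical Base and Derived types. In a native class, a virtual call made while the base-class constructor is running dispatches to the base's own version, not to the most-derived override. The sketch shows only the native behavior; the "deep" behavior of CLI reference types can't be shown in plain ISO C++.

```cpp
#include <string>

// Hypothetical types, for illustration only.
struct Base {
    std::string seen_in_ctor;           // what name() returned while Base was being built
    Base() { seen_in_ctor = name(); }   // virtual call during construction
    virtual ~Base() = default;
    virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
    std::string name() const override { return "Derived"; }
};

// Returns the name observed inside Base's constructor while building a Derived.
std::string name_seen_during_construction() {
    Derived d;
    return d.seen_in_ctor;
}

// Returns the name observed through a Base reference once construction is done.
std::string name_seen_after_construction() {
    Derived d;
    const Base& b = d;
    return b.name();
}
```

Here name_seen_during_construction() yields "Base" while name_seen_after_construction() yields "Derived". A reader of class X : public Y { }; would have to know which of the two dispatch rules applies before reasoning about any constructor that makes a virtual call.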
4. Closes doors
Speaking specifically to the last part of the example:
-
template <typename T>
class G : public System::Object {/*...*/}; // CLR generic
Unfortunately, this conflates the ideas of the type category (ref/value/native) with
the form of genericity (generic/template). It says that CLI types can only be genericized,
and native types can only be templated, leaving no way to express the other two useful
concepts:
- a templated CLI type (C++/CLI syntax: template<class T> ref class R {};)
- a generic native type (C++/CLI syntax: generic<class T> class N {};)
Templated CLI types in particular are very useful and are supported in C++/CLI, which
lets the template/generic choice and the class category choice vary independently.
5. Other closed doors: Distinguishing mixed types (Future)
In the future, C++/CLI is intended to eventually allow for full mixing and cross-inheritance
of arbitrary types. Using the alternative inheritance-based syntax alone does not
allow the programmer to distinguish between the following two distinct things that
the proposed C++/CLI design lets the programmer express as follows:
ref class Ref : public ANative { int x; };
class Native : public ARef { int x; };
This distinction can't be expressed using the proposed alternative above. Both types
have System::Object as a base class, but one is a reference class
that other CLI languages could use directly and where virtual calls during construction
are deep, and one is a native class that other CLI languages can only use via a handle
or reference to the ARef base class and where virtual calls during
construction are shallow.
|
-
Last week on comp.lang.c++.moderated, Nicola
Musatti wondered why C++/CLI would use keywords that don't follow the __keyword naming
convention for conforming extensions:
-
The standard already provides a way to avoid conflicts when introducing new keywords:
prepend a double underscore.
Right, and that's what Managed C++ used, for just that reason: to respect compatibility.
Unfortunately, there was a lot of resistance to it, and that approach is considered a failure.
For one thing, programmers have complained loudly that all the underscores are not
only ugly, but a real pain because they're much more common throughout the code than
other extensions such as __declspec have been. In particular, __gc gets
littered throughout the programmer's code.
At least as importantly, the __keywords littered throughout the code
can make the language feel second-class, particularly when people look at equivalent
C++ and C# or VB source code side-by-side. This comparative ugliness has been a contributing,
if not essential, factor in why some programmers have left C++ for other languages.
Consider:
//-------------------------------------------------------
// C# code
//
class R {
  private int len;
  public int Length {
    get { return len; }
    set { len = value; }
  }
}
R r = new R();
r.Length = 42;
//-------------------------------------------------------
// Managed C++ equivalent
//
__gc class R {
  int len;
public:
  __property int get_Length() { return len; }
  __property void set_Length( int i ) { len = i; }
};
R __gc * r = new R;
r->set_Length( 42 );
Oddly, numerous programmers find the former more attractive. Particularly after the
2,000th time they type __gc.
But now we can do better:
//-------------------------------------------------------
// C++/CLI equivalent
//
ref class R {
  int len;
public:
  property int Length {
    int get() { return len; }
    void set( int i ) { len = i; }
  }
};
R^ r = gcnew R;
r->Length = 42;
I should note there's actually also a shorter form for this common case, to have the
compiler automatically generate the property's getter, setter, and backing store.
While I'm at it, I'll also put the R instance on the stack which
is also a new feature of the revised syntax:
//-------------------------------------------------------
// C++/CLI alternatives
//
ref class R {
public:
  property int Length;
};
R r;
r.Length = 42;
C# is adding something similar as a property shorthand. But C# doesn't have stack-based
semantics for reference types and is unlikely to ever have them, though using is
a partial automation of the stack-based lifetime control that C++ programmers take
for granted. I'll have more to say about using another time.
|
-
A few days ago on news:comp.lang.c++.moderated,
Nicola Musatti wrote:
-
As for GC, pure implementations exist.
[that add no new extensions to ISO C++]
Not for a pure definition of "pure," they don't. :-)
To explain why C++ pointers are insufficient (unless their semantics were to be changed
at least a little, which would mean breaking existing code), consider two counterexamples:
1. Not for a compacting GC. Certainly a bald pointer can't point directly to an object
that moves around in memory, because C++ pointers are required to be stable, to always
have the same value while pointing to the same object. Changing the semantics of a
pointer to make it track will break lots of code, starting with set<T*>,
because such tracking pointers cannot be ordered (their values will after all be changed
arbitrarily at unpredictable times by the GC). There are also other restrictions,
but that's one of the most noticeable. [Aside: Such a tracking pointerlike abstraction
is needed, and is provided in C++/CLI. It just can't be spelled * without
fundamentally scuttling ISO C++ conformance, is all.]
2. Not for a non-compacting GC, either. This case can be got a lot closer, but even
Great Circle / Boehm style collectors impose restrictions that break some conforming
C++ programs. In particular, they restrict, if only slightly, the operations that
Standard C++ allows on pointers. Consider the following well-formed ISO C++ program
with well-defined semantics:
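int* pi = new int(42);               // line 1
pi = (int*)((int)pi ^ 0xaaaaaaaa);   // disguise the pointer (assumes int is as wide as a pointer, as on 32-bit targets)
// ... do other work ...
pi = (int*)((int)pi ^ 0xaaaaaaaa);   // restore the original value
cout << *pi;                         // perfectly ok, prints "42", won't crash
delete pi;                           // ok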
Add-on GCs can't see such disguised pointers, and are liable to reclaim the memory
allocated in line 1 before its later use, resulting in an attempt to access freed
memory. Boom.
This isn't perverse or theoretical, by the way. Consider "two-way pointers" as
one example of a well-known implementation technique where two pointers are XOR'd
together like this for a perfectly reasonable and legal use. In particular, a motivation
behind two-way pointers is that you can have a more space-efficient doubly linked
list if you store only one (not two) pointer's worth of storage in each node. But
how can the list still be traversable in both directions? The idea is that each node
stores, not a pointer to one other node, but a pointer to the previous node XOR'd
with a pointer to the next node. To traverse the list in either direction, at each
node you get a pointer to the next node by simply XORing the current node's two-way
pointer value with the address of the last node you visited, which yields the address
of the next node you want to visit. For more details, see:
"Running Circles Round You, Logically"
by Steve Dewhurst
C/C++ Users Journal (20, 6), June 2002
I don't think the article is available online, alas, but Steve's website has some source
code demonstrating the technique.
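The traversal rule described above can be sketched in a few lines of ISO C++. This is not Steve Dewhurst's code. The Node layout and the helper names addr, link_all, and traverse are hypothetical, and the nodes live in a std::vector purely to keep the example short. The sketch assumes std::uintptr_t is available, which the 2003-era standard did not yet guarantee.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical node layout: one link field instead of separate prev/next pointers.
struct Node {
    int value;
    std::uintptr_t link;   // address of previous node XOR address of next node
};

// Converts a node pointer to an integer so it can be XOR'd.
inline std::uintptr_t addr(const Node* p) {
    return reinterpret_cast<std::uintptr_t>(p);
}

// Links the elements of `nodes` into a two-way list, in index order.
// The vector must not be resized afterwards, since that would move the nodes.
inline void link_all(std::vector<Node>& nodes) {
    for (std::size_t i = 0; i < nodes.size(); ++i) {
        const Node* prev = (i == 0) ? nullptr : &nodes[i - 1];
        const Node* next = (i + 1 == nodes.size()) ? nullptr : &nodes[i + 1];
        nodes[i].link = addr(prev) ^ addr(next);
    }
}

// Walks the list from `start`, which must be one of its two ends,
// and returns the values in visiting order.
inline std::vector<int> traverse(const Node* start) {
    std::vector<int> out;
    const Node* last = nullptr;   // the node visited just before `cur`
    const Node* cur = start;
    while (cur != nullptr) {
        out.push_back(cur->value);
        // XORing out the node we came from leaves the node we are heading to.
        const Node* next = reinterpret_cast<const Node*>(cur->link ^ addr(last));
        last = cur;
        cur = next;
    }
    return out;
}
```

After link_all(nodes), calling traverse(&nodes.front()) visits the values front to back and traverse(&nodes.back()) visits them back to front, using the same link field in both directions. No node stores a plain pointer to either neighbor, which is exactly what hides the nodes from a collector that scans memory for pointer values.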
This perfectly standards-conforming and useful technique won't work correctly with
any GC implementation I know of that does not extend the language so that pointers
can retain their full standard meaning.
Steve's technique works perfectly fine and unbroken, however, under C++/CLI. It works
because C++/CLI preserves exactly the full semantics of * pointers
without any limitations. To do so, C++/CLI needed to add a new abstraction for GC
semantics instead of pretending that raw pointers are by themselves a complete solution
for safe use in a GC environment (they aren't, only because they were never designed
to be).
For more about the design motivations behind the ^ declarator (aka
a "handle"), see also Brandon Bray's excellent blog entry Behind
the Design: Handles posted earlier today.
|
-
A few days ago on news:comp.lang.c++.moderated,
"Chris" asked:
-
Here is a paranoid question: Is there a possible future step, where compiling
C++ on a Microsoft platform becomes impossible _without_ using the CLI binding?
No. Doing that would mean throwing away all the ISO conformance work that Visual C++
just spent nearly the whole last release cycle adding to the product. VC++ is now
98%-ish conformant to C++03 (the 1998 ISO C++ standard + its first technical
corrigendum) and VC++ will continue to work on the remaining 2%, plus track the
coming C++0x additions as they are created by the ISO and ANSI committees.
Of course, the CLI extensions will be needed where programs specifically take advantage
of CLI (i.e., .NET) data types and features, such as the types in the .NET Framework
libraries, garbage collection, and reflection. But programs that don't need
those can ignore the extensions and compile just fine to either native binaries or to
.NET IL. Note that last bit, because it seems to be not widely known: C++ code
can still be compiled to IL and run in the .NET virtual machine (Common Language Runtime,
or CLR) without using any extensions; the extensions are needed only for additionally
using CLI data types and features like garbage collection.
So there are three major scenarios:
- Pure native: Compile existing programs to native binaries just like we've
  all been doing for years. No CLI features, no CLI extensions.
- Normal C++ programs that happen to be compiled to IL instead of to x86: The
  code runs on the VM and is JITted and everything, but the program is still using all
  native data and not using any CLI data types, so no CLI extensions are needed here
  either.
- C++ programs that explicitly start using some CLI data types or features: At
  those points in the code where those data types or features are used, and only at
  those points, the extensions will apply, and most of the time the only new syntax
  will be to write gcnew and ^ (instead of new and *).
Unless you're actually authoring your own new CLI types, you're unlikely to directly
use much more than gcnew and ^, plus maybe
an occasional sprinkling of nullptr or %.
|
-
Welcome! My primary day job these days is that I'm an Architect on the Visual
C++ team at Microsoft, currently responsible for leading the redesign of the C++ Managed
Extensions for .NET (aka "Managed C++"). I also do a fair amount of other C++
writing and speaking (including right now busily writing two
new books due out in the spring), and I chair the ISO C++ standards committee.
You can find out more about me on my website.
At first, I'll mostly use this blog to begin answering frequently asked questions
about the language extensions redesign. The VC++ team has learned a lot about
what worked and what didn't work with the current Managed Extensions for C++ (aka
"Managed C++"). The redesign is an evolution of those extensions but it
isn't being called "Managed C++" any more. The new syntax is about to undergo standardization
in the ECMA and ISO worlds under the name "C++/CLI," a binding from C++ to the CLI, so
I'll often refer to the extensions by that name. I get questions about this every
day or two, and I'll primarily answer them here.
In the meantime, you can find a general overview blurb about this work on my website's Microsoft
page.