I'm a developer on the C++ Shanghai team. I'm interested in everything related to C++.
-
What is inline? The keyword is mainly used to ask the compiler to substitute the function body inline at the point of call. Like 'register', it is only a suggestion to the compiler. Modern compilers use advanced heuristics to decide what to inline and to balance code size against performance without any hint from the developer, so most of the time the keyword is unnecessary for that purpose.
Then what else? Here is the whole story of inline.
Besides being a hint to the compiler, the inline keyword also has a side effect.
Suppose I define a function 'f' in a header file and include that header in two different files, a.cpp and b.cpp:
void f() {}
You'll get the following error when compiling (cl a.cpp b.cpp):
b.obj : error LNK2005: "void __cdecl f(void)" (?f@@YAXXZ) already defined in a.obj
a.exe : fatal error LNK1169: one or more multiply defined symbols found
However, if you change the definition of f to:
inline void f() {}
The compilation will succeed.
Here are the words from the standard:
Every program shall contain exactly one definition of every non-inline function or object that is used in that program. An inline function shall be defined in every translation unit in which it is used.
This is the fundamental difference between using and not using inline. An inline function can be defined in multiple translation units (as long as the definition is exactly the same in each one), but this can also cause the so-called "inline explosion" because the definition has to be duplicated in every translation unit that uses it.
A function defined within a class definition is an inline function. If you don't want to duplicate the definition everywhere, it is better to move the definition into a separate .cpp file. /LTCG (link-time code generation) can still inline it if appropriate.
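As a small illustration of the rules above, here is a sketch of what a shared header might contain. The names (twice, Greeter) are made up for this example and do not come from any real library.

```cpp
#include <string>

// Assumed contents of a header that several .cpp files include.

// 'inline' lets this definition appear in every translation unit that
// includes the header without a "multiply defined symbols" link error.
inline int twice(int x) { return 2 * x; }

struct Greeter
{
    // Defined inside the class definition: implicitly inline.
    std::string hello() const { return "hello"; }

    // Only declared here; the definition follows outside the class.
    std::string bye() const;
};

// Defined outside the class but still in the header, so it needs an
// explicit 'inline' to be safe to include from several .cpp files.
inline std::string Greeter::bye() const { return "bye"; }
```

Without the two explicit inline keywords, including this header from both a.cpp and b.cpp would be expected to fail at link time in the same way as the LNK2005 error shown earlier.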
|
-
I’m glad that our team blog is now online (http://blogs.msdn.com/vcshblog/). It is in Chinese and is targeted to Chinese developers. We will write technical articles related to our work in Shanghai and also translate articles on VCBlog into Chinese. Here are articles posted by our team on VCBlog:
|
-
According to the C standard, text output is only supported in MBCS (multi-byte character string) form. From n1124.pdf, 7.19.3/12:
The wide character output functions convert wide characters to multibyte characters and write them to the stream as if they were written by successive calls to the fputwc function. Each conversion occurs as if by a call to the wcrtomb function, with the conversion state described by the stream’s own mbstate_t object. The byte output functions write characters to the stream as if by successive calls to the fputc function.
However, MBCS doesn’t support mixing characters from different code pages. For example, you can’t use both Chinese and Japanese. VC provides an extension that allows you to output text in Unicode:
#include <cstdio>
#include <cwchar>
#include <locale.h>
#include <io.h>
#include <fcntl.h>

void OutputMBCS(FILE *f)
{
    // ".936" is the code page for Simplified Chinese
    // However, you can't use ".1200" (code page for Unicode) to output text in Unicode
    setlocale(LC_CTYPE, ".936");
    // My name in Chinese: "范翔"
    fwprintf(f, L"%s", L"\x8303\x7FD4");
    // The text in the file is encoded in GBK
}

void OutputUnicode(FILE *f)
{
    _setmode(_fileno(f), _O_U16TEXT);
    fwprintf(f, L"%s", L"\x8303\x7FD4");
    // The text in the file is encoded in Unicode (UTF-16)
}
For more information, please check the following post: http://blogs.msdn.com/michkap/archive/2008/03/18/8306597.aspx
|
-
-
This is an intellectual exercise: when shifting a 32-bit unsigned integer left in C++, how can we efficiently detect whether the calculation overflows?
Here is the function prototype. shl_overflow returns true if v << cl overflows. (cl is between 0 and 31, and we assume that sizeof(unsigned long) == 4 and sizeof(unsigned long long) == 8.)
bool shl_overflow(unsigned long v, int cl)
The most natural way to implement this function is to extend v to 64-bit integer:
bool shl_overflow(unsigned long v, int cl)
{
    unsigned long long vl = v;
    return (vl << cl >> 32) != 0;
}
Now, let’s dig into the assembly world. We’ll limit the discussion to x86.
mov eax, DWORD PTR _v$[esp-4]
mov ecx, DWORD PTR _cl$[esp-4]
xor edx, edx
call __allshl
xor eax, eax
or eax, edx
jne overflow
The implementation has to use three specific registers: eax, edx and ecx. And there is an expensive external function call.
If you step into __allshl in the debugger, you can see that it uses shld to shift the 64-bit integer. VC provides some intrinsics which map to CPU instructions. For example, __ll_lshift maps to shld.
Because the high dword of vl is 0, we can simplify our code:
bool shl_overflow(unsigned long v, int cl)
{
    unsigned long long vl = __ll_lshift(v, cl);
    return (static_cast<unsigned long>(vl >> 32)) != 0;
}
The assembly looks like:
mov eax, DWORD PTR _v$[esp-4]
mov ecx, DWORD PTR _cl$[esp-4]
xor edx, edx
shld edx, eax, cl
test edx, edx
jne overflow
Much better now.
Another approach is based on bit representation.
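bool shl_overflow(unsigned long v, int cl)
{
    // Rotate so that the cl most significant bits become the cl least significant bits
    v = _rotl(v, cl);
    unsigned long index;
    // Overflow if the lowest set bit lies within those cl bits
    return _BitScanForward(&index, v) ? index < static_cast<unsigned long>(cl) : false;
}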
The idea is simple. If v << cl overflows, at least one of the most significant cl bits of v must be 1.
There are two ways to test that.
1. Scan v from the most significant bit down, and test the index of the first set bit against 32 – cl. However, we have to handle the case when cl = 0.
2. Rotate v left by cl bits first, so that the most significant cl bits become the least significant cl bits. Then we can scan from the least significant bit and test the index against cl directly.
Notice that the scan may fail if v is 0. The second way is simpler and more efficient.
The assembly looks like:
mov ecx, DWORD PTR _cl$[esp-4]
mov eax, DWORD PTR _v$[esp-4]
rol eax, cl
bsf eax, eax
je notoverflow
cmp eax, ecx
jl overflow
It only uses two registers. It can also be extended to handle 64-bit shifts. One drawback is an extra conditional jump. (The extra jump can be replaced by "cmovz eax, ecx", but there is no way to ask the compiler to generate that.)
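For readers without the VC intrinsics at hand, the rotate-and-test idea can be sketched in portable C++. This is only an approximation of the approach above: it uses a mask instead of a bit scan, fixed-width types from <cstdint>, and the helper names (shl_overflow_wide, rotl32, shl_overflow_rot) are invented for this example.

```cpp
#include <cstdint>

// Reference version: widen to 64 bits and inspect the high half.
inline bool shl_overflow_wide(std::uint32_t v, int cl)
{
    std::uint64_t wide = static_cast<std::uint64_t>(v) << cl;
    return (wide >> 32) != 0;
}

// Portable rotate-left; the masks keep both shift counts in [0, 31].
inline std::uint32_t rotl32(std::uint32_t v, int cl)
{
    return (v << (cl & 31)) | (v >> ((32 - cl) & 31));
}

// Rotate-and-test: after the rotation, the bits that the shift would
// lose sit in the low cl bits, so overflow means one of them is set.
// Assumes 0 <= cl <= 31, as in the article.
inline bool shl_overflow_rot(std::uint32_t v, int cl)
{
    std::uint32_t low_mask = (static_cast<std::uint32_t>(1) << cl) - 1;
    return (rotl32(v, cl) & low_mask) != 0;
}
```

The mask test answers the same question as comparing the bit-scan index against cl, and it needs no special case for v == 0 or cl == 0. Whether a given compiler turns it into rol plus test is not something this sketch guarantees.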
|
-
Many recursive algorithms have initial parameters. For example, the Fibonacci sequence is defined as F(n) = F(n-1) + F(n-2), with F(1) = F(2) = 1.
By giving different values to F(1) and F(2), we can generate different sequences of numbers.
1. If we implement the algorithm using functions, we have to either define these parameters as global variables or pass them in every recursive call.
int Fib(int n, int f1, int f2)
{
    if (n < 1)
        return 0;

    if (n >= 3)
    {
        return Fib(n - 1, f1, f2) + Fib(n - 2, f1, f2);
    }
    else if (n == 2)
    {
        return f2;
    }
    else
    {
        return f1;
    }
}
2. A recursive functor can store the initial parameters as data members of the class. Here is an example.
struct FibFunctor
{
    FibFunctor(int f1, int f2) : m_f1(f1), m_f2(f2) {}
    int m_f1, m_f2;

    int operator()(int n) const
    {
        if (n < 1)
            return 0;

        if (n >= 3)
        {
            return (*this)(n - 1) + (*this)(n - 2);
        }
        else if (n == 2)
        {
            return m_f2;
        }
        else
        {
            return m_f1;
        }
    }
};

int Fib(int n, int f1, int f2)
{
    FibFunctor f(f1, f2);
    return f(n);
}
3. Lambda is a new core language feature introduced in C++0x for defining an implicit function object. However, "this" cannot be used inside a lambda expression to refer to the lambda itself.
The post "Stupid Lambda Tricks" on the Visual C++ Team Blog shows how to write a recursive lambda. So the above functor can be rewritten as:
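#include <functional>

int Fib(int n, int f1, int f2)
{
    // f is declared as std::function rather than auto: a variable declared
    // with auto cannot be used in its own initializer, and the lambda needs
    // to capture f by reference in order to call itself
    std::function<int (int)> f = [&](int n) -> int
    {
        if (n < 1)
            return 0;

        if (n >= 3)
        {
            return f(n - 1) + f(n - 2);
        }
        else if (n == 2)
        {
            return f2;
        }
        else
        {
            return f1;
        }
    };
    return f(n);
}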
Here, [&] is equivalent to [&f, &f1, &f2].
(According to C++03 3.3.1/1: "The point of declaration for a name is immediately after its complete declarator and before its initializer (if any).", it is legal to capture f in the lambda expression. That means:
struct A
{
    A(A*);
};

int main()
{
    A a = A(&a); // OK
}
)
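To see the effect of different initial parameters, here is a small sketch built on a recursive lambda held in a std::function. The helper name FirstTerms is invented for this example.

```cpp
#include <functional>
#include <vector>

// Hypothetical helper: returns the first 'count' terms of the sequence
// seeded with f1 and f2, using a recursive lambda.
std::vector<int> FirstTerms(int count, int f1, int f2)
{
    // The lambda captures 'term' by reference so that it can call itself.
    std::function<int (int)> term = [&](int n) -> int
    {
        if (n < 1) return 0;
        if (n == 1) return f1;
        if (n == 2) return f2;
        return term(n - 1) + term(n - 2);
    };

    std::vector<int> result;
    for (int n = 1; n <= count; ++n)
        result.push_back(term(n));
    return result;
}
```

With seeds 1 and 1 this yields the Fibonacci numbers 1, 1, 2, 3, 5, 8; with seeds 1 and 3 it yields 1, 3, 4, 7, 11. The naive recursion is exponential, so this is only meant for small n.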
|
-
NOTICE: The technique described in this article may not be supported in future releases of VC. You should not use it in production code.
There are two kinds of initialization in C++: static initialization and dynamic initialization.
According to the standard, static initialization shall always be performed before any dynamic initialization takes place.
In VC, static initialization is done at compile time. The value is stored in the data section of the generated executable. On the other hand, dynamic initialization happens at runtime. Before entering main, CRT will call the dynamic initializers of the global variables.
Sometimes you may need to measure the startup time of your program. However, dynamic initialization happens before main, so your measurement will not include it.
The article "CRT Initialization" describes in detail how dynamic initialization works in VC.
VC provides #pragma init_seg for fine control over the initialization process. It places the dynamic initializers in a specific section. Besides the predefined compiler, lib and user options (the corresponding section names are ".CRT$XCC", ".CRT$XCL" and ".CRT$XCU"), you can also specify the section name explicitly.
With this knowledge, we can use the following code to measure the initialization time of global variables.
Let’s create two files: InitTime_Start.cpp and InitTime_End.cpp (we have to use two files because one file can only have one init_seg).
//InitTime_Start.cpp
#pragma warning(disable : 4075)
#pragma init_seg(".CRT$XCB")
Timer gInitTimer;
//InitTime_End.cpp
#pragma warning(disable : 4075)
#pragma init_seg(".CRT$XCY")
extern Timer gInitTimer; // defined in InitTime_Start.cpp
double gInitTime = gInitTimer.GetTime();
Here, "Timer" is a class that starts timing in its constructor and returns the elapsed time from its member function GetTime. Because ".CRT$XCB" is placed before ".CRT$XCC", the dynamic initializer of the timer is called before the dynamic initializers of any compiler-generated global variables. Similarly, because ".CRT$XCY" is placed after ".CRT$XCU", gInitTimer.GetTime is called after all the dynamic initializers of user-defined global variables. "gInitTime" then contains the initialization time of all global variables, including those generated by the compiler, libraries and the user.
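The article does not show the Timer class itself. A minimal sketch could look like the following, assuming a compiler with std::chrono (which is newer than the VC versions discussed here) and assuming that GetTime returns seconds. Both are assumptions of this sketch.

```cpp
#include <chrono>

// Hypothetical Timer: starts timing on construction; GetTime() returns
// the elapsed time in seconds since construction.
class Timer
{
public:
    Timer() : m_start(std::chrono::steady_clock::now()) {}

    double GetTime() const
    {
        std::chrono::duration<double> elapsed =
            std::chrono::steady_clock::now() - m_start;
        return elapsed.count();
    }

private:
    std::chrono::steady_clock::time_point m_start;
};
```

A steady clock is used so that the measurement is not affected by adjustments to the system time. On the toolchains the article targets, a QueryPerformanceCounter-based class would play the same role.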
|
-
Matrix multiplication is common and the algorithm is easy to implement. Here is one example:
Version 1:
template<typename T>
void SeqMatrixMult1(int size, T** m1, T** m2, T** result)
{
    for (int i = 0; i < size; i++)
    {
        for (int j = 0; j < size; j++)
        {
            result[i][j] = 0;
            for (int k = 0; k < size; k++)
            {
                result[i][j] += m1[i][k] * m2[k][j];
            }
        }
    }
}
This implementation is straightforward; you can find it in textbooks and many online samples.
Version 2:
template<typename T>
void SeqMatrixMult2(int size, T** m1, T** m2, T** result)
{
    for (int i = 0; i < size; i++)
    {
        for (int j = 0; j < size; j++)
        {
            T c = 0;
            for (int k = 0; k < size; k++)
            {
                c += m1[i][k] * m2[k][j];
            }
            result[i][j] = c;
        }
    }
}
This version uses a temporary to store the intermediate result, so we save a lot of unnecessary memory writes. Notice that the optimizer cannot help here because it doesn't know whether "result" is an alias of "m1" or "m2".
Version 3:
#include <algorithm> // std::swap

template<typename T>
void Transpose(int size, T** m)
{
    for (int i = 0; i < size; i++)
    {
        for (int j = i + 1; j < size; j++)
        {
            std::swap(m[i][j], m[j][i]);
        }
    }
}

template<typename T>
void SeqMatrixMult3(int size, T** m1, T** m2, T** result)
{
    Transpose(size, m2);
    for (int i = 0; i < size; i++)
    {
        for (int j = 0; j < size; j++)
        {
            T c = 0;
            for (int k = 0; k < size; k++)
            {
                c += m1[i][k] * m2[j][k];
            }
            result[i][j] = c;
        }
    }
    Transpose(size, m2);
}
This optimization is tricky. If you profile the function, you'll find a lot of data cache misses. We transpose m2 so that both m1[i] and m2[j] are accessed sequentially in the inner loop. This can greatly improve memory read performance.
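A quick way to convince yourself that the transposed loop computes the same product is to compare the two on a small input. The sketch below uses std::vector instead of T** and multiplies against a transposed copy rather than transposing in place; both choices, and the function names, are assumptions made for this example.

```cpp
#include <cstddef>
#include <vector>

typedef std::vector<std::vector<int> > Matrix;

// Version 2 style: accumulate in a local and read b column-wise.
Matrix MultiplyNaive(const Matrix& a, const Matrix& b)
{
    const std::size_t n = a.size();
    Matrix result(n, std::vector<int>(n, 0));
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
        {
            int c = 0;
            for (std::size_t k = 0; k < n; ++k)
                c += a[i][k] * b[k][j];
            result[i][j] = c;
        }
    return result;
}

// Version 3 style: build the transpose once, then read both operands
// row-wise in the inner loop.
Matrix MultiplyTransposed(const Matrix& a, const Matrix& b)
{
    const std::size_t n = a.size();
    Matrix bt(n, std::vector<int>(n, 0));
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            bt[j][i] = b[i][j];

    Matrix result(n, std::vector<int>(n, 0));
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
        {
            int c = 0;
            for (std::size_t k = 0; k < n; ++k)
                c += a[i][k] * bt[j][k];
            result[i][j] = c;
        }
    return result;
}
```

Both functions assume square matrices of the same size. The copy costs extra memory compared with the in-place transpose in Version 3, but it leaves the input untouched.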
Version 4:
#include <pmmintrin.h> // SSE3 intrinsics

template<typename T>
void SeqMatrixMult4(int size, T** m1, T** m2, T** result);

// assume size % 2 == 0
// assume m1[i] and m2[i] are 16-byte aligned
// require SSE3 (haddpd)
template<>
void SeqMatrixMult4(int size, double** m1, double** m2, double** result)
{
    Transpose(size, m2);
    for (int i = 0; i < size; i++)
    {
        for (int j = 0; j < size; j++)
        {
            __m128d c = _mm_setzero_pd();

            for (int k = 0; k < size; k += 2)
            {
                c = _mm_add_pd(c, _mm_mul_pd(_mm_load_pd(&m1[i][k]), _mm_load_pd(&m2[j][k])));
            }
            c = _mm_hadd_pd(c, c);
            _mm_store_sd(&result[i][j], c);
        }
    }
    Transpose(size, m2);
}

// assume size % 4 == 0
// assume m1[i] and m2[i] are 16-byte aligned
// require SSE3 (haddps)
template<>
void SeqMatrixMult4(int size, float** m1, float** m2, float** result)
{
    Transpose(size, m2);
    for (int i = 0; i < size; i++)
    {
        for (int j = 0; j < size; j++)
        {
            __m128 c = _mm_setzero_ps();

            for (int k = 0; k < size; k += 4)
            {
                c = _mm_add_ps(c, _mm_mul_ps(_mm_load_ps(&m1[i][k]), _mm_load_ps(&m2[j][k])));
            }
            c = _mm_hadd_ps(c, c);
            c = _mm_hadd_ps(c, c);
            _mm_store_ss(&result[i][j], c);
        }
    }
    Transpose(size, m2);
}
For floating-point types, we can use the SIMD instruction set to process the data in parallel.
Parallel version using PPL (Parallel Patterns Library) and lambda in VC2010 CTP:
#include <ppl.h>

template<typename T>
void ParMatrixMult1(int size, T** m1, T** m2, T** result)
{
    using namespace Concurrency;
    for (int i = 0; i < size; i++)
    {
        parallel_for(0, size, 1, [&](int j)
        {
            result[i][j] = 0;
            for (int k = 0; k < size; k++)
            {
                result[i][j] += m1[i][k] * m2[k][j];
            }
        });
    }
}
Result
Here are the test results (what really matters is the relative time between the different versions):
Matrix size = 500 (Intel Core 2 Duo T7250, 2 cores, L2 cache 2MB)
| | int | long long | float | double |
| --- | --- | --- | --- | --- |
| Version 1 | 0.931119s | 2.945134s | 0.774894s | 0.984585s |
| Version 2 | 0.571003s | 2.310568s | 0.724161s | 0.929064s |
| Version 3 | 0.239538s | 0.823095s | 0.570772s | 0.241691s |
| Version 4 | N/A | N/A | 0.063196s | 0.187614s |
| Version 1 + PPL | 0.847534s | 1.683765s | 0.589513s | 0.994161s |
| Version 2 + PPL | 0.380174s | 1.190713s | 0.409321s | 0.594859s |
| Version 3 + PPL | 0.135760s | 0.495152s | 0.370499s | 0.185800s |
| Version 4 + PPL | N/A | N/A | 0.041959s | 0.157932s |
Matrix size = 500 (Intel Xeon E5430, 4 cores, L2 cache 12MB)
| | int | long long | float | double |
| --- | --- | --- | --- | --- |
| Version 1 | 0.514330s | 1.434509s | 0.455168s | 0.608127s |
| Version 2 | 0.314554s | 1.231696s | 0.447607s | 0.593517s |
| Version 3 | 0.180176s | 0.591002s | 0.432129s | 0.149511s |
| Version 4 | N/A | N/A | 0.042900s | 0.083286s |
| Version 1 + PPL | 0.308766s | 0.482934s | 0.175585s | 0.309159s |
| Version 2 + PPL | 0.105717s | 0.325413s | 0.124862s | 0.164156s |
| Version 3 + PPL | 0.073418s | 0.193824s | 0.116971s | 0.061268s |
| Version 4 + PPL | N/A | N/A | 0.017891s | 0.031734s |
From the results, you can find that:
- Parallelism only helps if you carefully tune your code to maximize its effect (Version 1)
- Eliminating unnecessary memory writes (Version 2) helps parallelism
- Data cache misses can be a big issue when there are lots of memory accesses (Version 3)
- Using SIMD instead of FPU on aligned data is beneficial (Version 4)
- Different data types, data sizes and host architectures may have different kinds of bottlenecks
|
-
Object slicing often happens when you pass an object by value. The compiler implicitly converts from derived to base for you without any warning message.
If you want to detect object slicing, you're on your own. However, templates can help.
Because object slicing calls the copy constructor of the base class, what you can do is "hijack" it. The magic looks like:
#include <type_traits>
#include "boost/static_assert.hpp"

template<bool>
struct SliceHelper
{
};

template<>
struct SliceHelper<false>
{
    typedef void type;
};

#define DETECTSLICE(NAME,SIZE)\
    enum {_SizeOfClass=SIZE};\
    void _SizeValidation() {BOOST_STATIC_ASSERT(sizeof(NAME)==_SizeOfClass);}\
    template <typename T> NAME(const T &,typename SliceHelper<sizeof(T)==_SizeOfClass || !std::tr1::is_base_of<NAME,T>::value>::type * = 0)\
    {\
        typedef typename T::sliced type;\
    }

struct A
{
    int a;
    A() {}
    DETECTSLICE(A,4)
};

struct B:public A
{
};

struct C:public A
{
    int b;
};

struct D
{
    int a;
};

struct E
{
    int a,b;
};

void f(A) {}

int main()
{
    B b;
    C c;
    D d;
    E e;
    A a;
    A a0(a);
    A a1(b);
    //A a2(c); //error
    //A a3(d); //error
    //A a4(e); //error
    f(a);
    f(b);
    //f(c); //error
    //f(d); //error
    //f(e); //error
}
Notice:
1. The template constructor is not a copy constructor. According to the standard, copy constructor should be non-template. But the template constructor can still be chosen to copy construct the object :-)
2. You cannot use sizeof in the member function declaration because the class is incomplete at that point. You can only use sizeof inside the member function definition. That is why we have to specify the size of the class explicitly.
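On a compiler with C++11 support, a related effect can be sketched with a deleted constructor template. Note that this is a different technique from the macro above: it does not look at sizes at all and rejects every derived type, including ones that add no data. The class names are invented for this example.

```cpp
#include <type_traits>

struct Base
{
    int a;
    Base() : a(0) {}
    Base(const Base& other) : a(other.a) {}

    // For an argument of a type derived from Base, binding const T& is an
    // exact match and beats the derived-to-base conversion required by the
    // copy constructor, so this deleted overload is selected and the
    // slicing copy fails to compile.
    template <typename T,
              typename = typename std::enable_if<
                  std::is_base_of<Base, T>::value &&
                  !std::is_same<Base, T>::value>::type>
    Base(const T&) = delete;
};

struct Derived : Base
{
    int b;
    Derived() : b(0) {}
};
```

With these definitions, copying a Base from a Base still works and Derived itself stays copyable, while Base b(derived) or passing a Derived to a function taking Base by value is rejected at compile time.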
|
-
C++0x will provide a full set of type traits helpers to ease generic programming. However, there is no support for detecting class members. The general problem is hard. Here we will tackle a more specific version: detecting a class member with a given name and type.
In C++, function overloading is one of the most widely used techniques for implementing type traits. However, overload resolution only cares about types. Default arguments and access specifiers are only considered after overload resolution. What we want here is to find out whether a specific member exists, so we have to turn the member into a type. Fortunately, templates support non-type arguments. And here is the magic:
namespace van {
namespace type_traits {
namespace detail {
    typedef char Small;
    struct Big {char dummy[2];};

    template<typename Type,Type Ptr>
    struct MemberHelperClass;

    template<typename T,typename Type>
    Small MemberHelper_f(MemberHelperClass<Type,&T::f> *);
    template<typename T,typename Type>
    Big MemberHelper_f(...);
}

template<typename T,typename Type>
struct has_member_f
{
    enum {value=sizeof(detail::MemberHelper_f<T,Type>(0))==sizeof(detail::Small)};
};
}
}

struct A
{
    static void f();
};
struct B
{
};

#include <iostream>
using namespace std;

int main()
{
    cout<<boolalpha;
    cout<<van::type_traits::has_member_f<A,void (*)()>::value<<endl;
    cout<<van::type_traits::has_member_f<B,void (*)()>::value<<endl;
}
If the member "f" is missing, the conversion from the non-type argument (&T::f) to the type (MemberHelperClass) is invalid, so the va-arg version is chosen. Otherwise, the former is chosen because the va-arg version is always the least preferred. has_member_f then distinguishes these two cases by checking the size of the return type of the chosen MemberHelper_f function. The above code supports both static and non-static members, and both data members and member functions. However, it has one drawback: if the detected member is non-public, there will be a compiler error, because access control is applied after overload resolution.
Because the member name itself cannot be used as a template argument, we have to use it explicitly in our helper. To avoid the redundant work, we can use a macro to get a more general version:
#define DEFINEHASMEMBER(Name)\
namespace van {\
namespace type_traits {\
namespace detail {\
    template<typename T,typename Type>\
    Small MemberHelper_##Name(MemberHelperClass<Type,&T::Name> *);\
    template<typename T,typename Type>\
    Big MemberHelper_##Name(...);\
}\
\
template<typename T,typename Type>\
struct has_member_##Name\
{\
    enum {value=sizeof(detail::MemberHelper_##Name<T,Type>(0))==sizeof(detail::Small)};\
};\
}\
}
One use of this type trait is to simplify a dispatcher. For example, we may want to provide different implementations for different architectures to get better performance. Instead of dispatching the code manually, we can automate the work using the member detection trait.
First, we group the different implementations into one helper class.
#include <cstddef>

struct MemoryCopyHelper
{
    // The destination is writable and the source is read-only
    typedef void (*FunctionType)(void *lpDest, const void *lpSrc, size_t n);
    static void Default(void *lpDest, const void *lpSrc, size_t n){}
    static void MMX(void *lpDest, const void *lpSrc, size_t n){}
};
Second, we create an array to store the address of each implementation. If the implementation for some architecture is missing, we use the default one instead (assuming the default one is always present).
DEFINEHASMEMBER(Default)
DEFINEHASMEMBER(MMX)
DEFINEHASMEMBER(SSE2)

#define DEFINESELECTSTATICMEMBER(MemberName)\
template<typename T,typename FunType,bool = van::type_traits::has_member_##MemberName<T,FunType>::value>\
struct select_member_##MemberName;\
template<typename T,typename FunType>\
struct select_member_##MemberName<T,FunType,true> {static const FunType value;};\
template<typename T,typename FunType>\
struct select_member_##MemberName<T,FunType,false> {static const FunType value;};\
template<typename T,typename FunType>\
const FunType select_member_##MemberName<T,FunType,true>::value=&T::MemberName;\
template<typename T,typename FunType>\
const FunType select_member_##MemberName<T,FunType,false>::value=&T::Default;

DEFINESELECTSTATICMEMBER(Default)
DEFINESELECTSTATICMEMBER(MMX)
DEFINESELECTSTATICMEMBER(SSE2)

MemoryCopyHelper::FunctionType gDispatchArray_MemoryCopy[]={
    select_member_Default<MemoryCopyHelper, MemoryCopyHelper::FunctionType>::value,
    select_member_MMX<MemoryCopyHelper, MemoryCopyHelper::FunctionType>::value,
    select_member_SSE2<MemoryCopyHelper, MemoryCopyHelper::FunctionType>::value,
};
Then you can focus on the implementation. You can update the helper class to add the optimized version for the missing architecture in the future, and the array will be updated automatically. (Notice: the above array will be initialized dynamically before entering main.)
BTW: The above code may fail on some old compilers. It is OK with VC8, VC9, gcc 3.4.5.
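For comparison, on a compiler with C++11 support the name-only part of the detection can be sketched with decltype and partial specialization. This is not the technique from this post: it checks only that a member named f exists, not its type, and the struct names are invented for the example.

```cpp
#include <type_traits>

// Primary template: used when &T::f is not a valid expression.
template <typename T, typename = void>
struct has_member_f : std::false_type {};

// Partial specialization: only viable when &T::f names a member
// (static or non-static, data or function, as long as it is not overloaded).
template <typename T>
struct has_member_f<T, decltype(void(&T::f))> : std::true_type {};

struct WithStaticF { static void f(); };
struct WithDataF { int f; };
struct WithoutF {};
```

In C++11, access checking is expected to take part in substitution, so a private f should simply yield false rather than a hard error; treat that as an assumption to verify on your compiler.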
|
-
C++0x will be released in the near future. Do you know what changes in the standard library? Here is a list of changes that I've collected (minor behavior changes and changes related to concepts are not included).
1. New stuff
- system_error: new header
- array, vector, deque, list, string, map, set, unordered_map, unordered_set: new member functions: cbegin, cend, crbegin, crend
- vector, deque, string: new member function: shrink_to_fit
- map, unordered_map: new member function: at
- type_traits: new traits: is_lvalue_reference, is_rvalue_reference, has_trivial_default_constructor, has_trivial_copy_constructor, has_nothrow_default_constructor, has_nothrow_copy_constructor
- string: new types: u16string, u32string; new functions: stoi, stol, stoul, stoll, stoull, stof, stod, stold, to_string, to_wstring
- algorithm: new functions: all_of, any_of, none_of, find_if_not, copy_if, partition_copy, is_partitioned, partition_point, minmax_element, is_heap_until, is_heap, is_sorted_until, is_sorted, next, prev, min, max, minmax, copy_n
- random: new class: seed_seq and many new distributions; new function: generate_canonical; new types: ranlux24_base, ranlux48_base, ranlux24, ranlux48, knuth_b, default_random_engine; for classes linear_congruential, subtract_with_carry, mersenne_twister, discard_block, xor_combine: add ctor which accepts seed_seq, member function "seed" and the corresponding _engine class
- memory: new classes: default_delete, unique_ptr; new functions: allocate_shared, make_shared
- functional: new functors: bit_and, bit_or, bit_xor
- numeric: new function: iota
- iomanip: new functions: get_money, put_money, get_time, put_time
- ios: new function: defaultfloat
- iosfwd: new types: char_traits<char16_t>, char_traits<char32_t>
- limits: new members: max_digits10, lowest
2. From tr1
- array, unordered_map, unordered_set, regex, random, type_traits, tuple, functional
- utility: get
- ios: hexfloat
3. Other
- set, map, unordered_map, unordered_set: member function "erase" will have a return value (compatible with sequential containers)
- string: pop_back, front, back (compatible with vector)
- bitset: ctor: unsigned long -> unsigned long long
- fstream: ctor/open accept string & wstring as the type of filename (C++03 only supports raw string pointers)
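A few of the additions listed above can be tried together in a short sketch, assuming a compiler and library that already ship them. The two helper functions are invented for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical helper: fills a vector with 1, 2, ..., n using std::iota
// and checks with std::all_of (over cbegin/cend) that all are positive.
bool AllPositiveUpTo(std::size_t n)
{
    std::vector<int> v(n);
    std::iota(v.begin(), v.end(), 1);
    return std::all_of(v.cbegin(), v.cend(), [](int x) { return x > 0; });
}

// Hypothetical helper: converts an int to a string and back again
// with std::to_string and std::stoi.
int RoundTrip(int value)
{
    return std::stoi(std::to_string(value));
}
```

Both helpers rely only on names from the list: iota, all_of, cbegin/cend, to_string and stoi.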
|
-
Unfortunately, VS2008 SP1 doesn't recognize the C++ tr1 headers. That means there is no syntax highlighting and no IntelliSense for these files.
This is a bug, but you can fix it yourself. The trick is in the registry. VS maintains a list of extensionless files and treats them as cpp files.
It is under: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\VisualStudio\9.0\Languages\Extensionless Files\{B2F072B0-ABC1-11D0-9D62-00C04FD9DFD9}
Just add the tr1 headers to it and you'll get full support for the new tr1 features. Here is the reg file:
Windows Registry Editor Version 5.00
[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\VisualStudio\9.0\Languages\Extensionless Files\{B2F072B0-ABC1-11D0-9D62-00C04FD9DFD9}]
"array"=""
"random"=""
"regex"=""
"tuple"=""
"type_traits"=""
"unordered_map"=""
"unordered_set"=""
"xawrap"=""
"xawrap0"=""
"xawrap1"=""
"xawrap2"=""
"xfwrap"=""
"xfwrap1"=""
"xrefwrap"=""
"xtr1common"=""
"xxbind0"=""
"xxbind1"=""
"xxcallfun"=""
"xxcallobj"=""
"xxcallpmf"=""
"xxcallwrap"=""
"xxfunction"=""
"xxmem_fn"=""
"xxpmfcaller"=""
"xxrefwrap"=""
"xxresult"=""
"xxtuple0"=""
"xxtuple1"=""
"xxtype_traits"=""
|
-
MSDN has a page describing various VC extensions. But it is far from complete.
I've collected a list of nonstandard extensions provided by VC; some of them are evil. If you want to write standard-conformant C++ code, you'd better be aware of these extensions, which are on by default. Some commonly (mis-)used extensions are in bold.
W4001: nonstandard extension 'single line comment' was used
W4152: nonstandard extension, function/data pointer conversion in expression
W4200: nonstandard extension used : zero-sized array in struct/union
W4201: nonstandard extension used : nameless struct/union
W4202: nonstandard extension used : '...': prototype parameter in name list illegal
W4203: nonstandard extension used : union with static member variable
W4204: nonstandard extension used : non-constant aggregate initializer
W4205: nonstandard extension used : static function declaration in function scope
W4206: nonstandard extension used : translation unit is empty
W4207: nonstandard extension used : extended initializer form
W4208: nonstandard extension used : delete [exp] - exp evaluated but ignored
W4210: nonstandard extension used : function given file scope
W4211: nonstandard extension used : redefined extern to static
W4212: nonstandard extension used : function declaration used ellipsis
W4213: nonstandard extension used : cast on l-value
W4214: nonstandard extension used : bit field types other than int
W4215: nonstandard extension used : long float
W4216: nonstandard extension used : float long
W4218: nonstandard extension used : must specify at least a storage class or a type
W4221: nonstandard extension used : 'identifier' : cannot be initialized using address of automatic variable
W4223: nonstandard extension used : non-lvalue array converted to pointer
W4224: nonstandard extension used : formal parameter 'identifier' was previously defined as a type
W4226: nonstandard extension used : 'keyword' is an obsolete keyword
W4228: nonstandard extension used : qualifiers after comma in declarator list are ignored
W4231: nonstandard extension used : 'identifier' before template explicit instantiation
W4232: nonstandard extension used : 'identifier' : address of dllimport 'dllimport' is not static, identity not guaranteed
W4233: nonstandard extension used : 'keyword' keyword only supported in C++, not C
W4234: nonstandard extension used : 'keyword' keyword reserved for future use
W4235: nonstandard extension used : 'keyword' keyword not supported on this architecture
W4238: nonstandard extension used : class rvalue used as lvalue
W4239: nonstandard extension used : 'token' : conversion from 'type' to 'type'
W4240: nonstandard extension used : access to 'classname' now defined to be 'access specifier', previously it was defined to be 'access specifier'
W4288: nonstandard extension used : 'var' : loop control variable declared in the for-loop is used outside the for-loop scope; it conflicts with the declaration in the outer scope
W4289: nonstandard extension used : 'var' : loop control variable declared in the for-loop is used outside the for-loop scope
W4353: nonstandard extension used: constant 0 as function expression. Use '__noop' function intrinsic instead
W4480: nonstandard extension used: specifying underlying type for enum 'enum'
W4481: nonstandard extension used: override specifier 'keyword'
W4482: nonstandard extension used: enum 'enum' used in qualified name
W4509: nonstandard extension used: 'function' uses SEH and 'object' has destructor
W4836: nonstandard extension used : 'type' : local types or unnamed types cannot be used as template arguments
C2599: 'enum' : forward declaration of enum type is not allowed
|
-
In C++, it is well known that the data in a vector is contiguous. To be more specific, here is the quotation from the standard (C++03, 23.2.4/1):
The elements of a vector are stored contiguously, meaning that if v is a vector<T, Allocator> where T is some type other than bool, then it obeys the identity &v[n] == &v[0] + n for all 0 <= n < v.size().
There are two points to note. First, vector<bool> is special. It is optimized for size, and the bools are packed. Second, &v[0] is only valid if v.size() > 0.
(BTW: The above statement doesn't exist in C++98. Here is the link to the history: http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-defects.html#69)
vector is designed to be an advanced version of the raw array, so the contiguity guarantee is convenient when a raw pointer has to be passed to a low-level API. We can simply use v.empty()?NULL:&v[0] (if you use the default allocator, which implies that the return value of "operator []" is a real reference, not a proxy).
C++0x adds a "data" member function for a similar purpose.
In contrast, the other STL containers don't store their data contiguously.
string is a little different. It is not an STL container and the standard doesn't explicitly say whether its data is contiguous. But it also provides a "data" member function which returns "const charT *". Here is the definition in C++03:
If size() is nonzero, the member returns a pointer to the initial element of an array whose first size() elements equal the corresponding elements of the string controlled by *this. The program shall not alter any of the values stored in the array.
On the other hand, it is also said that for "operator []"
If pos < size(), returns data()[pos].
Because "operator []" returns a reference, for a string s we have &s[0] == data(). However, according to the last sentence of the definition above, the array returned by data() must not be modified, so it is confusing that we can obtain a non-const reference into it.
There is an issue about whether string data is contiguous: http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-defects.html#530
Fortunately, it will be explicitly stated in C++0x, see n2798.pdf 21.3.1/3:
The char-like objects in a basic_string object shall be stored contiguously. That is, for any basic_string object s, the identity &*(s.begin() + n) == &*s.begin() + n shall hold for all values of n such that 0 <= n < s.size().
The C++0x draft also changes "operator []", so it no longer relies on "data" (in fact, "data" will have the same behavior as c_str in C++0x):
If pos < size(), returns *(begin() + pos).
So, in C++0x, the data of a string is guaranteed to be contiguous, and you can pass &s[0] (if s.size() > 0 and you use the default allocator) to a low-level API, just as with vector. Cheers!
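As a sketch of what the guarantee buys you, the code below hands the storage of a vector and of a string to a C-style function that fills a raw buffer. FillWithLetter is a made-up stand-in for a real low-level API, and the string case relies on the C++0x contiguity guarantee quoted above.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Stand-in for a low-level C API that writes into a caller-supplied buffer
// (hypothetical; any function taking char* and a length would do).
void FillWithLetter(char* buffer, std::size_t length, char letter)
{
    if (length != 0)
        std::memset(buffer, letter, length);
}

std::vector<char> MakeVector(std::size_t n, char letter)
{
    std::vector<char> v(n);
    // &v[0] is only valid for a non-empty vector.
    FillWithLetter(v.empty() ? NULL : &v[0], v.size(), letter);
    return v;
}

std::string MakeString(std::size_t n, char letter)
{
    std::string s(n, '\0');
    // Writing through &s[0] assumes contiguous storage (C++0x and later).
    FillWithLetter(s.empty() ? NULL : &s[0], s.size(), letter);
    return s;
}
```

The size is fixed before the call, and the low-level function only writes within [0, size()), so neither container is resized behind its back.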
|
-
As the designer of a base class, you may hesitate over whether to use private or protected access control. Let's try the following examples:
1. Call protected member function
#include <cstdio>
class A
{
protected:
    void b() {printf("Oops!\n");}
};

void f(A* a)
{
    class A_hack:public A
    {
        friend void f(A*);
    };
    static_cast<A_hack *>(a)->b();
}

class B
{
public:
    void f(A* a)
    {
        class A_hack:public A
        {
            friend B;
        };
        static_cast<A_hack *>(a)->b();
    }
};

int main()
{
    f(NULL);
    B().f(NULL);
}
Although the result of the cast is undefined according to the standard, the code will break access control as long as no this-pointer adjustment happens and the layout of A is the same as that of A_hack, which is normally the case.
For an evil user, the purpose is achieved.
2. Call pure virtual function
class A
{
protected:
    virtual void Fun() =0;
};

class B:public A
{
public:
    B() {Dummy();}

private:
    void Dummy() {Fun();}
};

class C:public B
{
public:
    virtual void Fun() {}
};

int main()
{
    C c;
}
Have you ever wondered how to call a pure virtual function? Would you like to see what _purecall does in VC? Just try the above code.
The problem here is caused by the protected access of "Fun" in the base class. Of course, you should not call virtual functions in a constructor (they will not behave polymorphically), but when you call Dummy in the constructor, you may not realize that Dummy calls a virtual function internally. Then disaster strikes.
If what you want from virtual is to allow the derived class to provide some specific behavior, you'd better declare the function as private.
Although both of the above examples are nonconformant, they at least show that your users can do more than you expect with protected.
If you don't even have confidence in private (think of the evil #define private public, which definitely violates the One Definition Rule), you'd better use the PImpl idiom to implement your interface.
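For reference, a minimal sketch of the PImpl idiom is shown below, assuming a C++11 compiler for std::unique_ptr. The class name Counter is invented, and the two parts would normally live in a header and a .cpp file respectively; they are shown together here only to keep the example in one piece.

```cpp
#include <memory>

// --- what would go in the public header ---
class Counter
{
public:
    Counter();
    ~Counter();
    void Increment();
    int Value() const;

private:
    struct Impl;                  // only declared here
    std::unique_ptr<Impl> m_impl; // the header reveals no data members
};

// --- what would go in the implementation file ---
struct Counter::Impl
{
    int count;
    Impl() : count(0) {}
};

Counter::Counter() : m_impl(new Impl) {}
Counter::~Counter() {} // defined where Impl is a complete type
void Counter::Increment() { ++m_impl->count; }
int Counter::Value() const { return m_impl->count; }
```

Because the data members live in Impl, which users of the header never see, tricks like redefining private or deriving a "hack" class have nothing to reach into. The cost is an extra allocation and an indirection on every call.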
|