Holy cow, I wrote a book!
The Win32 COM calling convention specifies the layout of the virtual method table (vtable) of an object. If a language/compiler wants to support COM, it must lay out its object in the specified manner so other components can use it.
It is no coincidence that the Win32 COM object layout matches closely the C++ object layout. Even though COM was originally developed when C was the predominant programming language, the designers saw fit to "play friendly" with the up-and-coming new language C++.
The layout of a COM object is made explicit in the header files for the various interfaces. For example, here's IPersist from objidl.h, after cleaning up some macros.
typedef struct IPersistVtbl { HRESULT ( STDMETHODCALLTYPE *QueryInterface )( IPersist * This, /* [in] */ REFIID riid, /* [iid_is][out] */ void **ppvObject); ULONG ( STDMETHODCALLTYPE *AddRef )( IPersist * This); ULONG ( STDMETHODCALLTYPE *Release )( IPersist * This); HRESULT ( STDMETHODCALLTYPE *GetClassID )( IPersist * This, /* [out] */ CLSID *pClassID); } IPersistVtbl; struct IPersist { const struct IPersistVtbl *lpVtbl; };
This corresponds to the following memory layout:
What does this mean?
A COM interface pointer is a pointer to a structure that consists of just a vtable. The vtable is a structure that contains a bunch of function pointers. Each function in the list takes that interface pointer (p) as its first parameter ("this").
The magic to all this is that since your function gets p as its first parameter, you can "hang" additional stuff onto that vtable:
The functions in the vtable can use offsets relative to the interface pointer to access its other stuff.
If an object implements multiple interfaces but they are all descendants of each other, then a single vtable can be used for all of them. For example, the object above is already set to be used either as an IUnknown or as an IPersist, since IUnknown is a subset of IPersist.
On the other hand, if an object implements multiple interfaces that are not descendants of each other, then you get multiple inheritance, in which case the object is typically laid out in memory like this:
If you are using an interface that comes from the first vtable, then the interface pointer is p. But if you're using an interface that comes from the second vtable, then the interface pointer is q.
Hang onto that diagram, because tomorrow we will learn about those mysterious "adjustor thunks".
The scalar "new" will allocate a single object with no hidden counter. The vector "delete[]" will look for the hidden counter, which isn't there, so it will either crash (accessing nonexistent memory) or grab a random number and attempt to destruct that many items. If the random number is greater than one, you will start corrupting memory after the object. If the random number is zero, you fail to destruct anything. If the random number is exactly one, then the one object is destructed.
Next, the vector "delete[]" will attempt to free the memory block starting one size_t in front of the actual memory block. Depending on how the heap feels today, this may be detected as an invalid parameter and ignored, or this can result in heap corruption.
Final result: not good.
The vector "new[]" allocates several objects and stores the "howmany" in the hidden counter. The scalar "delete" destructs the first object in the vector. If it was a vector of zero objects, you corrupted memory. If it was a vector of two or more objects, then objects 2 an onward will not be destructed. (Result: Memory or other leak.)
Next, the scalar "delete" will free the memory block directly, which will fail because the memory block actually starts at the hidden size_t in front of the vector. This again corrupts the heap since you are freeing memory that is not a valid heap pointer.
Final result: also not good.
If the class does not have a destructor, then no special work needs to be done when the vector is freed aside from freeing the memory. In this case, no hidden counter is necessary; the block can be allocated directly with no overhead and freed with no overhead.
More specifically, if the class has a trivial destructor (none of its base classes or sub-objects - if any - have a destructor), then the scalar and vector new/delete allocate and free the memory the same way, and mixing them does not generate a runtime error. You got lucky.
Of course, somebody might add a destructor to your class tomorrow, and then you won't be so lucky any more.
Note of course that all of this discussion assumes compiler behavior as described yesterday. That behavior is implementation-dependent so you should not rely on it. You may be lucky today, but the next version of the compiler may change the way it manages vectors and your luck will have run out.
Bizarre yet oddly compelling.
(Perhaps if we ask nicely, we can get Marc Miller to tell the story of the time he actually met Philip Glass...)
In a similar vein, a friend of mine directed me to Ishkur's Guide to Electronic Music, an attitude-filled tour of the world of of electronic music.
There is a nice description of C++ memory management in C++ Gotchas: Avoiding Common Problems in Coding and Design on www.informit.com, and I encourage you to read at least the section titled Failure to Distinguish Scalar and Array Allocation before continuing with this entry, because I'm going to use that information as a starting point.
Here's how the Microsoft C++ compiler manages vector allocation. Note that this is internal implementation detail, so it's subject to change at any time, but knowing this may give a better insight into why mixing scalar and vector new/delete is a bad thing.
The important thing to note is that when you do a scalar "delete p", you are telling the compiler, "p points to a single object." The compiler will call the destructor once, namely for the object you are destructing.
When you do "delete[] p", you are saying, "p points to a bunch of objects, but I'm not telling you how many." In this case, the compiler needs to generate extra code to keep track of how many it needs to destruct. This extra information is kept in a "secret place" when the vector is allocated with "new[]".
Let's watch this in action:
class MyClass { public: MyClass(); // constructor ~MyClass(); // destructor int i; }; MyClass *allocate_stuff(int howmany) { return new MyClass[howmany]; }
The generated code for the "allocate_stuff" function looks like this:
push esi mov esi, [esp+8] ; howmany ; eax = howmany * sizeof(MyClass) + sizeof(size_t) lea eax, [esi*4+4] push eax call operator new test eax, eax pop ecx je fail push edi push OFFSET MyClass::MyClass push esi lea edi, [eax+4] ; edi = eax + sizeof(size_t) push 4 ; sizeof(MyClass) push edi mov [eax], esi ; howmany call `vector constructor iterator' mov eax, edi pop edi jmp done fail: xor eax, eax done: pop esi retd 4
Translated back into pseudo-C++, the code looks like this:
MyClass* allocate_stuff(int howmany) { void *p = operator new( howmany * sizeof(MyClass) + sizeof(size_t)); if (p) { size_t* a = reinterpret_cast<size_t*>(p); *a++ = howmany; vector constructor iterator(a, sizeof(MyClass), &MyClass::MyClass); return reinterpret_cast<MyClass*>(a); } return NULL; }
In other words, the memory layout of the vector of MyClass objects looks like this:
The pointer returned by the new[] operator is not the start of the allocated memory but rather points to MyClass[0]. The count of elements is hidden in front of the vector.
The deletion of a vector performs this operation in reverse:
void free_stuff(MyClass* p) { delete[] p; }
mov ecx, [esp+4] ; p test ecx, ecx je skip push 3 call MyClass::`vector deleting destructor` skip ret 4
void free_stuff(MyClass* p) { if (p) p->vector deleting destructor(3); }
void MyClass::vector deleting destructor(int flags) { if (flags & 2) { // if vector destruct size_t* a = reinterpret_cast<size_t*>(this) - 1; size_t howmany = *a; vector destructor iterator(p, sizeof(MyClass), howmany, MyClass::~MyClass); if (flags & 1) { // if delete too operator delete(a); } } else { // else scalar destruct this->~MyClass(); // destruct one if (flags & 1) { // if delete too operator delete(this); } } }
The vector deleting destructor takes some flags. If 2 is set, then a vector is being destructed; otherwise a single object is being destructed. If 1 is set, then the memory is also freed.
In our case, the flags parameter is 3, so we will perform a vector destruct followed by a delete. Observe that this code sucks the original "howmany" value out of its secret hiding place and asks the vector destructor iterator to run the destructor that many times before freeing the memory.
So now, armed with this information, you should be able to describe what happens if you allocate memory with scalar "new" and free it with vector "delete[]" or vice versa.
Bonus exercise: What optimizations can be performed if the destructor MyClass::~MyClass() is removed from the class definition?
Answers to come tomorrow.
The "Copy to Folder" and "Move to Folder" options weren't designed to be on the context menu. They were only meant to be placed in Explorer's toolbar. (Right-click a blank space on your toolbar, select Customize, and pick "Move To" or "Copy To" from the list of available buttons.) If you add them to the context menu, you may notice that the "Copy To" and "Move To" dialogs start showing up when you really aren't expecting them, for example, whenever you double-click an attachment in Outlook.
The reason is that these two items are a bit too eager. When you ask them, "So do you handle the <X> command?" they say, "Oh yes! That's me!" This is fine in a toolbar, where the only time they're asked "Do you handle the <X> command?" is when the user clicks on the button. But in a context menu, you are asked this much more frequently, and with varying values of X.
So when Outlook launches an attachment, the shell loads up the context menu handlers and asks each one, "So do you handle the Open command?" The "Delete" option says, "Nope, sorry." So does "Cut" and "Send to" and "Sharing and Security". But "Copy To" happily says, "Oh yes! That's me!"
And then the Copy To dialog shows up when you don't want it.
Dutch police have arrested 52 people suspected of defrauding gullible Internet users in one of the largest busts of the infamous "Nigerian e-mail" scam.
(I must sheepishly admit that I too mistakenly identified the home of Ikea as Denmark rather than the Netherlands.)
The strings listed in the *.rc file are grouped together in bundles of sixteen. So the first bundle contains strings 0 through 15, the second bundle contains strings 16 through 31, and so on. In general, bundle N contains strings (N-1)*16 through (N-1)*16+15.
The strings in each bundle are stored as counted UNICODE strings, not null-terminated strings. If there are gaps in the numbering, null strings are used. So for example if your string table had only strings 16 and 31, there would be one bundle (number 2), which consists of string 16, fourteen null strings, then string 31.
(Note that this means there is no way to tell the difference between "string 20 is a string that has length zero" and "string 20 doesn't exist".)
The LoadString function is rather limiting in a few ways:
Let's write some functions that remove these limitations.
LPCWSTR FindStringResourceEx(HINSTANCE hinst, UINT uId, UINT langId) { // Convert the string ID into a bundle number LPCWSTR pwsz = NULL; HRSRC hrsrc = FindResourceEx(hinst, RT_STRING, MAKEINTRESOURCE(uId / 16 + 1), langId); if (hrsrc) { HGLOBAL hglob = LoadResource(hinst, hrsrc); if (hglob) { pwsz = reinterpret_cast<LPCWSTR> (LockResource(hglob)); if (pwsz) { // okay now walk the string table for (int i = 0; i < uId & 15; i++) { pwsz += 1 + (UINT)*pwsz; } UnlockResource(pwsz); } FreeResource(hglob); } } return pwsz; }
After converting the string ID into a bundle number, we find the bundle, load it, and lock it. (That's an awful lot of paperwork just to access a resource. It's a throwback to the Windows 3.1 way of managing resources; more on that in a future entry.)
We then walk through the table skipping over the desired number of strings until we find the one we want. The first WCHAR in each string entry is the length of the string, so adding 1 skips over the count and adding the count skips over the string.
When we finish walking, pwsz is left pointing to the counted string.
With this basic function we can create fancier functions.
The function FindStringResource is a simple wrapper that searches for the string in the default thread language.
LPCWSTR FindStringResource(HINSTANCE hinst, UINT uId) { return FindStringResourceEx(hinst, uId, MAKELANGID(LANG_NEUTRAL, SUBLANG_NEUTRAL)); }
The function GetResourceStringLengthEx returns the length of the corresponding string, including the null terminator.
UINT GetStringResourceLengthEx(HINSTANCE hinst, UINT uId, UINT langId) { LPCWSTR pwsz = FindStringResourceEx (hinst, uId, langId); return 1 + (pwsz ? *pwsz : 0); }
And the function AllocStringFromResourceEx loads the entire string resource into a heap-allocated memory block.
LPWSTR AllocStringFromResourceEx(HINSTANCE hinst, UINT uId, UINT langId) { LPCWSTR pwszRes = FindStringResourceEx (hinst, uId, langId); if (!pwszRes) pwszRes = L""; LPWSTR pwsz = new WCHAR[(UINT)*pwszRes+1]; if (pwsz) { pwsz[(UINT)*pwszRes] = L'\0'; CopyMemory(pwsz, pwszRes+1, *pwszRes * sizeof(WCHAR)); } return pwsz; }
(Writing the non-Ex functions GetStringResourceLength and AllocStringFromResource is left as an exercise.)
Note that we must explicitly null-terminate the string since the string in the resource is not null-terminated. Note also that the string returned by AllocStringFromResourceEx must be freed with delete[]. For example:
LPWSTR pwsz = AllocStringFromResource(hinst, uId); if (pwsz) { ... use pwsz ... delete[] pwsz; }
Mismatching vector "new[]" and scalar "delete" is an error I'll talk about in a future entry.
One attack vector he neglects to mention is integer overflow in the new[] operator. This operator performs an implicit multiplication that is unchecked:
int *allocate_integers(int howmany) { return new int[howmany]; }
If you study the code generation for this, it comes out to
mov eax, [esp+4] ; eax = howmany shl eax, 2 ; eax = howmany * sizeof(int) push eax call operator new ; allocate that many bytes pop ecx retd 4
Notice that the multiplication by sizeof(int) is not checked for overflow. Somebody can trick you into under-allocating memory by passing a value like howmany = 0x40000001. For larger structures, multiplication overflow happens sooner.
Let's look at a slightly longer example:
class MyClass { public: MyClass(); // constructor int stuff[256]; }; MyClass *allocate_myclass(int howmany) { return new MyClass[howmany]; }
This class also contains a constructor, so allocating an array of them involves two steps: allocate the memory, then construct each object. The allocate_myclass function compiles to this:
mov eax, [esp+4] ; howmany shl eax, 10 ; howmany * sizeof(MyClass) push esi push eax call operator new ; allocate that many bytes mov esi, eax test esi, esi pop ecx je fail push OFFSET MyClass::MyClass push [esp+12] ; howmany push 1024 ; sizeof(MyClass) push esi ; memory block call `vector constructor iterator` mov eax, esi jmp loop fail: xor eax, eax done: pop esi retd 4
This function does an unchecked multiplication of the size, then tries to allocate that many bytes, then tells the vector constructor iterator to call the constructor (MyClass::MyClass) that many times.
If somebody tricks you into calling allocate_myclass(0x200001), the multiplication overflows and only 1024 bytes are allocated. This allocation succeeds, and then the vector constructor tries to initialize 0x200001 of those items, even though in reality only one of them got allocated. So you walk off the end of the memory block and start corrupting memory.
That's a bad thing.
To protect against this, you can wrap an integer overflow check around the array allocation.
template<typename T> T* NewArray(size_t n) { if (n <= (size_t)-1 / sizeof(T)) return new T[n]; // n is too large - act as if we // ran out of memory return NULL; }
Note: If you use a throwing "new", then replace the "return NULL" with an appropriate throw.
You can now use this template to allocate arrays in an overflow-safe manner.
MyClass *allocate_myclass(int howmany) { return NewArray<MyClass>(howmany); }
This generates the following code:
push edi mov edi, [esp+8] ; howmany cmp edi, 4194303 ; overflow? ja overflow mov eax, edi shl eax, 10 push esi push eax call operator new mov esi, eax test esi, esi pop ecx je failed push OFFSET MyClass::MyClass push edi push 1024 push esi call call `vector constructor iterator` mov eax, esi jmp done failed: xor eax, eax done: pop esi jmp exit overflow: xor eax, eax exit: pop edi retd 4
Notice the new code that checks for a possible integer multiplication overflow.
But how could you get tricked into an overflow situation?
The most common way of doing this is by reading the value out of a file or some other storage location. For example, if your code is parsing a file that has a section whose format is "length followed by data", somebody could intentionally put an overflow-inducing value into the "length" field, then get somebody else to try to load the file.
This is particularly dangerous if the filetype is something that is generally considered "not dangerous", like a JPG.
Even though Ikea was founded by a Swede, its company colors match the Swedish national colors, all its product names are Swedish, and it is clearly associated with Sweden in the minds of everyone, it is in fact headquartered in Denmark. Probably for tax reasons.
The name of the plastic stool Förby has provided me much entertainment. At first, it was amusingly similar to the name of the annoying toy Furby. And it turns out it's also amusingly similar to the Swedish word "förbi" which means "gone past". (German "vorbei".)
During an evening of sore hands trying to drive screws into wood using a tiny little hex wrench, some friends and I came up with new Ikea product names. Here are a few:
And, of course, the complimentary twine for tying your packages to the top of your car:
Some of these might even be better than the actual name they came up with for a children's bed.