Holy cow, I wrote a book!
It is common to have a critical section which protects against concurrent writes to a variable or collection of variables. But if you protect a write with a critical section, you may also want to protect the read, because the read might race against the write.
Adam Rosenfield shared his experience with this issue in a comment from a few years back. I'll reproduce the example here in part to save you the trouble of clicking, but also to make this entry look longer and consequently make it seem like I'm actually doing some work (when in fact Adam did nearly all of the work):
class X { volatile int mState; CRITICAL_SECTION mCrit; HANDLE mEvent; }; X::Wait() { while(mState != kDone) { WaitForSingleObject(mEvent, INFINITE); } } X::~X() { DestroyCriticalSection(&mCrit); } X::SetState(int state) { EnterCriticalSection(&mCrit); // do some state logic mState = state; SetEvent(mEvent); LeaveCriticalSection(&mCrit); } Thread1() { X x; ... spawn off thread 2 ... x.Wait(); } Thread2(X* px) { ... px->SetState(kDone); ... }
There is a race condition here:
X::Wait
X::SetState
SetEvent
WaitForSingleObject
mState == kDone
X
The fix was to enclose the read of mState inside a critical section:
mState
X::Wait() { while(1) { EnterCriticalSection(&mCrit); int state = mState; LeaveCriticalSection(&mCrit); if(state == kDone) break; WaitForSingleObject(mEvent, INFINITE); } }
Forgetting to enclose the read inside a critical section is a common oversight. I've made it myself more than once. You say to yourself, "I don't need a critical section here. I'm just reading a value which can safely be read atomically." But you forget that the critical section isn't just for protecting the write to the variable; it's also to protect all the other actions that take place under the critical section.
And just to make it so I actually did some work today, I leave you with this puzzle based on an actual customer problem:
class BufferPool { public: BufferPool() { ... } ~BufferPool() { ... } Buffer *GetBuffer() { Buffer *pBuffer = FindFreeBuffer(); if (pBuffer) { pBuffer->mIsFree = false; } return pBuffer; } void ReturnBuffer(Buffer *pBuffer) { pBuffer->mIsFree = true; } private: Buffer *FindFreeBuffer() { EnterCriticalSection(&mCrit); Buffer *pBuffer = NULL; for (int i = 0; i < 8; i++) { if (mBuffers[i].mIsFree) { pBuffer = &mBuffers[i]; break; } } LeaveCriticalSection(&mCrit); return pBuffer; } private: CRITICAL_SECTION mCrit; Buffer mBuffers[8]; };
The real class was significantly more complicated than this, but I've distilled the problem to its essence.
The customer added, "I tried declaring mIsFree as a volatile variable, but that didn't seem to help."
mIsFree
In a technical discussion, I opened a reply with
Bob's paper which I haven't yet read points out that...
Some people wrote to me to say that the've added this quote to their file in the hopes of being able to use it themselves someday.
For those who are curious, I found the technical discussion in question. It had to do with whether the following code is thread-safe:
// initial conditions int x = 1, y = 0; int *p = &x; // Thread 1 executes this y = 1; MemoryBarrier(); p = &y; // Thread 2 executes this print *p;
Question: Can this code legitimately print 0?
0
Surprisingly, the answer is yes!
Commenter Ivo wants to know if the RegisterClipboardFormat function returns a UINT, why is the CLIPFORMAT data type defined to be a WORD? Since a WORD is smaller than a UINT, you have to stick in a cast every time you assign the result of RegisterClipboardFormat to a CLIPFORMAT.
RegisterClipboardFormat
UINT
CLIPFORMAT
WORD
Rewind to 16-bit Windows. Back in those days, a UINT and a WORD were the same size, namely, 16 bits. As a result, people got lazy about the distinction. Six of one, a half dozen of the other. (People are lazy about this sort of distinction even today, assuming for example that UINT and DWORD are the same size, and in turn forcing UINT to remain a 32-bit integer type even on 64-bit Windows.) The RegisterClipboardFormat function came first, and when the OLE folks wanted to define a friendly name for the data type to hold a clipboard format, they said, "Well, a clipboard format is a 16-bit integer, so let me use a 16-bit integer." A WORD is a 16-bit integer, so there you go.
DWORD
This mismatch had no effect in 16-bit code, but once Win32 showed up, you had a problem since 32-bit Windows expanded the UINT type to 32 bits. Not only does keeping a CLIPFORMAT in a WORD create the need for all this casting, it also leaves two bytes of padding in the FORMATETC structure. Strike two.
FORMATETC
Yeah, basically, it sucks.
Just a quick tip today.
If you need to insert a large number of items into a treeview, like tens of thousands, then it's much more efficient to insert them "backwards". (I'm ignoring for now the usability question of having a treeview that large in the first place.) In other words, instead of
for (i = 0; i < array.Count(); i++) { TVINSERTSTRUCT tvis; tvis.hParent = hParentNode; tvis.hInsertAfter = TVIF_LAST; tvis.item.mask = TVIF_TEXT; item.item.pszText = array[i].Text(); TreeView_InsertItem(hwndTreeView, &tvis); }
for (i = array.Count() - 1; i >= 0; i--) { TVINSERTSTRUCT tvis; tvis.hParent = hParentNode; tvis.hInsertAfter = TVIF_FIRST; tvis.item.mask = TVIF_TEXT; item.item.pszText = array[i].Text(); TreeView_InsertItem(hwndTreeView, &tvis); }
Why is backwards-insertion faster?
It has to do with the annoying hInsertAfter parameter. To validate that the hInsertAfter parameter is valid, the treeview needs to verify that the hInsertAfter is a valid child of the hParent, and this is done by walking the parent's children looking for a match. The sooner you find the match, the faster the validation completes. (And if you pass TVI_LAST, then the treeview needs to walk to the end of the child list.)
hInsertAfter
hParent
TVI_LAST
You'd think that you could verify the parent/child relationship by just doing a TreeView_GetParent(hInsertAfter), but that turns out not to be strict enough, because hInsertAfter might itself not be a valid parameter. If hInsertAfter is a bogus value, then you may crash when you try to read its Parent property. That's if you're lucky. If you're not lucky, then the random memory that hInsertAfter points to might look just enough like a valid HTREEITEM that you end up inserting the new node after a completely bogus node, and now the treeview has become corrupted.
TreeView_GetParent(hInsertAfter)
HTREEITEM
Sure, you got the same problem if you passed a garbage HTREEITEM to TreeView_GetParent, but in that case, it's just garbage in garbage out. Nothing gets corrupted; the application just gets a garbage result. But in the case of TreeView_InsertItem, the treeview is going to update its internal data structures based on the garbage you passed in, and that means that the treeview winds up corrupted.
TreeView_GetParent
TreeView_InsertItem
To ensure that the value passed in is valid, the treeview checks it against the list of valid values for hInsertAfter. And therefore, you get better performance if the valid value you pass is the one that it checks first.
(As it happens, a lot of programs pass garbage for hInsertAfter, so this defensive validation step is absolutely necessary.)
You might say that the treeview could have a one-off optimization for the special TVI_LAST value by remembering the last child so it can be located quickly. The question is whether the complexity of adding that optimization is worth the cost: Any tree rearranging function would have to update the cached value, and if you miss a spot, then the treeview ends up corrupted. That's a pretty high-risk optimization you're suggesting there.
And think about it this way: You're worrying about a tree where a single node has tens of thousands of children, something which (I am no longer ignoring) is a horrible user interface. That's like a list box with tens of thousands of items, or a dialog box with tens of thousands of checkboxes. You'll probably want to consider a better way of presenting the information than in a tree view that goes on for hundreds of screens. The treeview isn't optimized for this case because you don't optimize for the case where somebody is mis-using your system.
A customer observed that starting in Windows Vista, the deadline for responding to the PBT_APMSUSPEND message was reduced from twenty seconds to two seconds. Their program needs more than two seconds to prepare for a suspend and they wanted to know how they could extend the deadline. The program communicates with a device, and if they don't properly prepare the device for suspend, it has problems when the system resumes.
PBT_APMSUSPEND
No, you cannot extend the deadline for responding to the PBT_APMSUSPEND message. The two second deadline is hard-coded; it is not configurable.
The whole point of reducing the deadline from twenty to two seconds is to ensure that when users press the Sleep button on their computers, the computer actually goes to sleep reasonably promptly. If there were a way to extend the deadline, then you're just reintroducing the bad situation in Windows XP where the user hits the Sleep button on the laptop, but the laptop just sits there taunting you. Meanwhile, the flight attendant is standing there getting angrier at you for not putting your laptop away. (Or worse: You tell the laptop to Sleep and toss it into your bag, and then when you reach your destination, you discover that your laptop is really warm, and the battery is dead.)
Besides, imagine if ten apps did this. Your app asks for ten extra seconds, another app asks for another twenty seconds, next thing you know all the deadline extensions add up to five minutes.
The real solution is to fix the driver for this device so it can prepare the device properly when it is told that the machine is about to go into a low power state. (The two-second limit applies to applications but not drivers. At least not yet.)
A customer admitted that they had a bug in their code:
#define UNICODE #define _UNICODE #include <windows.h> // error checking removed for expository purposes // code that writes out the data RegSetValueEx(hkey, pszValue, 0, REG_SZ, (const BYTE *)pszData, _tcslen(pszData) * sizeof(TCHAR) + 1); // code that reads the data DWORD dwType, cbData; RegQueryValueEx(hkey, pszValue, NULL, &dwType, NULL, &cbData); TCHAR *pszData = new TCHAR[cbData / sizeof(TCHAR)]; RegQueryValueEx(hkey, pszValue, NULL, &dwType, pszData, &cbData);
One bug in the above code is in the final parameter passed to RegSetValueEx: It's supposed to be the count in bytes, but the calculation appends only one byte for the terminating null instead of a full TCHAR. In other words, it should be
RegSetValueEx
TCHAR
RegSetValueEx(hkey, pszValue, 0, REG_SZ, (const BYTE *)pszData, _tcslen(pszData) * sizeof(TCHAR) + sizeof(TCHAR));
For concreteness, let's say the original string was five TCHARs in length, not counting the terminating null. Therefore, the correct buffer size is 12 bytes, but they passed only 11 to RegSetValueEx.
This error is compounded in the code that reads the value back: The code happily divides cbData / sizeof(TCHAR) without checking that the division is even. In our example, the call returns a length of 11 bytes. They divide by sizeof(TCHAR) (which is 2, since the code is compiled as Unicode), leaving 5 (remainder discarded), causing them to allocate a 5-TCHAR buffer.
cbData / sizeof(TCHAR)
sizeof(TCHAR)
That error would have been okay by itself except for another error, which is calling RegQueryValueEx a second time with an invalid buffer size: The cbData variable remains the original value of 11 even though they allocated only 10 bytes. The subsequent RegQueryValueEx call reads 11 bytes into a 10-byte buffer.
RegQueryValueEx
cbData
The customer conceded that the code that writes the value is buggy, but points out that the code "worked" on Windows XP, in the sense that the string read back from the registry was correct. But Windows Vista "broke" their program, because the string read back now contained garbage at the end. Instead of returning "Hello", it returned "HelloЀ╅۞". The customer wanted to know what change to Windows Vista broke their program.
"Hello"
"HelloЀ╅۞"
The change to Windows Vista that broke their program is known as "luck running out." The program contained three bugs, which combined to form a heap buffer write overflow. The uninitialized garbage at the end of the heap block they allocated happened to be zero on Windows XP due to a coincidence in the way their program allocated and freed memory. Consequently, when the data was read from the registry, the "string" ended in a single null byte instead of two. The extra null byte that "happened to be there already" combined with the single null byte read from the registry to form a proper null terminator.
When run on Windows Vista, that happy coincidence no longer took place, and the uninitialized garbage was nonzero, resulting in the subsequent attempt to use the string to read past the end of the buffer and pick up heap garbage. (Yay, bug number four: read overflow.) Why was the uninitialized garbage different?
It's different because there was nothing forcing it to be the same. The internals of the heap manager change all the time. (Look-aside lists, low fragmentation heap, and fault-tolerant heap are just a few recent examples.) Any of these changes will result in heap memory being used and reused differently. Plus, changes in other parts of Windows may have allocated and freed memory differently between Windows XP and Windows Vista. Heck, the program itself may have allocated and freed memory differently due to the change in operating system. (For one thing, the length of the string "Windows Vista" is different from the length of the string "Windows XP".)
"Windows Vista"
"Windows XP"
Uninitialized garbage will contain unpredictable values. There's no point asking why you got a different unpredictable value this time.
Here's a little trick. Open your Start menu and type scrensaver into the search box. That's right, spell it with only one e.
Hey, it still found the Control Panel options for managing your screen saver.
If you enable Improve my search results by using online Help in Windows Help and Support, this sends your search query to a back-end server to see if there's updated online help content related to your search. And the people who develop the online help content look over those queries to see if, for example, there is a category of issues people are searching for help with and not finding anything.
It also means that they get to see a lot of misspellings. And that information is useful, because it means that the Control Panel search provider can be given a table of the most popular misspellings and how to fix them. which in turn benefits people who unchecked the Improve my search results by using online Help (say because they don't have Internet access or because they are afraid of the black helicopters).
So now, when you search for scrensaver, you are directed to information on screen savers, thanks to the millions of other people who spelled it wrong before you.
Obligatory link: The so-called God mode.
Commenter Vilx- wondered why animated GIFs weren't used as the animation format for the shell animation common control. After all, "they are even more lightweight than AVIs."
Animated GIFs are certainly more lightweight than general AVIs, since AVI is just a container format, so decoding a general AVI means decoding any encoding format invented now or in the future. On the other hand, the shell common control imposed enough limits on the type of AVIs it could handle to the point where what was left was extremely lightweight, certainly much more lightweight than an animated GIF.
Think about it: To use an animated GIF, you need a GIF decoder. And a GIF decoder is already significantly larger (both in terms of code and memory) than the RLE-8 decoder. Also significantly more complicated, and therefore significantly more likely to have bugs. Whereas RLE-8 is so simple there isn't much that can go wrong, and the RLE-8 decoder had been around since Windows 3.0, so it was already a known quantity. All you have to do to invoke the RLE-8 decoder is call SetDIBitsToDevice. One line of code is hard to beat.
SetDIBitsToDevice
Windows 95 did not come with a GIF decoder. Remember, Internet Explorer 1.0 did not come with Windows 95; it was part of the Plus! pack. As I recall, at the time Windows 95 released to manufacturing, the Plus! pack was still under development. (And at the time the animation common control was being designed, Internet Explorer didn't exist. Heck, Mosaic didn't exist!) Plus the fact that the common controls were available in both 16-bit and 32-bit versions—in fact it was the 16-bit versions that were written first since Windows 95 didn't have good Win32 support at the start of the project. More accurately, Windows 95 didn't have any Win32 support at the start of the project.
So I'm kind of amused by the description of GIF as a lightweight animation encoding algorithm. Compared to RLE, the GIF format weighs a ton!
A customer asked a question, but as is often the case, the question was much more telling than the answer.
We have an Internet Explorer plug-in which calls ExitProcess to force Internet Explorer to exit. We found that when we do this, our plug-in does not receive a DLL_PROCESS_DETACH notification. What could be preventing our plug-in from receiving the DLL_PROCESS_DETACH notification?
ExitProcess
DLL_PROCESS_DETACH
As we saw some time ago when we looked at the way processes shut down (plus an important follow-up or two), all a process has to do to thwart proper delivery of DLL_PROCESS_DETACH notifications is to do something untoward during shutdown, at which point the kernel just gives up and calls TerminateProcess.
TerminateProcess
But like I said, the answer is much less interesting than the question. What if the user had an unsaved email message at the time you decided to exit Internet Explorer? Recall that plug-ins are a guest in the host process; don't go changing the carpet. When we asked the customer why they were exiting Internet Explorer from their plug-in, we received the explanation, "The reason I am calling ExitProcess is that I do not know another good way to exit Internet Explorer from a plug-in."
In this case, the guest is doing far more than just changing the carpet. The guest called in a demolition company!
"Why did you call the demolition company to destroy my house?" "I couldn't think of a good way to destroy your house."
The point isn't that it's bad to use a telephone call to hire a demolition company to destroy somebody's house and that you should use some other method to contact them (like, say, a text message). The point is that it's bad to destroy somebody else's house in the first place.
Upon further investigation, the customer was writing a test for their plug-in. They open Internet Explorer and navigate to a page that uses the plug-in. When they are satisfied that the plug-in operated correctly, they want to exit the copy of Internet Explorer in order to conclude the test.
If you want to destroy a house, then destroy your own house. Call CoCreateInstance(CLSID_InternetExplorer) to build a house, navigate to your test page with IWebBrowser2::Navigate, and when you're done, you can destroy the house with IWebBrowser2::Quit(). There is sample code to do exactly this in the documentation for the IWebBrowser2 interface.
CoCreateInstance(CLSID_InternetExplorer)
IWebBrowser2::Navigate
IWebBrowser2::Quit()
IWebBrowser2
Bonus chatter: The IWebBrowser2 interface is scriptable.
var ie = new ActiveXObject("InternetExplorer.Application"); ie.Visible = true; ie.Navigate("http://www.microsoft.com/"); WScript.Sleep(5000); // five seconds, say ie.Quit();
A customer asked for help installing a particular DLL. They ran the command regsvr /i SomeDll.dll but got the error "SomeDll.dll was loaded, but the DllInstall entry point was not found. This file can not be registered."
regsvr /i SomeDll.dll
A DLL needs to be specifically written to be used with the regsvr32 command. You can't just grab some random DLL and expect regsvr32 to work. As we saw last week, the regsvr32 program merely loads the specified DLL and calls an entry point established by convention. If the DLL was not written to be used with regsvr32 then the conventional entry point will not be found, and you get an error message.
regsvr32
The /i switch to regsvr32 instructs the program to look for the entry point known as DllInstall. By convention, the DllInstall function performs installation and setup of a DLL, but since it's just some function exported by a DLL, it could do anything it wants (or nothing at all).
/i
DllInstall
You can't just grab a random DLL and expect regsvr32 to do anything meaningful with it. The DLL has to be designed to operate with regsvr32.
Handing random DLLs to regsvr32 is like dialing a random telephone number, sending a tone at 1170Hz and getting upset when you don't get a 2125Hz tone in response.
The number one hit for a search on what does regsvr32 do is an old Knowledge Base article which explains what regsvr32 does, and it even contains a sample program which emulates regsvr32 so you can use it to debug your DLL. (The sample program hasn't been updated to support the /i flag, which I leave as an exercise.)
One day, I received a piece of email from another employee whom I had never met nor had ever heard of. It didn't even begin with an introduction; it just jumped right in as if we'd been friends for years. "I'm trying to debug a problem where regsvr32 cannot register my DLL. It gives the error 'The specified procedure could not be found.' I saw a blog entry written by you and am trying to understand what our problem is."
This blog thing has backfired. The reasons I write these articles is to get people to stop asking me questions. (The mechanism for that being to give the answer out in public for everyone to see.) Instead, it turns into "Hi, I found an article you wrote about X, which ipso facto makes you not only the world's foremost authority on X, but also the world's leading support technician on X."
News flash: Posting a blog entry about something on the Internet should not be taken as evidence that the author is an expert on that subject. (One might argue that it in fact is more likely to be the opposite.)
At the time, I wasn't aware of the knowledge base article that explains what regsvr32 does and how to debug it, so I couldn't point to it. I wrote back, "All regsvr32 does is LoadLibrary, GetProcAddress, and then calls the function. You can write your own test that does the same thing. You do not require any expertise from me."
LoadLibrary
GetProcAddress
Less than an hour later, I received a reply: "Thanks. I figured it out. There was an older version of the DLL in the path ahead of the one I was trying to register."
And I never did figure out which blog entry I wrote that made them think I was an expert on regsvr32. Maybe the person worked in Microsoft Research and used a prototype of their machine that predicts the future, and used it to predict that I was going to write about regsvr32 two years later.