April, 2013

  • The Old New Thing

    Technically not lying, but not exactly admitting fault either


    I observed a spill suspiciously close to a three-year-old's play table. I asked, "How did the floor get wet?"

    She replied, "Water."

    It's not lying, but it's definitely not telling the whole story. She'll probably grow up to become a lawyer.

  • The Old New Thing

    Why does CoCreateInstance work even though my thread never called CoInitialize? The curse of the implicit MTA


    While developing tests, a developer observed erratic behavior with respect to Co­Create­Instance:

    In my test, I call Co­Create­Instance and it fails with CO_E_NOT­INITIALIZED. Fair enough, because my test forgot to call Co­Initialize.

    But then I went and checked the production code: In response to a client request, the production code creates a brand new thread to service the request. The brand new thread does not call Co­Initialize, yet its call to Co­Create­Instance succeeds. How is that possible? I would expect the production code to also get a CO_E_NOT­INITIALIZED error.

    I was able to debug this psychically, but only because I knew about the implicit MTA.

    The implicit MTA is not something I can find very much documentation on, except in the documentation for the APT­TYPE­QUALIFIER enumeration, where it mentions:

    [The APT­TYPE­QUALIFIER_IMPLICIT_MTA] qualifier is only valid when the pAptType parameter of the Co­Get­Apartment­Type function specifies APT­TYPE_MTA on return. A thread has an implicit MTA apartment type if it does not initialize the COM apartment itself, and if another thread has already initialized the MTA in the process. This qualifier informs the API caller that the MTA of the thread is implicitly inherited from other threads and is not initialized directly.

    Did you get that? If any thread in the process calls Co­Initialize­[Ex] with the COINIT_MULTI­THREADED flag, then that not only initializes the current thread as a member of the multi-threaded apartment, but it also says, "Any thread which has never called Co­Initialize­[Ex] is also part of the multi-threaded apartment."

    Further investigation revealed that yes, some other thread in the process called Co­Initialize­Ex(0, COINIT_MULTI­THREADED), which means that the thread which forgot to call Co­Initialize was implicitly (and probably unwittingly) placed in the MTA.

    The danger of this implicit MTA, of course, is that since you didn't know you were getting it, you also don't know if you're going to lose it. If that other thread which called Co­Initialize­Ex(0, COINIT_MULTI­THREADED) finally gets around to calling Co­Un­initialize, then it will tear down the MTA, and your thread will have the MTA rug ripped out from under it.

    Moral of the story: If you want the MTA, make sure you ask for it explicitly. And if you forget, you may end up in the implicit MTA, whether you wanted it or not. (Therefore, conversely, if you don't want the MTA, make sure to deny it explicitly!)
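
    To make the rule concrete, here is a toy model of the bookkeeping described above. It is a sketch and not how COM is implemented: ProcessModel, Apartment, and the string thread names are all made up for illustration, and the model ignores nested initialization and everything else real COM does.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model only. All names here are hypothetical; this is not COM's code.
enum class Apartment { NotInitialized, STA, MTA, ImplicitMTA };

class ProcessModel {
public:
  // Models CoInitializeEx(nullptr, COINIT_MULTITHREADED) on the named thread.
  void InitMTA(const std::string& thread) {
    if (explicit_.count(thread)) return; // model ignores nested initialization
    explicit_[thread] = Apartment::MTA;
    ++mtaCount_;
  }

  // Models CoInitialize(nullptr) on the named thread.
  void InitSTA(const std::string& thread) {
    if (explicit_.count(thread)) return;
    explicit_[thread] = Apartment::STA;
  }

  // Models CoUninitialize on the named thread.
  void Uninit(const std::string& thread) {
    auto it = explicit_.find(thread);
    if (it == explicit_.end()) return;
    if (it->second == Apartment::MTA) {
      assert(mtaCount_ > 0);
      --mtaCount_; // when this reaches zero, the MTA is torn down
    }
    explicit_.erase(it);
  }

  // Models what CoGetApartmentType would report for the named thread.
  Apartment Query(const std::string& thread) const {
    auto it = explicit_.find(thread);
    if (it != explicit_.end()) return it->second;
    // Never initialized: in the MTA only while somebody else keeps it alive.
    return mtaCount_ > 0 ? Apartment::ImplicitMTA : Apartment::NotInitialized;
  }

private:
  std::map<std::string, Apartment> explicit_;
  int mtaCount_ = 0;
};
```

    In this model, a thread that never initializes reports ImplicitMTA as soon as some other thread calls InitMTA, and drops back to NotInitialized the moment that other thread calls Uninit. That second transition is the rug being ripped out.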

    Exercise: Use your psychic debugging skills to diagnose the following problem. "When my code calls Get­Open­File­Name, it behaves erratically. I saw a Knowledge Base article that says that this can happen if I initialize my thread in the multi-threaded apartment, but my thread does not do that."

  • The Old New Thing

    How can I figure out which user modified a file?


    The Get­File­Time function will tell you when a file was last modified, but it won't tell you who did it. Neither will Find­First­File, Get­File­Attributes, Read­Directory­ChangesW, or File­System­Watcher.

    None of these file system functions will tell you which user modified a file because the file system doesn't keep track of which user modified a file. But there is somebody who does keep track: The security event log.

    To generate an event into the security event log when a file is modified, you first need to enable auditing on the system. In the Local Security Policy administrative tool, go to Local Policies, and then double-click Audit Policy. (These steps haven't changed since Windows 2000; the only thing is that the Administrative Tools folder moves around a bit.) Under Audit Object Access, say that you want an audit raised when access is successfully granted by checking Success (An audited security access attempt that succeeds).

    Once auditing is enabled, you can then mark the files that you want to track modifications to. On the Security tab of each file you are interested in, go to the Auditing page, and select Add to add the user you want to audit. If you want to audit all accesses, then you can choose Everyone; if you are only interested in auditing a specific user or users in specific groups, you can enter the user or group.

    After specifying whose access you want to monitor, you can select what actions should generate security events. In this case, you want to check the Successful box next to Create files / write data. This means "Generate a security event when the user requests and obtains permission to create a file (if this object is a directory) or write data (if this object is a file)."

    If you want to monitor an entire directory, you can set the audit on the directory itself and specify that the audit should apply to objects within the directory as well.

    After you've set up your audits, you can view the results in Event Viewer.

    This technique of using auditing to track who is generating modifications also works for registry keys: In Registry Editor, select the key, and then under the Edit menu, select Permissions.

    Exercise: You're trying to debug a problem where a file gets deleted mysteriously, and you're not sure which program is doing it. How can you use this technique to log an event when that specific file gets deleted?

  • The Old New Thing

    If you don't know what you're going to do with the answer to a question, then there's not much point in making others work hard to answer it


    A customer asked the following question:

    We've found that on Windows XP, when we call the XYZ function with the Awesome flag, the function fails for no apparent reason. However, it works correctly on Windows 7. Do you have any ideas about this?

    So far, the customer has described what they have observed, but they haven't actually asked a question. It's just nostalgia, and nostalgia is not a question. (I'm rejecting "Do you have any ideas about this?" as a question because it is too vague to be a meaningful question.)

    Please be more specific about your question. Do you want to obtain Windows 7-style behavior on Windows XP? Do you want to obtain Windows XP-style behavior on Windows 7? Do you merely want to understand why the two behave differently?

    The customer replied,

    Why do they behave differently? Was it a new design for Windows 7? If so, how do the two implementations differ?

    I fired up a handy copy of Windows XP in a virtual machine and started stepping through the code, and then I realized I was about to do a few hours' worth of investigation for no clear benefit. So I stopped and responded to their question with my own question.

    Why do you want to know the reason for the change in behavior? How will the answer affect what you do next? Consider the following three answers:

    1. "The behavior was redesigned in Windows 7."
    2. "The Windows XP behavior was a bug that was fixed in Windows 7."
    3. "The behavior change was a side-effect of a Windows Update hotfix."

    What will you do differently if the answer is (1) rather than (2) or (3)?

    The customer never responded. That saved me a few hours of my life.

    If you don't know what you're going to do with the answer to a question, then there's not much point in others working hard to answer it. You're just creating work for others for no reason.

  • The Old New Thing

    Dangerous setting is dangerous: This is why you shouldn't turn off write cache buffer flushing


    Okay, one more time about the Write-caching policy setting.

    This dialog box takes various forms depending on what version of Windows you are using.

    Windows XP:

      Enable write caching on the disk
    This setting enables write caching in Windows to improve disk performance, but a power outage or equipment failure might result in data loss or corruption.

    Windows Server 2003:

      Enable write caching on the disk
    Recommended only for disks with a backup power supply. This setting further improves disk performance, but it also increases the risk of data loss if the disk loses power.

    Windows Vista:

      Enable advanced performance
    Recommended only for disks with a backup power supply. This setting further improves disk performance, but it also increases the risk of data loss if the disk loses power.

    Windows 7 and 8:

      Turn off Windows write-cache buffer flushing on the device
    To prevent data loss, do not select this check box unless the device has a separate power supply that allows the device to flush its buffer in case of power failure.

    Notice that the warning text gets more and more scary each time it is updated. It starts out just by saying, "If you lose power, you might have data loss or corruption." Then it adds a recommendation, "Recommended only for disks with a backup power supply." And then it comes with a flat-out directive: "Do not select this check box unless the device has a separate power supply."

    The scary warning is there for a reason: If you check the box when your hardware does not satisfy the criteria, you risk data corruption.

    But it seems that even with the sternest warning available, people will still go in and check the box even though their device does not satisfy the criteria, and the dialog box says right there do not select this check box.

    And then they complain, "I checked this box, and my hard drive was corrupted! You need to investigate the issue and release a fix for it."

    Dangerous setting is dangerous.

    At this point, I think the only valid "fix" for this feature would be to remove it entirely. This is why we can't have dangerous things.

  • The Old New Thing

    Your tenant and your lover, in your dreams


    I dreamed that a friend of mine said, "Between your tenant and your lover, you should get along with at least one of them."

  • The Old New Thing

    Using opportunistic locks to get out of the way if somebody wants the file


    Opportunistic locks allow you to be notified when somebody else tries to access a file you have open. This is usually done if you want to use a file provided nobody else wants it.

    For example, you might be a search indexer that wants to extract information from a file, but if somebody opens the file for writing, you don't want them to get Sharing Violation. Instead, you want to stop indexing the file and let the other person get their write access.

    Or you might be a file viewer application like ildasm, and you want to let the user update the file (in ildasm's case, rebuild the assembly) even though you're viewing it. (Otherwise, they will get an error from the compiler saying "Cannot open file for output.")

    Or you might be Explorer, and you want to abandon generating the preview for a file if somebody tries to delete it.

    (Rats, I fell into the trap of trying to motivate a Little Program.)

    Okay, enough motivation. Here's the program:

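    #include <windows.h>
    #include <winioctl.h>
    #include <stdio.h>

    OVERLAPPED g_o;

    // Ask for a read + handle caching oplock.
    REQUEST_OPLOCK_INPUT_BUFFER g_inputBuffer = {
      REQUEST_OPLOCK_CURRENT_VERSION,
      sizeof(g_inputBuffer),
      OPLOCK_LEVEL_CACHE_READ | OPLOCK_LEVEL_CACHE_HANDLE,
      REQUEST_OPLOCK_INPUT_FLAG_REQUEST,
    };

    REQUEST_OPLOCK_OUTPUT_BUFFER g_outputBuffer = {
      REQUEST_OPLOCK_CURRENT_VERSION,
      sizeof(g_outputBuffer),
    };

    int __cdecl wmain(int argc, wchar_t **argv)
    {
      g_o.hEvent = CreateEvent(nullptr, FALSE, FALSE, nullptr);
      HANDLE hFile = CreateFileW(argv[1], GENERIC_READ,
        FILE_SHARE_READ, nullptr, OPEN_EXISTING,
        FILE_FLAG_OVERLAPPED, nullptr);
      if (hFile == INVALID_HANDLE_VALUE) {
        return 0;
      }
      DeviceIoControl(hFile, FSCTL_REQUEST_OPLOCK,
          &g_inputBuffer, sizeof(g_inputBuffer),
          &g_outputBuffer, sizeof(g_outputBuffer),
          nullptr, &g_o);
      if (GetLastError() != ERROR_IO_PENDING) {
        // oplock failed
        return 0;
      }
      DWORD dwBytes;
      if (!GetOverlappedResult(hFile, &g_o, &dwBytes, TRUE)) {
        // oplock failed
        return 0;
      }
      printf("Cleaning up because somebody wants the file...\n");
      Sleep(1000); // pretend this takes some time
      printf("Closing file handle\n");
      CloseHandle(hFile);
      CloseHandle(g_o.hEvent);
      return 0;
    }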

    Run this program with the name of an existing file on the command line, say scratch x.txt. The program will wait.

    In another command window, run the command type x.txt. The program keeps waiting.

    Next, run the command echo hello > x.txt. Now things get interesting.

    When the command prompt opens x.txt for writing, the Device­Io­Control call completes. At this point we print the Cleaning up... message.

    To simulate the program taking a little while to clean up, we sleep for one second. Observe that the command prompt has not yet returned. Instead of immediately failing the request to open for writing with a sharing violation, the kernel puts the open request on hold to give our program time to clean up and close our handle.

    Finally, our simulated clean-up is complete, and we close the handle. At this point, the kernel allows the command processor to proceed and open the file for writing so it can write hello into it.

    That's the basics of opportunistic locks, but your program will almost certainly not be structured this way. You will probably not wait synchronously on the overlapped I/O but rather have the completion queued up to a completion function, an I/O completion port, or have a thread pool task listen on the event handle. When you do that, remember that you need to keep the OVERLAPPED structure as well as the REQUEST_OPLOCK_INPUT_BUFFER and REQUEST_OPLOCK_OUTPUT_BUFFER structures valid until the I/O completes.

    (You may find the Cancel­Io function handy to try to accelerate the clean-up of the file handle and any other actions that are dependent upon it.)

    You can read more about opportunistic locks on MSDN. Note that there are limitations on explicitly-managed opportunistic locks; for example, they don't work across the network.

  • The Old New Thing

    The phenomenon of houses with nobody living inside, for perhaps-unexpected reasons


    In London, some of the most expensive real estate is in neighborhoods where relatively few people actually live. According to one company's estimate, 37% of the residences have been purchased by people who merely use them as vacation homes, visiting only for a week or two per year and leaving the building empty the remainder of the year. In other words, the people who can afford to live there choose not to.

    This same phenomenon is reported in other cities. For example, only 10% of the condos in the Plaza Hotel are occupied full-time.

    Another example of a house with nobody living inside is the case where the house is a façade for an industrial building, most commonly an electrical substation or a subway ventilation shaft.

    I find both categories fascinating.

  • The Old New Thing

    Is it legal to have a cross-process parent/child or owner/owned window relationship?


    A customer liaison asked whether it was legal to use Set­Parent to create a parent/child relationship between windows which belong to different processes. "If I remember correctly, the documentation for Set­Parent used to contain a stern warning that it is not supported, but that remark does not appear to be present any more. I have a customer who is reparenting windows between processes, and their application is experiencing intermittent instability."

    Is it technically legal to have a parent/child or owner/owned relationship between windows from different processes?

    Yes, it is technically legal.

    It is also technically legal to juggle chainsaws.

    Creating a cross-thread parent/child or owner/owned window relationship implicitly attaches the input queues of the threads which those windows belong to, and this attachment is transitive: If one of those queues is attached to a third queue, then all three queues are attached to each other. More generally, queues of all windows related by a chain of parent/child or owner/owned or shared-thread relationships are attached to each other.
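
    The transitivity is easier to see in a toy model. The sketch below treats threads as small integers and uses a union-find structure to track which threads have ended up with attached input queues. Everything in it (QueueAttachmentModel, LinkWindows, SharesInputQueue) is a made-up name for illustration. It is not how the window manager tracks anything, and it does not model detaching when a relationship is destroyed.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Toy model only. Threads are the integers 0..threadCount-1 (hypothetical).
class QueueAttachmentModel {
public:
  explicit QueueAttachmentModel(int threadCount) : parent_(threadCount) {
    // Initially every thread has its own unattached input queue.
    std::iota(parent_.begin(), parent_.end(), 0);
  }

  // Models creating a parent/child or owner/owned relationship between
  // a window on threadA and a window on threadB.
  void LinkWindows(int threadA, int threadB) {
    parent_[Find(threadA)] = Find(threadB);
  }

  // True if the two threads' input queues are attached, directly or
  // through any chain of other threads.
  bool SharesInputQueue(int threadA, int threadB) {
    return Find(threadA) == Find(threadB);
  }

private:
  // Returns the representative of the group containing thread t.
  int Find(int t) {
    assert(t >= 0 && t < static_cast<int>(parent_.size()));
    while (parent_[t] != t) {
      parent_[t] = parent_[parent_[t]]; // path halving
      t = parent_[t];
    }
    return t;
  }

  std::vector<int> parent_;
};
```

    Link a window on thread 0 to one on thread 1, and then a window on thread 1 to one on thread 2, and the model reports that threads 0 and 2 share an input queue even though no window relationship connects them directly. A thread that was never linked to any of them stays on its own.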

    Exercise: What are the equivalence classes generated by taking the transitive closure of parent/child windows, and what would be a natural choice of class representative? What about the equivalence classes generated by the transitive closure of parent/child and owner/owned windows?

    This gets even more complicated when the parent/child or owner/owned relationship crosses processes, because cross-process coordination is even harder than cross-thread coordination. Sharing variables within a process is much easier than sharing variables across processes. On top of that, some window messages are blocked between processes.

    So yes, it is technically legal, but if you create a cross-process parent/child or owner/owned relationship, the consequences can be very difficult to manage. And they become near-impossible to manage if one or both of the windows involved is unaware that it is participating in a cross-process window tree. (I often see this question in the context of somebody who wants to grab a window belonging to another process and forcibly graft it into their own process. That other process was totally unprepared for its window being manipulated in this way, and things may stop working. Indeed, things will definitely stop working if you change that other window from a top-level window to a child window.)

    The existing text was probably removed when somebody pointed out that the action is technically legal (though not recommended for beginners), and whoever made the change, instead of trying to come up with new text that describes the situation, merely removed the text that was incorrect. The problem with coming up with new text that describes the situation is that it only leads to more questions from people who want to do it in spite of the warnings. (It's one of those "if you don't already know what the consequences are, then you are not smart enough to do it correctly" things. You must first become the master of the rules before you can start breaking them.)

  • The Old New Thing

    The importance of having a review panel of twelve-year-old boys, episode 2


    Microsofties love their acronyms, but you have to remember to send every potential name through a review panel of twelve-year-old boys to identify the lurking embarrassments.

    When it came time in Windows 7 to come up with the names of the various subteams, two of the proposed names were Core OS eXperience and Find and Use eXperience, using the trendy letter X to abbreviate the trendy capitalization of the word eXperience.

    The naming system was promptly reconsidered.

    One of the subteams of Windows 8 is known as User-Centered Experience. The original name of the subteam was the You-Centered Experience (because it's all about you, the user), and they somewhat inadvisedly decided to adopt the nickname YOU, believing themselves to be sooooo clever.

    What this actually did was create Abbott-and-Costello-level confusion.

    "There's a work item assigned to YOU to handle this case."

    No, I don't have any such work item.

    "No, not you. I mean the YOU team."

    Some time after the standard acronyms and abbreviations for all the teams were settled upon, one of the reporting systems used to track the progress of the project was set up to allow reports to be generated not only for specific individuals or lists of individuals, but also for organizational units or feature teams. If you wanted to generate a report for Bob and everybody who reports through him, you could enter o_bob as the target of the report instead of having to type the name of every single person who worked for Bob. And if you wanted to generate a report for everybody who works on the XYZ feature team, you could enter f_xyz.

    This meant that generating the reports for the YOU team required you to type f_you. The members of the YOU team were not pleased by this, and they prevailed upon the people who run the reporting system to change their notation. The request was granted, and the syntax for selecting an entire feature team was changed to ft_xyz instead of just f_xyz.

    I would have argued that this was a problem of the YOU team's own creation. Next time, don't pick such a confusing name for your team.

    Bonus chatter: During Windows XP development, we didn't use these fancy team acronyms. The teams were simply numbered. The kernel and drivers team was team 1. The terminal services team was team 4. The user interface was team 6. I forget most of the other numbers. But as I recall, there was no team 7, perhaps in tribute to Building 7.
