April, 2013

  • The Old New Thing

    Dangerous setting is dangerous: This is why you shouldn't turn off write cache buffer flushing

    • 71 Comments

    Okay, one more time about the Write-caching policy setting.

    This dialog box takes various forms depending on what version of Windows you are using.

    Windows XP:

      Enable write caching on the disk
    This setting enables write caching in Windows to improve disk performance, but a power outage or equipment failure might result in data loss or corruption.

    Windows Server 2003:

      Enable write caching on the disk
    Recommended only for disks with a backup power supply. This setting further improves disk performance, but it also increases the risk of data loss if the disk loses power.

    Windows Vista:

      Enable advanced performance
    Recommended only for disks with a backup power supply. This setting further improves disk performance, but it also increases the risk of data loss if the disk loses power.

    Windows 7 and 8:

      Turn off Windows write-cache buffer flushing on the device
    To prevent data loss, do not select this check box unless the device has a separate power supply that allows the device to flush its buffer in case of power failure.

    Notice that the warning text gets more and more scary each time it is updated. It starts out just by saying, "If you lose power, you might have data loss or corruption." Then it adds a recommendation, "Recommended only for disks with a backup power supply." And then it comes with a flat-out directive: "Do not select this check box unless the device has a separate power supply."

    The scary warning is there for a reason: If you check the box when your hardware does not satisfy the criteria, you risk data corruption.

    But it seems that even with the sternest warning available, people will still go in and check the box even though their device does not satisfy the criteria, and even though the dialog box says right there, "Do not select this check box."

    And then they complain, "I checked this box, and my hard drive was corrupted! You need to investigate the issue and release a fix for it."

    Dangerous setting is dangerous.

    At this point, I think the only valid "fix" for this feature would be to remove it entirely. This is why we can't have dangerous things.

  • The Old New Thing

    If you're going to use an interlocked operation to generate a unique value, you need to use it before it's gone

    • 35 Comments

    Is the Interlocked­Increment function broken? One person seemed to think so.

    We're finding that the Interlocked­Increment is producing duplicate values. Are there any known bugs in Interlocked­Increment?

    Because of course when something doesn't work, it's because you are the victim of a vast conspiracy. There is a fundamental flaw in the Interlocked­Increment function that only you can see. You are not a crackpot.

    LONG g_lNextAvailableId = 0;
    
    DWORD GetNextId()
    {
      // Increment atomically
      InterlockedIncrement(&g_lNextAvailableId);
    
      // Subtract 1 from the current value to get the value
      // before the increment occurred.
      return (DWORD)g_lNextAvailableId - 1;
    }
    

    Recall that the Interlocked­Increment function increments a value atomically and returns the incremented value. If you are interested in the result of the increment, you need to use the return value directly and not try to read the variable you incremented, because that variable may have been modified by another thread in the interim.

    Consider what happens when two threads call Get­Next­Id simultaneously (or nearly so). Suppose the initial value of g_lNext­Available­Id is 4.

    • First thread calls Interlocked­Increment to increment from 4 to 5. The return value is 5.
    • Second thread calls Interlocked­Increment to increment from 5 to 6. The return value is 6.
    • First thread ignores the return value and instead reads the current value of g_lNext­Available­Id, which is 6. It subtracts 1, leaving 5, and returns it.
    • Second thread ignores the return value and instead reads the current value of g_lNext­Available­Id, which is still 6. It subtracts 1, leaving 5, and returns it.

    Result: Both calls to Get­Next­Id return 5. Interpretation: "Interlocked­Increment is broken."

    Actually, Interlocked­Increment is working just fine. What happened is that the code threw away the unique information that Interlocked­Increment returned and instead went back to the shared variable, even though the shared variable changed its value in the meantime.
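
    The interleaving can be replayed deterministically on a single thread. This is only a sketch: the names IdPair and ReplayRereadingSharedVariable are hypothetical, and a plain local variable stands in for the shared counter so that the order of operations is fixed.

```cpp
// Deterministic, single-threaded replay of the interleaving described
// above. All names are hypothetical; "shared" stands in for the
// global counter that both threads increment.
struct IdPair {
  long first;   // value handed back to the first thread's caller
  long second;  // value handed back to the second thread's caller
};

// The problematic pattern: increment, discard the result, and then
// re-read the shared variable after the other thread has also run.
constexpr IdPair ReplayRereadingSharedVariable(long initial)
{
  long shared = initial;
  ++shared;                    // first thread increments
  ++shared;                    // second thread increments
  long firstId = shared - 1;   // first thread re-reads the variable
  long secondId = shared - 1;  // second thread re-reads the variable
  return IdPair{firstId, secondId};
}
```

    Starting from 4, both fields of the result come out as 5, matching the walkthrough.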

    Since this code cares about the result of the increment, it needs to use the value returned by Interlocked­Increment.

    DWORD GetNextId()
    {
      // Increment atomically and subtract 1 from the
      // incremented value to get the value before the
      // increment occurred.
      return (DWORD)InterlockedIncrement(&g_lNextAvailableId) - 1;
    }
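
    The same principle applies outside of Win32. Here is a minimal portable sketch using std::atomic, which is an assumption on my part and not something the original code uses. The names g_nextAvailableId, GetNextIdPortable, and CountDistinctIds are hypothetical. Note that fetch_add returns the value from before the increment, so no subtraction is needed.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <set>
#include <thread>
#include <vector>

// Hypothetical portable counterpart of the global ID counter.
std::atomic<long> g_nextAvailableId{0};

// fetch_add returns the value held *before* the increment, so the
// result is used directly and the shared variable is never re-read.
unsigned long GetNextIdPortable()
{
  return static_cast<unsigned long>(g_nextAvailableId.fetch_add(1));
}

// Hypothetical helper: hands out IDs from several threads at once and
// reports how many distinct values were produced.
std::size_t CountDistinctIds(int threadCount, int idsPerThread)
{
  assert(threadCount > 0 && idsPerThread >= 0);

  // One result vector per thread, so the threads never share a container.
  std::vector<std::vector<unsigned long>> perThread(threadCount);
  std::vector<std::thread> threads;
  for (int t = 0; t < threadCount; ++t) {
    threads.emplace_back([&perThread, t, idsPerThread] {
      for (int i = 0; i < idsPerThread; ++i) {
        perThread[t].push_back(GetNextIdPortable());
      }
    });
  }
  for (auto& th : threads) {
    th.join();
  }

  std::set<unsigned long> distinct;
  for (const auto& ids : perThread) {
    distinct.insert(ids.begin(), ids.end());
  }
  return distinct.size();
}
```

    If every caller uses the value returned by the atomic operation, the number of distinct IDs equals the number of calls, no matter how the threads interleave.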
    

    Exercise: Criticize this implementation of IUnknown::Release:

    STDMETHODIMP_(ULONG) CObject::Release()
    {
     InterlockedDecrement(&m_cRef);
     if (m_cRef == 0)
     {
      delete this;
      return 0;
     }
     return m_cRef;
    }
    
  • The Old New Thing

    If you don't know what you're going to do with the answer to a question, then there's not much point in making others work hard to answer it

    • 32 Comments

    A customer asked the following question:

    We've found that on Windows XP, when we call the XYZ function with the Awesome flag, the function fails for no apparent reason. However, it works correctly on Windows 7. Do you have any ideas about this?

    So far, the customer has described what they have observed, but they haven't actually asked a question. It's just nostalgia, and nostalgia is not a question. (I'm rejecting "Do you have any ideas about this?" as a question because it is too vague to be a meaningful question.)

    Please be more specific about your question. Do you want to obtain Windows 7-style behavior on Windows XP? Do you want to obtain Windows XP-style behavior on Windows 7? Do you merely want to understand why the two behave differently?

    The customer replied,

    Why do they behave differently? Was it a new design for Windows 7? If so, how do the two implementations differ?

    I fired up a handy copy of Windows XP in a virtual machine and started stepping through the code, and then I realized I was about to do a few hours' worth of investigation for no clear benefit. So I stopped and responded to their question with my own question.

    Why do you want to know the reason for the change in behavior? How will the answer affect what you do next? Consider the following three answers:

    1. "The behavior was redesigned in Windows 7."
    2. "The Windows XP behavior was a bug that was fixed in Windows 7."
    3. "The behavior change was a side-effect of a Windows Update hotfix."

    What will you do differently if the answer is (1) rather than (2) or (3)?

    The customer never responded. That saved me a few hours of my life.

    If you don't know what you're going to do with the answer to a question, then there's not much point in others working hard to answer it. You're just creating work for others for no reason.

  • The Old New Thing

    Dark corners of C/C++: The typedef keyword doesn't need to be the first word on the line

    • 29 Comments

    Here are some strange but legal declarations in C/C++:

    int typedef a;
    short unsigned typedef b;
    

    By convention, the typedef keyword comes at the beginning of the line, but this is not actually required by the language. The above declarations are equivalent to

    typedef int a;
    typedef short unsigned b;
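
    If you don't believe it, you can ask the compiler. The following is a small sketch in C++ (the names a_conventional, b_conventional, and UnconventionalOrderMatches are hypothetical) that uses std::is_same to check that both spellings name the same types.

```cpp
#include <type_traits>

// Specifiers in unconventional order, as in the strange declarations.
int typedef a;
short unsigned typedef b;

// The same types declared the conventional way (hypothetical names).
typedef int a_conventional;
typedef short unsigned b_conventional;

// Compilation fails here if the two spellings were not equivalent.
static_assert(std::is_same<a, a_conventional>::value,
              "int typedef a; is the same as typedef int a;");
static_assert(std::is_same<b, b_conventional>::value,
              "short unsigned typedef b; is the same as typedef short unsigned b;");

// Hypothetical helper that exposes the same check as a value.
constexpr bool UnconventionalOrderMatches()
{
  return std::is_same<a, int>::value &&
         std::is_same<b, unsigned short>::value;
}
```

    Some compilers may warn about the placement of the typedef keyword, but the declarations are accepted.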
    

    The C language (but not C++) also permits you to say typedef without actually defining a type!

    typedef enum { c }; // legal in C, not C++
    

    In the above case, the typedef is ignored, and it's the same as just declaring the enum the plain boring way.

    enum { c };
    

    Other weird things you can do with typedef in C:

    typedef;
    typedef int;
    typedef int short;
    

    None of the above statements do anything, but they are technically legal in pre-C89 versions of the C language. They are just alternate manifestations of the quirk in the grammar that permits you to say typedef without actually defining a type. (In C89, this loophole was closed: Clause 6.7 Constraint 2 requires that "A declaration shall declare at least a declarator, a tag, or the members of an enumeration.")

    That last example of typedef int short; is particularly misleading, since at first glance it sounds like it's redefining the short data type. But then you realize that int short and short int are equivalent, and this is just an empty declaration of the short int data type. It doesn't actually widen your shorts. If you need to widen your shorts, go see a tailor.¹

    Note that just because it's legal doesn't mean it's recommended. You should probably stick to using typedef the way most people use it, unless you're looking to enter the IOCCC.

    ¹ The primary purpose of this article was to tell that one stupid joke. And it's not even my joke!

  • The Old New Thing

    The problem with adding more examples and suggestions to the documentation is that eventually people will stop reading the documentation

    • 27 Comments

    I am a member of a peer-to-peer discussion group on an internal tool for programmers which we'll call Program Q. Every so often, somebody will get tripped up by smart quotes or en-dashes or ellipses, and they will get an error like

    C:\> q select table –s “awesome table”
    Usage: q select table [-n] [-s] table
    Error: Must specify exactly one table.
    

    After it is pointed out that they are a victim of Word's auto-conversion of straight quotes to slanted quotes, there will often be a suggestion, "You should treat en-dashes as plain dashes, smart quotes as straight quotes, and fancy-ellipses as three periods."

    The people who support Program Q are members of this mailing list, and they explain that unfortunately for Program Q, those characters have been munged by internal processing to the point that when they reach the command line parser, they have been transformed into characters like ô and ö, so the parser doesn't even know that it's dealing with an en-dash or smart-quote or fancy-ellipsis.

    Plus, this is a programming tool. Programmers presumably prefer consistent and strict behavior rather than auto-correcting guess-what-I-really-meant behavior. One of the former members of the Program Q support team recalled,

    It might be possible to detect potential unintended goofiness and raise an error, but that creates the possibility of false positives, which in turn creates its own set of support issues that are more difficult to troubleshoot and resolve. Sometimes it's better to just let a failure fail at the point of failure rather than trying to be clever.

    There was a team that had a script that started up the Program Q server, and if there was a problem starting the server, it restored the databases from a backup. Automated failure recovery, what could possibly go wrong? Well, what happened is that the script decided to auto-restore from a week-old backup and thereby wiped out a week's worth of work. And it turns out that the failure in question was not caused by database corruption in the first place. Oops.

    "Well, if you're not going to do auto-correction, at least you should add this explanation to the documentation."

    The people who support Program Q used to take these suggestions to heart, and when somebody said, "You should mention this in the documentation," they would more often than not go ahead and add it to the documentation.

    But that merely created a new phenomenon:

    I can't get Program Q to create a table. I tried q create -template awesome_template awesome_table, but I keep getting the error "Template 'awesome_template' does not exist in the default namespace. Check that the template exists in the specified location. See 'q help create -template' for more information." What am I doing wrong?

    Um, did you check that the template exists in the specified location?

    "No, I haven't. Should I?"

    (Facepalm.)

    After some troubleshooting, the people on the discussion group determine that the problem was that the template was created in a non-default namespace, so you had to use a full namespace qualifier to specify the template. (I'm totally making this up, I hope you realize. The actual Program Q doesn't have a template-create command. I'm just using this as a fake example for the purpose of storytelling.)

    After this all gets straightened out, somebody will mention, "This is explained in the documentation for template creation. Did you read it?"

    "I didn't read the documentation because it was too long."

    If you follow one person's suggestion to add more discussion to the documentation, you end up creating problems for all the people who give up on the documentation because it's too long, regardless of how well-organized it is. In other words, sometimes adding documentation makes things worse. The challenge is to strike a decent balance.

    Pre-emptive snarky comment: "TL;DR."

  • The Old New Thing

    The importance of having a review panel of twelve-year-old boys, episode 2

    • 27 Comments

    Microsofties love their acronyms, but you have to remember to send every potential name through a review panel of twelve-year-old boys to identify the lurking embarrassments.

    When it came time in Windows 7 to come up with the names of the various subteams, two of the proposed names were Core OS eXperience and Find and Use eXperience, using the trendy letter X to abbreviate the trendy capitalization of the word eXperience.

    The naming system was promptly reconsidered.

    One of the subteams of Windows 8 is known as User-Centered Experience. The original name of the subteam was the You-Centered Experience (because it's all about you, the user), and they somewhat inadvisedly decided to adopt the nickname YOU, believing themselves to be sooooo clever.

    What this actually did was create Abbott-and-Costello-level confusion.

    "There's a work item assigned to YOU to handle this case."

    No, I don't have any such work item.

    "No, not you. I mean the YOU team."

    Some time after the standard acronyms and abbreviations for all the teams were settled upon, one of the reporting systems used to track the progress of the project was set up to allow reports to be generated not only for specific individuals or lists of individuals, but also for organizational units or feature teams. If you wanted to generate a report for Bob and everybody who reports through him, you could enter o_bob as the target of the report instead of having to type the name of every single person who worked for Bob. And if you wanted to generate a report for everybody who works on the XYZ feature team, you could enter f_xyz.

    This meant that generating the reports for the YOU team required you to type f_you. The members of the YOU team were not pleased by this, and they prevailed upon the people who run the reporting system to change their notation. The request was granted, and the syntax for selecting an entire feature team was changed to ft_xyz instead of just f_xyz.

    I would have argued that this was a problem of the YOU team's own creation. Next time, don't pick such a confusing name for your team.

    Bonus chatter: During Windows XP development, we didn't use these fancy team acronyms. The teams were simply numbered. The kernel and drivers team was team 1. The terminal services team was team 4. The user interface was team 6. I forget most of the other numbers. But as I recall, there was no team 7, perhaps in tribute to Building 7.

  • The Old New Thing

    Using opportunistic locks to get out of the way if somebody wants the file

    • 24 Comments

    Opportunistic locks allow you to be notified when somebody else tries to access a file you have open. This is usually done if you want to use a file provided nobody else wants it.

    For example, you might be a search indexer that wants to extract information from a file, but if somebody opens the file for writing, you don't want them to get Sharing Violation. Instead, you want to stop indexing the file and let the other person get their write access.

    Or you might be a file viewer application like ildasm, and you want to let the user update the file (in ildasm's case, rebuild the assembly) even though you're viewing it. (Otherwise, they will get an error from the compiler saying "Cannot open file for output.")

    Or you might be Explorer, and you want to abandon generating the preview for a file if somebody tries to delete it.

    (Rats, I fell into the trap of trying to motivate a Little Program.)

    Okay, enough motivation. Here's the program:

    #include <windows.h>
    #include <winioctl.h>
    #include <stdio.h>
    
    OVERLAPPED g_o;
    
    REQUEST_OPLOCK_INPUT_BUFFER g_inputBuffer = {
      REQUEST_OPLOCK_CURRENT_VERSION,
      sizeof(g_inputBuffer),
      OPLOCK_LEVEL_CACHE_READ | OPLOCK_LEVEL_CACHE_HANDLE,
      REQUEST_OPLOCK_INPUT_FLAG_REQUEST,
    };
    
    REQUEST_OPLOCK_OUTPUT_BUFFER g_outputBuffer = {
      REQUEST_OPLOCK_CURRENT_VERSION,
      sizeof(g_outputBuffer),
    };
    
    int __cdecl wmain(int argc, wchar_t **argv)
    {
      g_o.hEvent = CreateEvent(nullptr, FALSE, FALSE, nullptr);
    
      HANDLE hFile = CreateFileW(argv[1], GENERIC_READ,
        FILE_SHARE_READ, nullptr, OPEN_EXISTING,
        FILE_FLAG_OVERLAPPED, nullptr);
      if (hFile == INVALID_HANDLE_VALUE) {
        return 0;
      }
    
      DeviceIoControl(hFile, FSCTL_REQUEST_OPLOCK,
          &g_inputBuffer, sizeof(g_inputBuffer),
          &g_outputBuffer, sizeof(g_outputBuffer),
          nullptr, &g_o);
      if (GetLastError() != ERROR_IO_PENDING) {
        // oplock failed
        return 0;
      }
    
      DWORD dwBytes;
      if (!GetOverlappedResult(hFile, &g_o, &dwBytes, TRUE)) {
        // oplock failed
        return 0;
      }
    
      printf("Cleaning up because somebody wants the file...\n");
      Sleep(1000); // pretend this takes some time
    
      printf("Closing file handle\n");
      CloseHandle(hFile);
    
      CloseHandle(g_o.hEvent);
    
      return 0;
    }
    

    Run this program with the name of an existing file on the command line, say scratch x.txt. The program will wait.

    In another command window, run the command type x.txt. The program keeps waiting.

    Next, run the command echo hello > x.txt. Now things get interesting.

    When the command prompt opens x.txt for writing, the Device­Io­Control call completes. At this point we print the Cleaning up... message.

    To simulate the program taking a little while to clean up, we sleep for one second. Observe that the command prompt has not yet returned. Instead of immediately failing the request to open for writing with a sharing violation, the kernel puts the open request on hold to give our program time to clean up and close our handle.

    Finally, our simulated clean-up is complete, and we close the handle. At this point, the kernel allows the command processor to proceed and open the file for writing so it can write hello into it.

    That's the basics of opportunistic locks, but your program will almost certainly not be structured this way. You will probably not wait synchronously on the overlapped I/O but rather have the completion queued up to a completion function or an I/O completion port, or have a thread pool task listen on the event handle. When you do that, remember that you need to keep the OVERLAPPED structure as well as the REQUEST_OPLOCK_INPUT_BUFFER and REQUEST_OPLOCK_OUTPUT_BUFFER structures valid until the I/O completes.

    (You may find the Cancel­Io function handy to try to accelerate the clean-up of the file handle and any other actions that are dependent upon it.)

    You can read more about opportunistic locks on MSDN. Note that there are limitations on explicitly-managed opportunistic locks; for example, they don't work across the network.

  • The Old New Thing

    Where did the research project RedShark get its name?

    • 21 Comments

    Project code names are not arrived at by teams of focus groups who carefully parse out every semantic and etymological nuance of the name they choose. (Though if you read the technology press, you'd believe otherwise, because it turns out that taking a code name apart syllable-by-syllable searching for meaning is a great way to fill column-inches.) Usually, they are just spontaneous decisions, inspired by whatever random thoughts jump to mind.

    Many years ago, there was an internal user interface research project code named RedShark. Not Red Shark but RedShark, accent on the Red. Where did this strange name come from?

    From a red shark, of course.

    When the project started up, the people in charge were sitting around and realized they needed to give the project a name. It so happened that the office they were sitting in belonged to a team member who collected a lot of strange toys. One of those toys was a small inflatable red shark.

    Somebody looked around the room and spotted the red shark. "Let's call it RedShark." Nobody else had a better idea, so the name passed by default.

    That small inflatable red shark became their mascot and hung from the ceiling in the hallway.

    No deep, hidden meaning. Just a $3 cheap plastic toy that happened to be in the right place at the right time.

  • The Old New Thing

    How can I figure out which user modified a file?

    • 20 Comments

    The Get­File­Time function will tell you when a file was last modified, but it won't tell you who did it. Neither will Find­First­File, Get­File­Attributes, or Read­Directory­ChangesW, or File­System­Watcher.

    None of these file system functions will tell you which user modified a file because the file system doesn't keep track of which user modified a file. But there is somebody who does keep track: The security event log.

    To generate an event into the security event log when a file is modified, you first need to enable auditing on the system. In the Local Security Policy administrative tool, go to Local Policies, and then double-click Audit Policy. (These steps haven't changed since Windows 2000; the only thing is that the Administrative Tools folder moves around a bit.) Under Audit Object Access, say that you want an audit raised when access is successfully granted by checking Success (An audited security access attempt that succeeds).

    Once auditing is enabled, you can then mark the files that you want to track modifications to. On the Security tab of each file you are interested in, go to the Auditing page, and select Add to add the user you want to audit. If you want to audit all accesses, then you can choose Everyone; if you are only interested in auditing a specific user or users in specific groups, you can enter the user or group.

    After specifying whose access you want to monitor, you can select what actions should generate security events. In this case, you want to check the Successful box next to Create files / write data. This means "Generate a security event when the user requests and obtains permission to create a file (if this object is a directory) or write data (if this object is a file)."

    If you want to monitor an entire directory, you can set the audit on the directory itself and specify that the audit should apply to objects within the directory as well.

    After you've set up your audits, you can view the results in Event Viewer.

    This technique of using auditing to track who is generating modifications also works for registry keys: In Registry Editor, under the Edit menu, select Permissions.

    Exercise: You're trying to debug a problem where a file gets deleted mysteriously, and you're not sure which program is doing it. How can you use this technique to log an event when that specific file gets deleted?

  • The Old New Thing

    Dreaming about games based on Unicode

    • 20 Comments

    I dreamed that two of my colleagues were playing a game based on pantomiming Unicode code points. One of them got LOW QUOTATION MARK, and the other got a variety of ARROW POINTING NORTHEAST, ARROW POINTING EAST, ARROW POINTING SOUTHWEST.

    I wonder how you would pantomime ZERO WIDTH NON-JOINER.
