August, 2011

  • The Old New Thing

    Stupid command-line trick: Counting the number of lines in stdin


    On unix, you can use wc -l to count the number of lines in stdin. Windows doesn't come with wc, but there's a sneaky way to count the number of lines anyway:

    some-command-that-generates-output | find /c /v ""

    It is a special quirk of the find command that the null string is treated as never matching. The /v flag reverses the sense of the test, so now it matches everything. And the /c flag returns the count.

    It's pretty convoluted, but it does work.

    (Remember, I provide the occasional tip on batch file programming as a public service to those forced to endure it, not as an endorsement of batch file programming.)

    Now come da history: Why does the find command say that a null string matches nothing? Mathematically, the null string is a substring of every string, so it should be that if you search for the null string, it matches everything. The reason dates back to the original MS-DOS version of find.exe, which according to the comments appears to have been written in 1982. And back then, pretty much all of MS-DOS was written in assembly language. (If you look at your old MS-DOS floppies, you'll find that find.exe is under 7KB in size.) Here is the relevant code, though I've done some editing to get rid of distractions like DBCS support.

            mov     dx,st_length            ;length of the string arg.
            dec     dx                      ;adjust for later use
            mov     di, line_buffer
            inc     dx
            mov     si,offset st_buffer     ;pointer to beg. of string argument
            cmp     al,byte ptr [di]
            jnz     no_match
            dec     dx
            jz      a_matchk                ; no chars left: a match!
            call    next_char               ; updates di
            jc      no_match                ; end of line reached
            jmp     comp_next_char          ; loop if chars left in arg.

    If you're rusty on your 8086 assembly language, here's how it goes in pseudocode:

     int dx = st_length - 1;
     char *di = line_buffer;
     char *si = st_buffer;
     char al = *si++;
     if (al != *di) goto no_match;
     if (--dx == 0) goto a_matchk;
     if (!next_char(&di)) goto no_match;
     goto comp_next_char;

    In sort-of-C, the code looks like this:

     int l = st_length - 1;
     char *line = line_buffer;
     char *string = st_buffer;
     while (*string++ == *line && --l && next_char(&line)) {} 

    The weird - 1 followed by l++ is an artifact of code that I deleted, which needed the decremented value. If you prefer, you can look at the code this way:

     int l = st_length;
     char *line = line_buffer;
     char *string = st_buffer;
     while (*string++ == *line && --l && next_char(&line)) {} 

    Notice that if the string length is zero, there is an integer underflow, and we end up reading off the end of the buffers. The comparison loop does stop, because we eventually hit bytes that don't match. (No virtual memory here, so there is no page fault when you run off the end of a buffer; you just keep going and reading from other parts of your data segment.)

    In other words, due to an integer underflow bug, a string of length zero was treated as if it were a string of length 65536, which doesn't match anywhere in the file.

    This bug couldn't be fixed, because by the time you got around to trying, there were already people who discovered this behavior and wrote batch files that relied on it. The bug became a feature.

    The integer underflow was fixed, but the code is careful to treat null strings as never matching, in order to preserve existing behavior.

    Exercise: Why is the loop label called lop instead of loop?

  • The Old New Thing

    Why does the Shift+F10 menu differ from the right-click menu?


    The Shift+F10 key is a keyboard shortcut for calling up the context menu on the selected item. but if you look closely, you might discover that the right-click menu and the Shift+F10 menu differ in subtle ways. Shouldn't they be the same? After all, that's the point of being a keyboard shortcut, right?

    Let's set aside the possibility that a program might be intentionally making them different, in violation of UI guidelines. For example, a poorly-designed program might use the WM_RBUTTON­UP message as the trigger to display the context menu instead of using the WM_CONTEXT­MENU message, in which case Shift+F10 won't do anything at all. Or the poorly-designed program may specifically detect that the WM_CONTEXT­MENU message was generated from the keyboard and choose to display a different menu. (This on top of the common error of forgetting to display a keyboard-invoked context menu at the currently selected item.) If somebody intentionally makes them different, then they'll be different.

    Okay, so the program is not intentionally creating a distinction between mouse-initiated and keyboard-initiated context menus. Shift+F10 and right-click both generate the WM_CONTEXT­MENU message, and therefore the same menu-displaying code is invoked. The subtle difference is that when you press Shift+F10, the shift key is down, and as we all know, holding the shift key while calling up a context menu is a Windows convention for requesting the extended context menu rather than the normal context menu.

    You get a different menu not because the program is going out of its way to show you a different menu, but because the use of the shift key accidentally triggers the extended behavior. It's like why when you look at yourself in the mirror, your eyes are always open, or why when you call your own phone number, the line is always busy. To avoid this, use the Menu key (confusingly given the virtual key name VK_APPS) to call up the context menu. (This is the key that has a picture of a menu on it, usually to the right of your space bar.) When you press that key, the code which decides whether to show a normal or extended context menu will see that the shift key is not held down, and it'll go for the normal context menu.

    Of course, you can also press Shift+AppMenu, but then you'll have come full circle.

  • The Old New Thing

    Microspeak: Dogfood


    Everybody knows about the Microspeak term dogfood. It refers to the practice of taking the product you are working on and using it in production.¹ For the Windows team, it means installing a recent build of Windows on your own computer as well as onto a heavily-used file server. For the Office team, it means using a recent build of Office for all your documents. For the Exchange team, it means moving the entire product team to a server running a recent build of Exchange. You get the idea.

    Purists would restrict the use of the word dogfood to refer to a product group using its own product, but in practice the meaning has been generalized to encompass using a prerelease product in a production environment. The Windows team frequently dogfoods recent builds of Office and Exchange Server. Actually, the Exchange Server case is one of double-dogfood,² for not only is the server running a prerelease version of Exchange Server, it's doing so atop a prerelease version of Windows!

    Dogfooding does have its costs. For example, the prerelease version of Exchange Server might uncover a bug in the prerelease version of Windows. While the problem is investigated, the Windows division can't send email. These outages are comparatively rare, although they are quite frustrating when they occur. But you have to understand that the whole purpose of dogfooding is to find exactly these sorts of problems so that our customers won't!


    ¹ Despite the efforts of our CIO, the term ice-creaming has not caught on.

    ² I made up the term "double-dogfood" just now. It is not part of Microspeak.

  • The Old New Thing

    ReadDirectoryChangesW reads directory changes, but what if the directory doesn't change?


    The Read­Directory­ChangesW function reads changes to a directory. But not all changes that happen to files in a directory result in changes to the directory.

    Now, it so happens that nearly all changes to a file in a directory really do result in something happening to the file's directory entry. If you write to a file, the last-write time in the directory entry changes. If you rename a file, the name in the directory entry changes. If you create a file, a new directory entry is created.

    But there are some changes that do not affect the directory entry. I've heard rumors that if you write to a file via a memory-mapped view, that will not update the last-write time in the directory entry. (I don't know if it's true, but if it's not, then just pick some other file-modifying operation that doesn't affect the directory entry, like modifying the contents of a file through a hard link in another directory, or explicitly suppressing file timestamp changes by calling Set­File­Time with a timestamp of 0xFFFFFFFF`FFFFFFFF.) The point is that since these changes have no effect on the directory, they are not recognized by Read­Directory­ChangesW. The Read­Directory­ChangesW function tells you about changes to the directory; if something happens that doesn't change the directory, then Read­Directory­ChangesW will just shrug its shoulders and say, "Hey, not my job."

    If you need to track all changes, even those which do not result in changes to the directory, you need to look at other techniques like the change journal (a.k.a. USN journal).

    The intended purpose of the Read­Directory­ChangesW function is to assist programs like Windows Explorer which display the contents of a directory. If something happens that results in a change to the directory listing, then it is reported by Read­Directory­ChangesW. In other words, Read­Directory­ChangesW tells you when the result of a Find­First­File/Find­Next­File loop changes. The intended usage pattern is doing a Find­First­File/Find­Next­File to collect all the directory entries, and then using the results from Read­Directory­ChangesW to update that collection incrementally.

    In other words, Read­Directory­ChangesW allows you to optimize a directory-viewing tool so it doesn't have to do full enumerations all the time.

    This design philosophy also explains why, if too many changes have taken place in the directory between calls to Read­Directory­ChangesW, the function will fail with an error called ERROR_NOTIFY_ENUM_DIR. It's telling you, "Whoa, like so much happened that I couldn't keep track of it all, so you'll just have to go back and do another Find­First­File/Find­Next­File loop."

  • The Old New Thing

    An even easier way to get Windows Media Player to single-step a video


    Since my original article explaining how to get Windows Media Player to single-step a video, I've learned that there's an even easier way.

    • Pause the video.
    • To single-step forward, Ctrl+Click the Play button.
    • To single-step backwrd, Ctrl+Shift+Click the Play button.

    Backward-stepping is dependent upon the codec; some of them will go backward to the previous keyframe.

    The person who tipped me off to this feature: The developer who implemented it.

    Remember: Sharing a tip does not imply that I approve of the situation that led to the need for the tip in the first place.

  • The Old New Thing

    Why does creating a shortcut to a file change its last-modified time... sometimes?


    A customer observed that sometimes, the last-modified timestamp on a file would change even though nobody modified the file, or at least consciously took any steps to modify the file. In particular, they found that simply double-clicking the file in Explorer was enough to trigger the file modification.

    It took a while to puzzle out, but here's what's going on:

    When you double-click a file in Explorer, Explorer adds it to the Recent Items list. Internally, this is done by creating a shortcut to the item. The nice thing about a shortcut is that it knows how to track its target. That way, if you move an item, then try to open it from the Recent Items list, the shortcut tracking code will try to find where you moved it to. You moved the file. The shortcut still works. Magic.

    Shortcut target tracking magic is accomplished with the assistance of object identifiers, and object identifiers, as we saw earlier, are created on demand the moment somebody first asks for one.

    And that's where the file modification is coming from. If the file is freshly-created, it won't have an object identifier. When you create a shortcut to it (which happens implicitly when it is added to the Recent Items list), that triggers the creation of an object identifier, which in turn updates the last-modified time on the file.

    Frustratingly, the Link­Resolve­Ignore­Link­Info and No­Resolve­Track policies do not prevent the creation of object identifiers. Those policies control whether the tracking information is used during the resolve process, but they don't control whether the tracking information is obtained during shortcut creation. (Who knows, maybe you're creating the shortcut to be used on a machine where those policies are not in effect.) To suppress collecting the volume information and object identifier at shortcut creation time, you need to pass the SLDF_FORCE_NO_LINKINFO and SLDF_FORCE_NO_LINKTRACK flags to the IShell­Link­Data­List::Set­Flags method when you create the shortcut.

  • The Old New Thing

    How can I get information about the items in the Recycle Bin?


    For some reason, a lot of people are interested in programmatic access to the contents of the Recycle Bin. They never explain why they care, so it's possible that they are looking at their problem the wrong way.

    For example, one reason for asking, "How do I purge an item from the Recycle Bin given a path?" is that some operation in their program results in the files going into the Recycle Bin and they want them to be deleted entirely. The correct solution is to clear the FOF_ALLOW­UNDO flag when deleting the items in the first place. Moving to the Recycle Bin and then purging is the wrong solution because your search-and-destroy mission may purge more items than just the ones your program put there.

    The Recycle Bin is somewhat strange in that it can have multiple items with the same name. Create a text file called TEST.TXT on your desktop, then delete it into the Recycle Bin. Create another text file called TEST.TXT on your desktop, then delete it into the Recycle Bin. Now open your Recycle Bin. Hey look, you have two TEST.TXT files with the same path!

    Now look at that original problem: Suppose the program, as part of some operation, moves the file TEST.TXT from the desktop to the Recycle Bin, and then the second half of the program goes into the Recycle Bin, finds TEST.TXT and purges it. Well, there are actually three copies of TEST.TXT in the Recycle Bin, and only one of them is the one you wanted to purge.

    Okay, I got kind of sidetracked there. Back to the issue of getting information about the items in the Recycle Bin.

    The Recycle Bin is a shell folder, and the way to enumerate the contents of a shell folder is to bind to it and enumerate its contents. The low-level interface to the shell namespace is via IShell­Folder. There is an easier-to-use medium-level interface based on IShell­Item, and there's a high-level interface based on Folder designed for scripting.

    I'll start with the low-level interface. As usual, the program starts with a bunch of header files.

    #include <windows.h>
    #include <stdio.h>
    #include <tchar.h>
    #include <shlobj.h>
    #include <shlwapi.h>
    #include <propkey.h>

    The Bind­To­Csidl function binds to a folder specified by a CSIDL. The modern way to do this is via KNOWN­FOLDER, but just to keep you old fogeys happy, I'm doing things the classic way since you refuse to upgrade from Windows XP. (We'll look at the modern way later.)

    HRESULT BindToCsidl(int csidl, REFIID riid, void **ppv)
     HRESULT hr;
     hr = SHGetSpecialFolderLocation(NULL, csidl, &pidl);
     if (SUCCEEDED(hr)) {
      IShellFolder *psfDesktop;
      hr = SHGetDesktopFolder(&psfDesktop);
      if (SUCCEEDED(hr)) {
       if (pidl->mkid.cb) {
        hr = psfDesktop->BindToObject(pidl, NULL, riid, ppv);
       } else {
        hr = psfDesktop->QueryInterface(riid, ppv);
     return hr;

    The subtlety here is in the test for pidl->mkid.cb. The IShell­Folder::Bind­To­Object method is for binding to child objects (or grandchildren or deeper descendants). If the object you want is the desktop itself, then you can't use IShell­Folder::Bind­To­Object since the desktop is not a child of itself. In fact, if the object you want is the desktop itself, then you already have the desktop, so we just Query­Interface for it. It's an annoying special case which usually lurks in your code until somebody tries something like "Save file to desktop" or changes the location of a special folder to the desktop, and then boom you trip over the fact that the desktop is not a child of itself. (See further discussion below.)

    Another helper function prints the display name of a shell namespace item. There isn't much interesting here either.

    void PrintDisplayName(IShellFolder *psf,
        PCUITEMID_CHILD pidl, SHGDNF uFlags, PCTSTR pszLabel)
     STRRET sr;
     HRESULT hr = psf->GetDisplayNameOf(pidl, uFlags, &sr);
     if (SUCCEEDED(hr)) {
      PTSTR pszName;
      hr = StrRetToStr(&sr, pidl, &pszName);
      if (SUCCEEDED(hr)) {
       _tprintf(TEXT("%s = %s\n"), pszLabel, pszName);

    Our last helper function retrieves a property from the shell namespace and prints it. (Obviously, if we wanted to do something other than print it, we could coerce the type to something other than VT_BSTR.)

    void PrintDetail(IShellFolder2 *psf, PCUITEMID_CHILD pidl,
        const SHCOLUMNID *pscid, PCTSTR pszLabel)
     VARIANT vt;
     HRESULT hr = psf->GetDetailsEx(pidl, pscid, &vt);
     if (SUCCEEDED(hr)) {
      hr = VariantChangeType(&vt, &vt, 0, VT_BSTR);
      if (SUCCEEDED(hr)) {
       _tprintf(TEXT("%s: %ws\n"), pszLabel, V_BSTR(&vt));

    Okay, now we can get down to business. The properties we will display from each item in the Recycle Bin are the item name and path, the original location (before the item was deleted), the date the item was deleted, and the size of the item.

    Getting the name and path are done with various combinations of flags to IShell­Folder::Get­Display­Name­Of, whereas getting the other properties involve talking to the shell property system. (My colleague Ben Karas covers the shell property system on his blog.) The SHCOLUMN­ID documentation says that the displaced property set applies to items which have been moved to the Recycle Bin, so we can define those column IDs based on the values provided in shlguid.h:

    const SHCOLUMNID SCID_OriginalLocation =
    const SHCOLUMNID SCID_DateDeleted =

    The other property we want is System.Size, which the documentation says is defined as PKEY_Size by the propkey.h header file.

    Okay, let's roll!

    int __cdecl _tmain(int argc, PTSTR *argv)
     HRESULT hr = CoInitialize(NULL);
     if (SUCCEEDED(hr)) {
      IShellFolder2 *psfRecycleBin;
      hr = BindToCsidl(CSIDL_BITBUCKET, IID_PPV_ARGS(&psfRecycleBin));
      if (SUCCEEDED(hr)) {
       IEnumIDList *peidl;
       hr = psfRecycleBin->EnumObjects(NULL,
       if (hr == S_OK) {
        PITEMID_CHILD pidlItem;
        while (peidl->Next(1, &pidlItem, NULL) == S_OK) {
         PrintDisplayName(psfRecycleBin, pidlItem,
                          SHGDN_INFOLDER, TEXT("InFolder"));
         PrintDisplayName(psfRecycleBin, pidlItem,
                          SHGDN_NORMAL, TEXT("Normal"));
         PrintDisplayName(psfRecycleBin, pidlItem,
                          SHGDN_FORPARSING, TEXT("ForParsing"));
         PrintDetail(psfRecycleBin, pidlItem,
                     &SCID_OriginalLocation, TEXT("Original Location"));
         PrintDetail(psfRecycleBin, pidlItem,
                     &SCID_DateDeleted, TEXT("Date deleted"));
         PrintDetail(psfRecycleBin, pidlItem,
                     &PKEY_Size, TEXT("Size"));
     return 0;

    The only tricky part is the test for whether the call to IShell­Folder::Enum­Objects succeeded, highlighted above. According to the rules for IShell­Folder::Enum­Objects, the method is allowed to return S_FALSE to indicate that there are no children, in which case it sets peidl to NULL.

    If you are willing to call functions new to Windows Vista, you can simplify the Bind­To­Csidl function by using the helper function SHBind­To­Object. This does the work of getting the desktop folder and handling the desktop special case.

    HRESULT BindToCsidl(int csidl, REFIID riid, void **ppv)
     HRESULT hr;
     hr = SHGetSpecialFolderLocation(NULL, csidl, &pidl);
     if (SUCCEEDED(hr)) {
      hr = SHBindToObject(NULL, pidl, NULL, riid, ppv);
     return hr;

    But at this point, I'm starting to steal from the topic I scheduled for next time, namely modernizing this program to take advantage of some new helper functions and interfaces. We'll continue next time.

  • The Old New Thing

    A shell extension is a guest in someone else's house; don't go changing the code page


    A customer reported a problem with their shell extension:

    We want to format a floating point number according to the user's default locale. We do this by calling snprintf to convert the value from floating point to text with a period (U+002E) as the decimal separator, then using Get­Number­Format to apply the user's preferred grouping character, decimal separator, etc. We found, however, that if the user is running in (say) German, we find that sometimes (but not always) the snprintf function follows the German locale and uses a comma (U+002C) as the decimal separator with no thousands separator. This format prevents the Get­Number­Format function from working, since it requires the decimal separator to be U+002E. What is the recommended way of formatting a floating point number according to the user's locale?

    The recommended way of formatting a floating point number according to the user's locale is indeed to use a function like snprintf to convert it to text with U+002E as the decimal separator (and other criteria), then use Get­Number­Format to apply the user's locale preferences.

    The snprintf function follows the C/C++ runtime locale to determine how the floating point number should be converted, and the default C runtime locale is the so-called "C" locale which indeed uses U+002E as the decimal separator. Since you're getting U+002C as the decimal separator, somebody must have called set­locale to change the locale from "C" to a German locale, most likely by passing "" as the locale, which means "follow the locale of the environment."

    Our shell extension is running in Explorer. Under what conditions will Explorer call set­locale(LC_NUMERIC, "")? What should we do if the locale is not "C"?

    As it happens, Explorer never calls set­locale. It leaves the locale set to the default value of "C". Therefore, the call to snprintf should have generated a string with U+002E as the decimal separator. Determining who was calling set­locale was tricky since the problem was intermittent, but after a lot of work, we found the culprit: some other shell extension loaded before the customer's shell extension and decided to change the carpet by calling set­locale(LC_ALL, "") in its DLL_PROCESS_ATTACH, presumably so that its calls to snprintf would follow the environment locale. What made catching the miscreant more difficult was that the rogue shell extension didn't restore the locale when it was unloaded (not that that would have been the correct thing to do either), so by the time the bad locale was detected, the culprit was long gone!

    That other DLL used a global setting to solve a local problem. Given the problem "How do I get my calls to snprintf to use the German locale settings?" they decided to change all calls to snprintf to use the German locale settings, even the calls that didn't originate from the DLL itself. What if the program hosting the shell extension had done a set­locale(LC_ALL, "French")? Tough noogies; the rogue DLL just screwed up the host program, which wants to use French locale settings but is now being forced to use German ones. The program probably won't notice that somebody secretly replaced its coffee with Folgers Crystals. It'll be a client who notices that the results are not formatted correctly. The developers of the host program, of course, won't be able to reproduce the problem in their labs, since they don't have the rogue shell extension, and the problem will be classified as "unsolved."

    What both the rogue shell extension and the original customer's shell extension should be using is the _l variety of string formatting functions (in this case _snprintf_l, although _snprintf_s_l is probably better). The _l variety lets you pass an explicit locale which will be used to format that particular string. (You create one of these _locale_t objects by calling _create_locale with the same parameters you would have passed to set­locale.) Using the _l technique solves two problems:

    1. It lets you apply a local solution to a local problem. The locale you specify applies only to the specific call; the process's default locale remains unchanged.
    2. It allows you to ensure that you get the locale you want even if the host process has set a different locale.

    If either the customer's DLL or the rogue DLL had followed this principle of not using a global setting to solve a local problem, the conflict would not have arisen.

  • The Old New Thing

    Random musings on the introduction of long file names on FAT


    Tom Keddie thinks that the format of long file names on FAT deserves an article. Fortunately, I don't have to write it; somebody else already did.

    So go read that article first. I'm just going to add some remarks and stories.

    Hi, welcome back.

    Coming up with the technique of setting Read-only, System, Hidden, and Volume attributes to hide LFN entries took a bit of trial and error. The volume label was the most important part, since that was enough to get 90% of programs which did low-level disk access to lay off those directory entries. The other bits were added to push the success rate ever so close to 100%.

    The linked article mentions rather briefly that the checksum is present to ensure that the LFN entries correspond to the SFN entry that immediately follows. This is necessary so that if the directory is modified by code that is not LFN-aware (for example, maybe you dual-booted into Windows 3.1), and the file is deleted and the directory entry is reused for a different file, the LFN fragments won't be erroneously associated with the new file. Instead, the fragments are "orphans", directory entries for which the corresponding SFN entry no longer exists. Orphaned directory entries are treated as if they were free.

    The cluster value in a LFN entry is always zero for compatibility with disk utilities who assume that a nonzero cluster means that the directory entry refers to a live file.

    The linked article wonders what happens if the ordinals are out of order. Simple: If the ordinals are out of order, then they are invalid. The file system simply treats them as orphans. Here's an example of how out-of-order ordinals can be created. Start with the following directory entries:

    (2) "e.txt"
    (1) "Long File Nam"
    (2) "e2.txt"
    (1) "Long File Nam"

    Suppose this volume is accessed by a file system that does not support long file names, and the user deletes LONGFI~1.TXT. The directory now looks like this:

    (2) "e.txt"
    (1) "Long File Nam"
    (2) "e2.txt"
    (1) "Long File Nam"

    Now the volume is accessed by a file system that supports long file names, and the user renames Long File Name2.txt to Wow that's a really long file name there.txt.

    (2) "e.txt"
    (4) "e.txt"
    (3) "ile name ther"
    (2) "really long f"
    (1) "Wow that's a "

    Since the new name is longer than the old name, more LFN fragments need to be used to store the entire name, and oh look isn't that nice, there are some free entries right above the ones we're already using, so let's just take those. Now if you read down the table, you see that the ordinal goes from 2 up to 4 (out of order) before continuing in the correct order. When the file system sees this, it knows that the entry with ordinal 2 is an orphan.

    One last historical note: The designers of this system didn't really expect Windows NT to adopt long file names on FAT, since Windows NT already had its own much-better file system, namely, NTFS. If you wanted long file names on Windows NT, you'd just use NTFS and call it done. Nevertheless, the decision was made to store the file names in Unicode on disk, breaking with the long-standing practice of storing FAT file names in the OEM character set. The decision meant that long file names would take up twice as much space (and this was back in the days when disk space was expensive), but the designers chose to do it anyway "because it's the right thing to do."

    And then Windows NT added support for long file names on FAT and the decision taken years earlier to use Unicode on disk proved eerily clairvoyant.

  • The Old New Thing

    Why are the alignment requirements for SLIST_ENTRY so different on 64-bit Windows?


    The Interlocked­Push­Entry­SList function stipulates that all list items must be aligned on a MEMORY_ALLOCATION_ALIGNMENT boundary. For 32-bit Windows, MEMORY_ALLOCATION_ALIGNMENT is 8, but the SLIST_ENTRY structure itself does not have a DECLSPEC_ALIGN(8) attribute. Even more confusingly, the documentation for SLIST_ENTRY says that the 64-bit structure needs to be 16-byte aligned but says nothing about the 32-bit structure. So what are the memory alignment requirements for a 32-bit SLIST_ENTRY, 8 or 4?

    It's 8. No, 4. No wait, it's both.

    Officially, the alignment requirement is 8. Earlier versions of the header file did not stipulate 8-byte alignment, and changing the declaration would have resulted in existing structures which (inadvertently) misaligned the field changing size and layout when the new requirement was imposed. So the 32-bit structure was sort-of grandfathered in. You should still align it on 8-byte boundaries, but the header file doesn't enforce it to avoid breaking existing code.

    Fortunately, when the 64-bit version was introduced, the proper alignment directive was introduced right off the bat. How about that: sometimes Microsoft learns from its mistakes after all.

    Why are the alignment requirements greater than the natural word size? To avoid the ABA problem. A standard workaround for the ABA problem is to append additional information (a "tag") to the pointer so that when the value changes from B back to A, the tag ensures that the second A still looks different from the first one. Many CPU architectures have a "double-pointer-sized atomic compare-and-swap" instruction, and some of them have the additional requirement that the double-pointer needs to be on a double-pointer boundary (8 bytes for 32-bit pointers and 16 bytes for 64-bit pointers).

    "But wait, the double-pointer compare-and-swap is used on the SLIST_HEADER, not on the SLIST_ENTRY. Why does the SLIST_ENTRY need to be double-pointer aligned, too?"

    While it's true that many CPU architectures have a "double-pointer-sized atomic compare-and-swap" instruction, some support only a "pointer-sized atomic compare-and-swap". For example, the original AMD64 architecture did not have a CMPXCHG16B instruction; the largest data size for an atomic compare-and-swap was 8 bytes. As a result, the Slist functions need to pack a 64-bit pointer, a list depth, and tag information into a single 64-bit value. One of the tricks they used was imposing a memory alignment of 16 bytes. This freed up four bits in the pointer for use as a tag.

Page 1 of 3 (26 items) 123