• The Old New Thing

    That doesn't sound like South Frisian to me

    • 4 Comments

    I dreamed that I was back in college taking a course in South Frisian, but I suspected something was up because the words didn't sound Germanic at all, and we were taught the words to a Christmas carol as Nom Yom Hear What I Hear?

    Also, because the course was taught by known prevaricator/exaggerator Robert Irvine.

  • The Old New Thing

    On the various ways of creating large files in NTFS

    • 12 Comments

    For whatever reason, you may want to create a large file.

    The most basic way of doing this is to use Set­File­Pointer to move the pointer to a large position into the file (that doesn't exist yet), then use Set­End­Of­File to extend the file to that size. This file has disk space assigned to it, but NTFS doesn't actually fill the bytes with zero yet. It will do that lazily on demand. If you intend to write to the file sequentially, then that lazy extension will not typically be noticeable because it can be combined with the normal writing process (and possibly even optimized out). On the other hand, if you jump ahead and write to a point far past the previous high water mark, you may find that your single-byte write lasts forever.

    Another option is to make the file sparse. I refer you to the remarks I made some time ago on the pros and cons of this technique. One thing to note is that when a file is sparse, the virtual-zero parts do not have physical disk space assigned to them. Consequently, it's possible for a Write­File into a previously virtual-zero section of the file to fail with an ERROR_DISK_QUOTA_EXCEEDED error.

    Yet another option is to use the Set­File­Valid­Data function. This tells NTFS to go grab some physical disk space, assign it to the file, and set the "I already zero-initialized all the bytes up to this point" value to the file size. This means that the bytes in the file will contain uninitialized garbage, and it also poses a security risk, because somebody can stumble across data that used to belong to another user. That's why Set­File­Valid­Data requires the SE_MANAGE_VOLUME_NAME privilege, which by default is granted only to administrators.

    From the command line, you can use the fsutil file setvaliddata command to accomplish the same thing.

    Bonus chatter: The documentation for Set­End­Of­File says, "If the file is extended, the contents of the file between the old end of the file and the new end of the file are not defined." But I just said that it will be filled with zero on demand. Who is right?

    The formal definition of the Set­End­Of­File function is that the extended content is undefined. However, NTFS will ensure that you never see anybody else's leftover data, for security reasons. (Assuming you're not intentionally bypassing the security by using Set­File­Valid­Data.)

    Other file systems, however, may choose to behave differently.

    For example, in Windows 95, the extended content is not zeroed out. You will get random uninitialized junk that happens to be whatever was lying around on the disk at the time.

    If you know that the file system you are using is being hosted on a system running some version of Windows NT (and that the authors of the file system passed their Common Criteria security review), then you can assume that the extra bytes are zero. But if there's a chance that the file is on a computer running Windows for Workgroups or Windows 95, then you need to worry about those extra bytes. (And if the file system is hosted on a computer running a non-Windows operating system, then you'll have to check the documentation for that operating system to see whether it guarantees zeroes when files are extended.)

    [Raymond is currently away; this message was pre-recorded.]

  • The Old New Thing

    The SuspendThread function suspends a thread, but it does so asynchronously

    • 22 Comments

    Prologue: Why you should never suspend a thread.

    Okay, so a colleague decided to ignore that advice because he was running some experiments with thread safety and interlocked operations, and suspending a thread was a convenient way to open up race windows.

    While running these experiments, he observed some strange behavior.

    #include <windows.h>
    #include <stdio.h>
    #include <stdlib.h>
    
    LONG lValue;
    
    DWORD CALLBACK IncrementerThread(void *)
    {
     while (1) {
      InterlockedIncrement(&lValue);
     }
     return 0;
    }
    
    // This is just a test app, so we will abort() if anything
    // happens we don't like.
    
    int __cdecl main(int, char **)
    {
     DWORD id;
     HANDLE thread = CreateThread(NULL, 0, IncrementerThread, NULL, 0, &id);
     if (thread == NULL) abort();
    
     while (1) {
      if (SuspendThread(thread) == (DWORD)-1) abort();
    
      if (InterlockedOr(&lValue, 0) != InterlockedOr(&lValue, 0)) {
       printf("Huh? The variable lValue was modified by a suspended thread?\n");
      }
    
      ResumeThread(thread);
     }
     return 0;
    }
    

    The strange thing is that the "Huh?" message was being printed. How can a suspended thread modify a variable? Is there some way that Interlocked­Increment can start incrementing a variable, then get suspended, and somehow finish the increment later?

    The answer is simpler than that. The Suspend­Thread function tells the scheduler to suspend the thread but does not wait for an acknowledgment from the scheduler that the suspension has actually occurred. This is sort of alluded to in the documentation for Suspend­Thread which says

    This function is primarily designed for use by debuggers. It is not intended to be used for thread synchronization

    You are not supposed to use Suspend­Thread to synchronize two threads because there is no actual synchronization guarantee. What is happening is that Suspend­Thread signals the scheduler to suspend the thread and returns immediately. If the scheduler is busy doing something else, it may not be able to handle the suspend request immediately, so the thread being suspended gets to run on borrowed time until the scheduler gets around to processing the suspend request, at which point it actually gets suspended.

    If you want to make sure the thread really is suspended, you need to perform a synchronous operation that is dependent on the fact that the thread is suspended. This forces the suspend request to be processed since it is a prerequisite for your operation, and since your operation is synchronous, you know that by the time it returns, the suspend has definitely occurred.

    The traditional way of doing this is to call Get­Thread­Context, since this requires the kernel to read from the context of the suspended thread, which has as a prerequisite that the context be saved in the first place, which has as a prerequisite that the thread be suspended.

  • The Old New Thing

    File version information does not appear in the property sheet for some files

    • 26 Comments

    A customer reported that file version information does not appear on the Details page of the property sheet which appears when you right-click the file and select Properties. They reported that the problem began in Windows 7.

    The reason that the file version information was not appearing is that the file's extension was .xyz. Older versions of Windows attempted to extract file version information for all files regardless of type. I believe it was Windows Vista that changed this behavior and extracted version information only for known file types for Win32 modules, specifically .cpl, .dll, .exe, .ocx, .rll, and .sys. If the file's extension is not on the list above, then the shell will not sniff for version information.
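    To make the rule concrete, here is a minimal sketch of an extension check along the lines described above. It is only a model of the behavior, not the shell's actual code: the helper name `HasModuleExtension` is hypothetical, and the sketch looks only at the built-in list, ignoring any additional property handlers registered in the registry.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cctype>
#include <string>

// The extensions listed above as known Win32 module types.
constexpr std::array<const char*, 6> kModuleExtensions = {
    ".cpl", ".dll", ".exe", ".ocx", ".rll", ".sys"
};

// Hypothetical helper: returns true if the file name ends in one of the
// listed extensions. The comparison is case-insensitive.
bool HasModuleExtension(const std::string& fileName)
{
    const std::string::size_type dot = fileName.find_last_of('.');
    if (dot == std::string::npos) return false; // no extension at all

    // Lowercase the extension (including the leading dot).
    std::string extension = fileName.substr(dot);
    std::transform(extension.begin(), extension.end(), extension.begin(),
                   [](unsigned char c) {
                       return static_cast<char>(std::tolower(c));
                   });

    return std::any_of(kModuleExtensions.begin(), kModuleExtensions.end(),
                       [&](const char* known) { return extension == known; });
}
```

    Under this model, a file named `shell32.DLL` qualifies for version extraction, while `data.xyz` does not.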

    If you want to register a file type as eligible for file version extraction, you can add the following registry key:

    HKEY_LOCAL_MACHINE
     \Software
      \Microsoft
        \Windows
          \CurrentVersion
            \PropertySystem
              \PropertyHandlers
                \.XYZ
                 (Default) = REG_SZ:"{66742402-F9B9-11D1-A202-0000F81FEDEE}"
    

    (Thanks in advance for complaining about this change in behavior. This always happens whenever I post in the Tips/Support category about how to deal with a bad situation. Maybe I should stop trying to explain how to deal with bad situations.)

  • The Old New Thing

    Why does the access violation error message put the operation in quotation marks? Is it some sort of euphemism?

    • 24 Comments

    When an application crashes with an access violation, the error message says something like

    The instruction at "XX" referenced memory at "YY". The memory could not be "read".

    Why is the operation in quotation marks? Is this some sort of euphemism?

    The odd phrasing is a consequence of globalization. The operation name is a verb in the infinitive ("read", "write"), but depending on how the containing message is localized, it may need to take a different form. Since the kernel doesn't understand grammar, it just puts the words in quotation marks to avoid having to learn every language on the planet. Imagine if it tried:

    The memory could not be readed.

    The kernel tried to form the passive, which is normally done in English by adding "–ed" to the end of the verb. Too bad "read" and "write" are irregular verbs!

    The more conventional solution for this type of problem is to create a separate error message for each variant, rather than building sentences at runtime, so that the text can be translated independently.

    The access violation error message is in a pickle, though, because the underlying status code is STATUS_ACCESS_VIOLATION, and that message contains three insertions, one for the instruction address, one for the address being accessed, and one for the operation. If there were three different status codes, like STATUS_ACCESS_VIOLATION_READ, STATUS_ACCESS_VIOLATION_WRITE, and STATUS_ACCESS_VIOLATION_EXECUTE, then a separate string could be created for each. But that's not how the status codes folks decided to do things, and the translation team was stuck having to use the ugly quotation marks.
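    The difference between the two approaches can be sketched roughly as follows. This is an illustration only, not the kernel's message-table code: the `Operation` enum, both function names, and the per-variant sentences are hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical operation codes, for illustration only.
enum class Operation { Read, Write, Execute };

// Approach 1: a single message with an insertion for the operation word.
// The code cannot inflect the word for the target language, so the word
// is dropped in verbatim, inside quotation marks.
std::string FormatWithInsertion(const std::string& operationWord)
{
    return "The memory could not be \"" + operationWord + "\".";
}

// Approach 2: one complete sentence per variant. Each sentence can be
// translated independently, so no quotation marks are needed.
std::string FormatPerVariant(Operation op)
{
    switch (op) {
    case Operation::Read:    return "The memory could not be read.";
    case Operation::Write:   return "The memory could not be written.";
    case Operation::Execute: return "The memory could not be executed.";
    }
    return "The memory could not be accessed."; // unreachable fallback
}
```

    The second approach needs one string per operation, which is exactly what a single status code with three insertions does not provide.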

  • The Old New Thing

    What does INIT_ONCE_CTX_RESERVED_BITS mean?

    • 11 Comments

    Windows Vista adds the One-Time Initialization family of functions which address a common coding pattern: I want a specific chunk of code to run exactly once, even in the face of multiple calls from different threads. There are many implementations of this pattern, such as the infamous double-checked lock. The double-checked lock is very easy to get wrong, due to memory ordering and race conditions, so the kernel folks decided to write it for you.

    The straightforward way of using a one-time-initialization object is to have it protect the initialization of some other object. For example, you might have it protect a static object:

    INIT_ONCE GizmoInitOnce = INIT_ONCE_STATIC_INIT;
    Gizmo ProtectedGizmo;
    
    BOOL CALLBACK InitGizmoOnce(
        PINIT_ONCE InitOnce,
        PVOID Parameter,
        PVOID *Context)
    {
        Gizmo *pGizmo = reinterpret_cast<Gizmo*>(Parameter);
        pGizmo->Initialize();
        return TRUE;
    }
    
    SomeFunction(...)
    {
        // Initialize ProtectedGizmo if not already initialized
        InitOnceExecuteOnce(&GizmoInitOnce,
                            InitGizmoOnce,
                            &ProtectedGizmo,
                            NULL);
    
        // At this point, ProtectedGizmo has been initialized
        ProtectedGizmo.Something();
        ...
    }
    

    Or you might have it protect a dynamic object:

    class Widget
    {
        Widget()
        {
            InitOnceInitialize(&m_InitOnce);
        }
    
        void Initialize();
    
        ...
    
        static BOOL CALLBACK InitWidgetOnce(
            PINIT_ONCE InitOnce,
            PVOID Parameter,
            PVOID *Context)
        {
            Widget *pWidget = reinterpret_cast<Widget*>(Parameter);
            pWidget->Initialize();
            return TRUE;
        }
    
        SomeMethod(...)
        {
            // Initialize ourselves if not already initialized
            InitOnceExecuteOnce(&m_InitOnce,
                                InitWidgetOnce,
                                this,
                                NULL);
    
            // At this point, we have been initialized
            ... some other stuff ...
        }
    }
    

    But it so happens that you can also have the INIT_ONCE object protect itself.

    You see, once the INIT_ONCE object has entered the "initialization complete" state, the one-time initialization code only needs a few bits of state. The other bits are unused, so the kernel folks figured, "Well, since we're not using them, maybe the application wants to use them."

    That's where INIT_ONCE_CTX_RESERVED_BITS comes in. The INIT_ONCE_CTX_RESERVED_BITS value is the number of bits that the one-time initialization code uses after initialization is complete; the other bits are free for you to use yourself. The value of INIT_ONCE_CTX_RESERVED_BITS is 2, which means that you can store any value that's a multiple of 4. If it's a pointer, then the pointer must be DWORD-aligned or better. This requirement is usually easy to meet because heap-allocated objects satisfy it, and the pointer you want to store is usually a pointer to a heap-allocated object. As noted some time ago, kernel object handles are also multiples of four, so those can also be safely stored inside the INIT_ONCE object. (On the other hand, USER and GDI handles are not guaranteed to be multiples of four, so you cannot use this trick to store those types of handles.)
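    The "multiple of 4" rule is just a check that the low two bits are clear. The sketch below shows that check. The constant `kReservedBits` mirrors the value of INIT_ONCE_CTX_RESERVED_BITS given above but is defined locally so that the code compiles without `<windows.h>`; the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the value of INIT_ONCE_CTX_RESERVED_BITS stated above.
// Defined locally (an assumption of this sketch) to stay portable.
constexpr unsigned kReservedBits = 2;

// Mask covering the low bits that the one-time initialization code keeps
// for itself: (1 << 2) - 1 == 3.
constexpr std::uintptr_t kReservedMask =
    (std::uintptr_t{1} << kReservedBits) - 1;

// Returns true if the value leaves the reserved bits clear, which is the
// same as saying it is a multiple of 4.
constexpr bool IsContextValue(std::uintptr_t value)
{
    return (value & kReservedMask) == 0;
}

// Same check for a pointer: true if it is DWORD-aligned or better.
bool IsContextPointer(const void* pointer)
{
    return IsContextValue(reinterpret_cast<std::uintptr_t>(pointer));
}
```

    A pointer to an `int` or to a heap block passes the check; a pointer into the middle of a string generally does not.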

    Here's an example. First, the code which uses the traditional method of having the INIT_ONCE structure protect another variable:

    // using the static object pattern for simplicity
    
    INIT_ONCE PathInitOnce = INIT_ONCE_STATIC_INIT;
    LPWSTR PathToDatabase = NULL;
    
    BOOL CALLBACK InitPathOnce(
        PINIT_ONCE InitOnce,
        PVOID Parameter,
        PVOID *Context)
    {
        LPWSTR Path = (LPWSTR)LocalAlloc(LMEM_FIXED, ...);
        if (Path == NULL) return FALSE;
        ... get the path in Path...
        PathToDatabase = Path;
        return TRUE;
    }
    
    SomeFunction(...)
    {
        // Get the database path (initializing if necessary)
        if (!InitOnceExecuteOnce(&PathInitOnce,
                                 InitPathOnce,
                                 NULL,
                                 NULL)) {
            return FALSE; // couldn't get the path for some reason
        }
    
        // The "PathToDatabase" variable now contains the path
        // computed by InitPathOnce.
    
        OtherFunction(PathToDatabase);
        ...
    }
    

    Since the object being protected is pointer-sized and satisfies the necessary alignment constraints, we can merge it into the INIT_ONCE structure.

    INIT_ONCE PathInitOnce = INIT_ONCE_STATIC_INIT;
    
    BOOL CALLBACK InitPathOnce(
        PINIT_ONCE InitOnce,
        PVOID Parameter,
        PVOID *Context)
    {
        LPWSTR Path = (LPWSTR)LocalAlloc(LMEM_FIXED, ...);
        if (Path == NULL) return FALSE;
        ... get the path in Path...
        *Context = Path;
        return TRUE;
    }
    
    SomeFunction(...)
    {
        LPWSTR PathToDatabase;
        // Get the database path (initializing if necessary)
        if (!InitOnceExecuteOnce(&PathInitOnce,
                                 InitPathOnce,
                                 NULL,
                                 (PVOID*)&PathToDatabase)) {
            return FALSE; // couldn't get the path for some reason
        }
    
        // The "PathToDatabase" variable now contains the path
        // computed by InitPathOnce.
    
        OtherFunction(PathToDatabase);
        ...
    }
    

    This may seem like a bunch of extra work to save four bytes (or eight bytes on 64-bit Windows), but if you use the asynchronous initialization model, then you have no choice but to use context-based initialization, as we learned when we tried to write our own lock-free one-time initialization code.
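    To see why the asynchronous model pushes you toward storing the result in the object itself, here is a portable sketch of the general idea using `std::atomic` instead of the INIT_ONCE functions. It is not how the Windows implementation works internally; the slot, the counter, the function name, and the path string are all hypothetical. Several threads may build a candidate at the same time, and the first one to publish wins. The single pointer-sized slot therefore has to carry both "are we done?" and "what is the result?".

```cpp
#include <atomic>
#include <cassert>
#include <string>

// Hypothetical stand-in for the INIT_ONCE object: one pointer-sized slot
// that stays null until initialization has completed.
std::atomic<std::wstring*> g_pathSlot{nullptr};

// Counts candidate objects created, purely for illustration.
std::atomic<int> g_candidatesCreated{0};

const std::wstring& GetDatabasePath()
{
    // Fast path: a result has already been published.
    std::wstring* path = g_pathSlot.load(std::memory_order_acquire);
    if (path != nullptr) return *path;

    // Slow path: build a candidate without holding any lock.
    // Several threads may get here at the same time.
    std::wstring* candidate =
        new std::wstring(L"C:\\hypothetical\\database.db");
    ++g_candidatesCreated;

    // Try to publish the candidate. Only the first thread succeeds.
    std::wstring* expected = nullptr;
    if (g_pathSlot.compare_exchange_strong(expected, candidate,
                                           std::memory_order_acq_rel,
                                           std::memory_order_acquire)) {
        return *candidate;
    }

    // Lost the race: throw away our candidate and use the winner's,
    // which compare_exchange_strong stored into 'expected'.
    delete candidate;
    return *expected;
}
```

    The winning object is intentionally never freed in this sketch; it lives for the rest of the process, much like the path allocated by InitPathOnce above.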

  • The Old New Thing

    Foiled by my withered hand

    • 16 Comments

    A few years ago, some email was sent out to the product team asking for a volunteer hand model to demonstrate how to open the Windows Vista box. Alas, I withdrew myself from consideration due to my withered hand.

  • The Old New Thing

    iPhone pricing as economic experiment

    • 42 Comments

    Back in 2005, Slate's Tim Harford wondered why Microsoft didn't raise the introductory price of Xbox 360 game consoles. With the price set at $300, lines were long and shortages were many. Harford's readers came up with their own theories for resisting the laws of supply and demand and holding to a fixed price.

    The Xbox 360 is hardly unique in this respect. When there's a hot product, manufacturers hold to the original price and let the lines grow, the shortages fester, and the customers get more frustrated. Think Tickle Me Elmo or Cabbage Patch Kids. They do this even though, from an economic-theoretical standpoint, a product that has sold out with unmet demand is a product whose price was set too low.

    With the iPhone, Apple unwittingly ran the experiment that Harford proposed. There were lines, but by some reports, the lines weren't all that bad. After the initial demand subsided, Apple did what the economists say they should have done: They lowered the price. And the people who bought the phones at the higher price complained (forcing Apple to offer a store credit) and one of them even sued. Slate's Daniel Gross opines on the lessons learned.

  • The Old New Thing

    Why don't music files show up in my Recent Items list?

    • 19 Comments

    If you double-click a music file, it doesn't show up in your Recent Items list. What's so special about music files?

    The technical reason is that the file types are registered with the FTA_No­Recent­Docs flag, which means that they don't show up in the Recent Items list (formerly known as Recent Documents), and they don't show up in the Recent or Frequent section of Windows Media Player's Jump List.

    Okay, fine, but that's like answering "Why is there a door here?" with "Because the blueprints said that there should be a door there." You really want to know why the architect decided to put a door there.

    The reason why music files are not placed in the Recent Items list is that individual songs are not that interesting there. If you spend an hour listening to music, that will fill up your Recent Items list and push out everything else. The list ends up simply telling you about the thing you just finished doing. Not all that useful.

    Therefore, Windows Media Player excludes individual music files from the Recent Items list. Instead, it puts artists and albums in the Jump List, higher-level information that they figure is more interesting than just a list of the songs you just heard.

  • The Old New Thing

    Even the trees are falling for the media's lies

    • 2 Comments

    The White House is doing some renovating of its own (NYT, free registration required), in order to refurbish a section of the lawn that is used by television reporters so they can stand there with a nice picture of the White House behind them.

    But that's not why I found this story interesting.

    Go about halfway down the article: "The light and heat from the television lights, sometimes 14 hours a day, were confusing the trees", causing them to sprout leaves in winter and flower out of season.

    Even the trees are falling for the media's lies.
