• The Old New Thing

    When you think you found a problem with a function, make sure you're actually calling the function, episode 2

    • 16 Comments

    A customer reported that the Duplicate­Handle function was failing with ERROR_INVALID_HANDLE even though the handle being passed to it seemed legitimate:

      // Create the handle here
      m_Event =
        ::CreateEvent(NULL, FALSE/*bManualReset*/,
                      FALSE/*bInitialState*/, NULL/*lpName*/);
      ... error checking removed ...
    
    
    // Duplicate it here
    HRESULT MyClass::CopyTheHandle(HANDLE *pEvent)
    {
     HRESULT hr = S_OK;
     
     if (m_Event != NULL) {
      BOOL result = ::DuplicateHandle(
                    GetCurrentProcess(),
                    m_Event,
                    GetCurrentProcess(),
                    pEvent,
                    0,
                    FALSE,
                    DUPLICATE_SAME_ACCESS
                    );
      if (!result) {
        // always fails with ERROR_INVALID_HANDLE
        return HRESULT_FROM_WIN32(GetLastError());
      }
     } else {
      *pEvent = NULL;
     }
     
     return hr;
    }
    

    The handle in m_Event appears to be valid. It is non-null, and we can still set and reset it. But we can't duplicate it.

    Now, before claiming that a function doesn't work, you should check what you're passing to it and what it returns. The customer checked the m_Event parameter, but what about the other parameters? The function takes three handle parameters, after all, and they checked only one of them. According to the debugger, Duplicate­Handle was called with the parameters

    hSourceProcessHandle  = 0x0aa15b80
    hSourceHandle  = 0x00000ed8 m_Event, appears to be valid
    hTargetProcessHandle  = 0x0aa15b80
    lpTargetHandle  = 0x00b0d914
    dwDesiredAccess  = 0x00000000
    bInheritHandle  = 0x00000000
    dwOptions  = 0x00000002

    When this information was shared with the customer, they immediately saw the problem: The other two handle parameters come from the Get­Current­Process function, and that function was returning 0x0aa15b80 rather than the expected pseudo-handle (which is currently -1, but that is not contractual).

    The customer explained that their My­Class has a method with the name Get­Current­Process, and it was that method which was being called rather than the Win32 function Get­Current­Process. They left off the leading :: and ended up calling the wrong Get­Current­Process.
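
    A minimal sketch of how the mistake arises (the class and member here are hypothetical, not the customer's actual code):

    #include <windows.h>

    class MyClass
    {
    public:
     // An unrelated member that happens to share the name of the
     // Win32 function. (Hypothetical, for illustration only.)
     HANDLE GetCurrentProcess() { return m_SomeOtherHandle; }

     void Demonstrate()
     {
      // Unqualified lookup finds the member first, so this calls
      // MyClass::GetCurrentProcess.
      HANDLE h1 = GetCurrentProcess();

      // The leading :: forces the global Win32 function, which
      // returns the pseudo-handle (currently -1).
      HANDLE h2 = ::GetCurrentProcess();
     }

    private:
     HANDLE m_SomeOtherHandle = nullptr;
    };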

    By default, Visual Studio colors member functions and global functions the same, but you can change this in the Fonts and Colors options dialog. Under Show settings for, select Text Editor, and then under Display items you can customize the colors to use for various language elements. In particular, you can choose a special color for static and instance member functions.

    Or, as a matter of style, you could have a policy of not giving member functions the same name as global functions. (This has the bonus benefit of reducing false positives when grepping.)

    Bonus story: A different customer reported a problem with visual styles in the common tab control. After a few rounds of asking questions, coming up with theories, testing the theories, disproving the theories, the customer wrote back: "We figured out what was happening when we tried to step into the call to Create­Dialog­Indirect­ParamW. Someone else in our code base redefined all the dialog creation functions in an attempt to enforce a standard font on all of them, but in doing so, they effectively made our code no longer isolation aware, because in the overriding routines, they called Create­Dialog­Indirect­ParamW instead of Isolation­Aware­Create­Dialog­Indirect­ParamW. Thanks for all the help, and apologies for the false alarm."

  • The Old New Thing

    Please enjoy the new eco-friendly printers, now arguably less eco-friendly

    • 27 Comments

    Some years ago, the IT department replaced the printer/multifunction devices with new, reportedly eco-friendly models. One feature of the new devices is that when you send a job to the printer, it doesn't print out immediately. Printing doesn't begin until you go to the device and swipe your badge through the card reader. The theory here is that this cuts down on the number of forgotten or accidental printouts, where you send a job to a printer and forget to pick it up, or you click the Print button by mistake. If a job is not printed within a few days, it is automatically deleted.

    The old devices already supported secured printing, where the job doesn't come out of the printer until you go to the device and release the job. But with this change, secured printing is now mandatory. Of course, this means that even if you weren't printing something sensitive, you still have to stand there and wait for your document to print instead of having the job already completed and waiting for you.

    The new printing system also removes the need for job separator pages. Avoiding banner pages and eliminating forgotten print jobs are touted as the printer's primary eco-friendly features.

    Other functions provided by the devices are photocopying and scanning. With the old devices, you place your document on the glass or in the document hopper, push the Scan button, and the results are emailed to you. With the new devices, you place your document on the glass or in the document hopper, push the Scan button, and the results are emailed to you, plus a confirmation page is printed out.

    Really eco-friendly service there, printing out confirmation pages for every scanning job.

    The problem was fixed a few weeks later.

    Bonus chatter: Our fax machines also print confirmation pages, or at least they did the last time I used one, many years ago.

  • The Old New Thing

    How can I detect whether a keyboard is attached to the computer?

    • 24 Comments

    Today's Little Program tells you whether a keyboard is attached to the computer. The short answer is "Enumerate the raw input devices and see if any of them is a keyboard."

    Remember: Little Programs don't worry about silly things like race conditions.

    #include <windows.h>
    #include <iostream>
    #include <vector>
    #include <algorithm>
    
    bool IsKeyboardPresent()
    {
     UINT numDevices = 0;
     if (GetRawInputDeviceList(nullptr, &numDevices,
                               sizeof(RAWINPUTDEVICELIST)) != 0) {
      throw GetLastError();
     }

     // No devices at all: report no keyboard. (This also avoids
     // indexing into an empty vector below.)
     if (numDevices == 0) return false;

     std::vector<RAWINPUTDEVICELIST> devices(numDevices);

     if (GetRawInputDeviceList(devices.data(), &numDevices,
                               sizeof(RAWINPUTDEVICELIST)) == (UINT)-1) {
      throw GetLastError();
     }

     return std::find_if(devices.begin(), devices.end(),
        [](RAWINPUTDEVICELIST& device)
        { return device.dwType == RIM_TYPEKEYBOARD; }) != devices.end();
    }
    
    int __cdecl main(int, char**)
    {
     std::cout << IsKeyboardPresent() << std::endl;
     return 0;
    }
    

    There is a race condition in this code if the number of devices changes between the two calls to Get­Raw­Input­Device­List. I will leave you to fix it before incorporating this code into your program.
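
    As a hedged sketch of one possible fix (my own, not from the article): retry the snapshot until it succeeds, growing the buffer whenever the device count changes between the two calls. This reuses the includes from the program above.

    bool IsKeyboardPresentSafe()
    {
     UINT numDevices = 0;
     std::vector<RAWINPUTDEVICELIST> devices;
     for (;;) {
      if (GetRawInputDeviceList(nullptr, &numDevices,
                                sizeof(RAWINPUTDEVICELIST)) != 0) {
       throw GetLastError();
      }
      if (numDevices == 0) return false;
      devices.resize(numDevices);
      // If a device arrives between the two calls, the second call
      // fails with ERROR_INSUFFICIENT_BUFFER (and updates numDevices),
      // and we go around again with a bigger buffer.
      UINT result = GetRawInputDeviceList(devices.data(), &numDevices,
                                          sizeof(RAWINPUTDEVICELIST));
      if (result != (UINT)-1) {
       devices.resize(result); // devices may also have been removed
       break;
      }
      if (GetLastError() != ERROR_INSUFFICIENT_BUFFER) {
       throw GetLastError();
      }
     }
     return std::find_if(devices.begin(), devices.end(),
        [](RAWINPUTDEVICELIST& device)
        { return device.dwType == RIM_TYPEKEYBOARD; }) != devices.end();
    }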

  • The Old New Thing

    What did the Ignore button do in Windows 3.1 when an application encountered a general protection fault?

    • 37 Comments

    In Windows 3.0, when an application encountered a general protection fault, you got an error message that looked like this:

    Application error
    CONTOSO caused a General Protection Fault in
    module CONTOSO.EXE at 0002:2403
    Close

    In Windows 3.1, under the right conditions, you would get a second option:

    CONTOSO
    An error has occurred in your application.
    If you choose Ignore, you should save your work in a new file.
    If you choose Close, your application will terminate.
    Close
    Ignore

    Okay, we know what Close does. But what does Ignore do? And under what conditions will it appear?

    Roughly speaking, the Ignore option becomes available if

    • The fault is a general protection fault,
    • The faulting instruction is not in the kernel or the window manager,
    • The faulting instruction is one of the following, possibly with one or more prefix bytes:
      • Memory operations: op r, m; op m, r; or op m.
      • String memory operations: movs, stos, etc.
      • Selector load: lds, les, pop ds, pop es.

    If the conditions are met, then the Ignore option becomes available. If you choose Ignore, the kernel does the following:

    • If the faulting instruction is a selector load instruction, the destination selector register is set to zero.
    • If the faulting instruction is a pop instruction, the stack pointer is incremented by two.
    • The instruction pointer is advanced over the faulting instruction.
    • Execution is resumed.

    In other words, the kernel did the assembly language equivalent of ON ERROR RESUME NEXT.
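
    To make that recipe concrete, here is a hedged C++ sketch of the recovery logic; every type, and the decoder, are hypothetical stand-ins, since the real code was 16-bit assembly in the Windows 3.1 kernel:

    // Hypothetical sketch only. DecodedOp, Context16, and the decoder
    // are stand-ins invented for illustration.
    enum OpKind { MemoryOp, StringOp, SelectorLoad, NotIgnorable };

    struct DecodedOp
    {
     OpKind kind;
     int length;               // total length, including prefix bytes
     bool isPop;               // pop ds / pop es
     unsigned short *selector; // destination selector register, if any
    };

    struct Context16
    {
     unsigned short ip;        // instruction pointer at the fault
     unsigned short sp;        // stack pointer
    };

    // Decoder omitted; in this sketch it reports nothing ignorable.
    bool DecodeInstruction(const Context16&, DecodedOp& op)
    { op = {}; op.kind = NotIgnorable; return false; }

    bool TryIgnoreFault(Context16& ctx)
    {
     DecodedOp op;
     if (!DecodeInstruction(ctx, op) ||
         op.kind == NotIgnorable) return false; // Close is the only option
     if (op.kind == SelectorLoad) {
      *op.selector = 0;          // load a null selector instead
      if (op.isPop) ctx.sp += 2; // the pop still consumes its stack slot
     }
     ctx.ip += op.length;        // step over the faulting instruction
     return true;                // resume execution, fingers crossed
    }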

    Now, your reaction to this might be, "How could this possibly work? You are just randomly ignoring instructions!" But the strange thing is, this idea was so crazy it actually worked, or at least worked a lot of the time. You might have to hit Ignore a dozen times, but there's a good chance that eventually the bad values in the registers will get overwritten by good values (and it probably won't take long, because the 8086 has so few registers), and the program will continue seemingly normally.

    Totally crazy.

    Exercise: Why didn't the code have to know how to ignore jump instructions and conditional jump instructions?

    Bonus trivia: The developer who implemented this crazy feature was Don Corbitt, the same developer who wrote Dr. Watson.

  • The Old New Thing

    Why do I get ERROR_INVALID_HANDLE from GetModuleFileNameEx when I know the process handle is valid?

    • 18 Comments

    Consider the following program:

    #define UNICODE
    #define _UNICODE
    #include <windows.h>
    #include <psapi.h>
    #include <stdio.h> // horrors! mixing C and C++!
    
    int __cdecl wmain(int, wchar_t **)
    {
     STARTUPINFO si = { sizeof(si) };
     PROCESS_INFORMATION pi;
     wchar_t szBuf[MAX_PATH] = L"C:\\Windows\\System32\\notepad.exe";
    
     if (CreateProcess(szBuf, szBuf, NULL, NULL, FALSE,
                       CREATE_SUSPENDED,
                       NULL, NULL, &si, &pi)) {
      if (GetModuleFileNameEx(pi.hProcess, NULL, szBuf, ARRAYSIZE(szBuf))) {
       wprintf(L"Executable is %ls\n", szBuf);
      } else {
       wprintf(L"Failed to get module file name: %d\n", GetLastError());
      }
      TerminateProcess(pi.hProcess, 0);
      CloseHandle(pi.hProcess);
      CloseHandle(pi.hThread);
     } else {
      wprintf(L"Failed to create process: %d\n", GetLastError());
     }
    
     return 0;
    }
    

    This program prints

    Failed to get module file name: 6
    

    and error 6 is ERROR_INVALID_HANDLE. "How can the process handle be invalid? I just created the process!"

    Oh, the process handle is valid. The handle that isn't valid is the NULL.

    "But the documentation says that NULL is a valid value for the second parameter. It retrieves the path to the executable."

    In Windows, processes are initialized in-process. (In other words, processes are self-initializing.) The Create­Process function creates a process object, sets the initial state of that object, copies some information into the address space of the new process (like the command line parameters), and sets the instruction pointer to the process startup code inside ntdll.dll. From there, the startup code in ntdll.dll pulls the process up by its bootstraps. It creates the default heap. It loads the primary executable and the associated bookkeeping that says "Here is the module information for the primary executable, in case anybody asks." It identifies all the DLLs referenced by the primary executable, the DLLs referenced by those DLLs, and so on. It loads each of the DLLs in turn, creating the module information that says "Here is another module that this process loaded, in case anybody asks," and then it initializes the DLLs in the proper order. Once all the process bootstrapping is complete, ntdll.dll calls the executable entry point, and the program takes control.

    An interesting take-away from this is that modules are a user-mode concept. Kernel mode does not know about modules. All kernel mode sees is that somebody in user mode asked to map sections of a file into memory.

    Okay, so if the process is responsible for managing its modules, how do functions like Get­Module­File­Name­Ex work? They issue a bunch of Read­Process­Memory calls and manually parse the in-memory data structures of another process. Normally, this would be considered "undocumented reliance on internal data structures that can change at any time," and in fact those data structures do change quite often. But it's okay because the people who maintain the module loader (and therefore would be the ones who change the data structures) are also the people who maintain Get­Module­File­Name­Ex (so they know to update the parser to match the new data structures).

    With this background information, let's go back to the original question. Why is Get­Module­File­Name­Ex failing with ERROR_INVALID_HANDLE?

    Observe that the process was created suspended. This means that the process object has been created, the initialization parameters have been injected into the new process's address space, but no code in the process has run yet. In particular, the startup code inside ntdll.dll hasn't run. This means that the code to add a module information entry for the main executable hasn't run.

    Now we can connect the dots. Since the module information entry for the main executable hasn't been added to the module table, the call to Get­Module­File­Name­Ex is going to try to parse the module table from the suspended Notepad process, and it will see that the table is empty. Actually, it's worse than that. The module table hasn't been created yet. The function then reports, "There is no module table entry for NULL," and it tells you that the handle NULL is invalid.

    Functions like Get­Module­File­Name­Ex and Create­Tool­help­32­Snapshot are designed for diagnostic or debugging tools. There are naturally race conditions involved, because the process you are inspecting is certainly free to load or unload a module immediately after the call returns, at which point your information may be out of date. What's worse, the process you are inspecting may be in the middle of updating its module table, in which case the call may simply fail with a strange error like ERROR_PARTIAL_COPY. (Protecting the data structures with a critical section isn't good enough because critical sections do not cross processes, and the process doing the inspecting is going to be using Read­Process­Memory, which doesn't care about critical sections.)

    In the particular example above, the code could avoid the problem by using the Query­Full­Process­Image­Name function to get the path to the executable.
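
    A sketch of that fix, adapting the program above. Query­Full­Process­Image­Name doesn't consult the target's module table, so it works even on a suspended process:

    #define UNICODE
    #define _UNICODE
    #include <windows.h>
    #include <stdio.h>

    int __cdecl wmain(int, wchar_t **)
    {
     STARTUPINFO si = { sizeof(si) };
     PROCESS_INFORMATION pi;
     wchar_t szBuf[MAX_PATH] = L"C:\\Windows\\System32\\notepad.exe";

     if (CreateProcess(szBuf, szBuf, NULL, NULL, FALSE,
                       CREATE_SUSPENDED,
                       NULL, NULL, &si, &pi)) {
      DWORD cch = ARRAYSIZE(szBuf);
      // Second parameter 0 = Win32 path format.
      if (QueryFullProcessImageName(pi.hProcess, 0, szBuf, &cch)) {
       wprintf(L"Executable is %ls\n", szBuf);
      } else {
       wprintf(L"Failed to get image name: %d\n", GetLastError());
      }
      TerminateProcess(pi.hProcess, 0);
      CloseHandle(pi.hProcess);
      CloseHandle(pi.hThread);
     }
     return 0;
    }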

    Bonus chatter: The Create­Tool­help­32­Snapshot function extracts the information in a different way from Get­Module­File­Name­Ex. Rather than trying to parse the information via Read­Process­Memory, it injects a thread into the target process and runs code to extract the information from within the process, and then marshals the results back. I'm not sure whether this is more crazy than using Read­Process­Memory or less crazy.

    Second bonus chatter: A colleague of mine chose to describe this situation more directly. "Let's cut to the heart of the matter. These APIs don't really work by the normally-accepted definitions of 'work'." These snooping-around functions are best-effort, so use them in situations where best-effort is better than nothing. For example, if you have a diagnostic tool, you're probably happy that it gets information at all, even if it may sometimes be incomplete. (Debuggers don't use any of these APIs. Debuggers receive special events to notify them of modules as they are loaded and unloaded, and those notifications are generated by the loader itself, so they are reliable.)

    Exercise: Diagnose this customer's problem: "If we launch a process suspended, the Get­Module­Information function fails with ERROR_INVALID_HANDLE."

    #include <windows.h>
    #include <psapi.h>
    #include <stdio.h> // for wprintf
    #include <iostream>
    
    int __cdecl wmain(int, wchar_t **)
    {
     STARTUPINFO si = { sizeof(si) };
     PROCESS_INFORMATION pi;
     wchar_t szBuf[MAX_PATH] = L"C:\\Windows\\System32\\notepad.exe";
    
     if (CreateProcess(szBuf, szBuf, NULL, NULL, FALSE,
                       CREATE_SUSPENDED,
                       NULL, NULL, &si, &pi)) {
      DWORD addr;
      std::cin >> std::hex >> addr;
      MODULEINFO mi;
      if (GetModuleInformation(pi.hProcess, (HINSTANCE)addr,
                               &mi, sizeof(mi))) {
       wprintf(L"Got the module information\n");
      } else {
       wprintf(L"Failed to get module information: %d\n", GetLastError());
      }
      TerminateProcess(pi.hProcess, 0);
      CloseHandle(pi.hProcess);
      CloseHandle(pi.hThread);
     } else {
      wprintf(L"Failed to create process: %d\n", GetLastError());
     }
    
     return 0;
    }
    

    Run Process Explorer, then run this program. When the program asks for an address, enter the address that Process Explorer reports for the base address of the module.

  • The Old New Thing

    Why doesn't the Print command appear when I select 20 files and right-click?

    • 20 Comments

    This is explained in the MSDN documentation:

    When the number of items selected does not match the verb selection model or is greater than the default limits outlined in the following table, the verb fails to appear.

    Type of verb implementation   Document   Player
    Legacy                        15 items   100 items
    COM                           15 items   No limit

    The problem here is that users will select a large number of files, then accidentally Print all of them. This fires up 100 copies of Notepad or Photoshop or whatever, all of them racing to the printer. Most of the time, the user is then frantically trying to close 100 windows to stop the documents from printing, which is a problem because 100 new processes put a heavy load on the system, so it's slow to respond to all the frantic clicks. And even if a click manages to reach the printing application, that application is running so slowly due to disk I/O contention that it takes a long time to respond anyway.

    In a panic, the user pulls the plug on the computer.

    The limit of 15 documents for legacy verbs tries to limit the scope of the damage. You will get at most 15 new processes starting at once, which is still a lot, but is significantly more manageable than 100 processes.

    Player verbs and COM-based verbs have higher limits because they are typically all handled by a single program, so there's only one program that you need to close. (Although there is one popular player that still runs a separate process for each media file, so if you select 1000 music files, right-click, and select "Add to playlist", it runs 1000 copies of the program, which basically turns your computer into a space heater. An arbitrary limit of 100 was chosen to keep the damage under control.)

    If you want to raise the 15-document limit, you can adjust the Multiple­Invoke­Prompt­Minimum setting. Note that this setting is not contractual, so don't get too attached to it.
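
    For instance, here is a hedged sketch that raises the limit to 30. The key path below is the commonly cited location for this setting; treat it as an assumption and verify it on your version of Windows:

    #include <windows.h>

    int main()
    {
     // Hypothetical example: raise the limit from 15 to 30.
     DWORD minimum = 30;
     LONG result = RegSetKeyValueW(
         HKEY_CURRENT_USER,
         L"Software\\Microsoft\\Windows\\CurrentVersion\\Explorer",
         L"MultipleInvokePromptMinimum",
         REG_DWORD, &minimum, sizeof(minimum));
     return result == ERROR_SUCCESS ? 0 : 1;
    }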

  • The Old New Thing

    Hazy memories of the Windows 95 ship party

    • 19 Comments

    One of the moments from the Windows 95 ship party (20 years ago today) was when one of the team members drove his motorcycle through the halls, leaving burns in the carpet.

    The funny part of that story (beyond the fact that it happened) is that nobody can agree on who it was! I seem to recall that it was Todd, but another of my colleagues remembers that it was Dave, and yet another remembers that it was Ed. We all remember the carpet burns, but we all blame it on different people.

    As one of my colleagues noted, "I'm glad all of this happened before YouTube."

    Brad Silverberg, the vice president of the Personal Systems Division (as it was then known), recalled that "I had a lot of apologizing to do to Facilities [about all the shenanigans that took place that day], but it was worth it."

  • The Old New Thing

    Generating different types of timestamps from quite a long way away

    • 7 Comments

    Today's Little Program does the reverse of what we had last time. It takes a point in time and then generates timestamps in various formats.

    using System;
    
    class Program
    {
     static void TryFormat(string format, Func<long> func)
     {
      try {
       long l = func();
       if ((ulong)l > 0x00000000FFFFFFFF) {
           Console.WriteLine("{0} 0x{1:X16}", format, l);
       } else {
           Console.WriteLine("{0} 0x{1:X08}", format, l);
       }
      } catch (ArgumentException) {
       Console.WriteLine("{0} - invalid", format);
      }
     }
    

    Like last time, the Try­Format method executes the passed-in function inside a try/catch block. If the function executes successfully, then we print the result. There is a tiny bit of cleverness where we choose the output format depending on the number of bits in the result.

     static long DosDateTimeFromDateTime(DateTime value)
     {
      int result = ((value.Year - 1980) << 25) |
                   (value.Month << 21) |
                   (value.Day << 16) |
                   (value.Hour << 11) |
                   (value.Minute << 5) |
                   (value.Second >> 1);
      return (uint)result;
     }
    

    The Dos­Date­Time­From­Date­Time method converts the Date­Time into a 32-bit date/time stamp in MS-DOS format. This is not quite correct, because MS-DOS date/time stamps are in local time, but we are not converting the incoming Date­Time to local time; whether that matters for your situation is left for you to decide.

     public static void Main(string[] args)
     {
      int[] parts = new int[7];
      for (int i = 0; i < 7; i++) {
       parts[i] = args.Length > i ? int.Parse(args[i]) : 0;
      }
    
      DateTime value = new DateTime(parts[0], parts[1], parts[2],
                                    parts[3], parts[4], parts[5],
                                    parts[6], DateTimeKind.Utc);
    
      Console.WriteLine("Timestamp {0} UTC", value);
    
      TryFormat("Unix time",
        () => value.ToFileTimeUtc() / 10000000 - 11644473600);
      TryFormat("UTC FILETIME",
        () => value.ToFileTimeUtc());
      TryFormat("Binary DateTime",
        () => value.ToBinary());
      TryFormat("MS-DOS Date/Time",
        () => DosDateTimeFromDateTime(value));
      TryFormat("OLE Date/Time",
        () => BitConverter.DoubleToInt64Bits(value.ToOADate()));
     }
    }
    

    The parameters on the command line are the year, month, day, hour, minute, second, and millisecond; any omitted parameters are taken as zero. We create a UTC Date­Time from it, and then try to convert that Date­Time into the other formats.

    [Raymond is currently away; this message was pre-recorded.]

  • The Old New Thing

    On the various ways of creating large files in NTFS

    • 12 Comments

    For whatever reason, you may want to create a large file.

    The most basic way of doing this is to use Set­File­Pointer to move the file pointer to a position far beyond the current end of the file, then use Set­End­Of­File to extend the file to that size. The file has disk space assigned to it, but NTFS doesn't actually fill the bytes with zero yet. It does that lazily, on demand. If you intend to write to the file sequentially, then that lazy extension will not typically be noticeable, because it can be combined with the normal writing process (and possibly even optimized out). On the other hand, if you jump ahead and write to a point far past the previous high water mark, you may find that your single-byte write takes a very long time, because NTFS must first zero out everything between the old high water mark and the point you are writing to.
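
    Here's a minimal sketch of that basic approach; the file name and size are arbitrary, and error handling is abbreviated:

    #include <windows.h>

    int main()
    {
     HANDLE h = CreateFileW(L"bigfile.bin", GENERIC_WRITE, 0, nullptr,
                            CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, nullptr);
     if (h == INVALID_HANDLE_VALUE) return 1;

     LARGE_INTEGER size;
     size.QuadPart = 4LL * 1024 * 1024 * 1024; // four gigabytes
     // SetFilePointerEx is the 64-bit-friendly variant of SetFilePointer.
     BOOL ok = SetFilePointerEx(h, size, nullptr, FILE_BEGIN) &&
               SetEndOfFile(h);
     CloseHandle(h);
     return ok ? 0 : 1;
    }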

    Another option is to make the file sparse. I refer you to the remarks I made some time ago on the pros and cons of this technique. One thing to note is that when a file is sparse, the virtual-zero parts do not have physical disk space assigned to them. Consequently, it's possible for a Write­File into a previously virtual-zero section of the file to fail with an ERROR_DISK_QUOTA_EXCEEDED error.

    Yet another option is to use the Set­File­Valid­Data function. This tells NTFS to go grab some physical disk space, assign it to the file, and to set the "I already zero-initialized all the bytes up to this point" value to the file size. This means that the bytes in the file will contain uninitialized garbage, and it also poses a security risk, because somebody can stumble across data that used to belong to another user. That's why Set­File­Valid­Data requires administrator privileges.
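
    A hedged sketch of that option, assuming the handle was opened for GENERIC_WRITE and that SeManageVolumePrivilege has already been enabled on the process token (the Adjust­Token­Privileges dance is omitted):

    #include <windows.h>

    // Assumes SeManageVolumePrivilege is already enabled on the token;
    // without it, SetFileValidData fails with ERROR_PRIVILEGE_NOT_HELD.
    bool ExtendWithoutZeroing(HANDLE h, LONGLONG size)
    {
     LARGE_INTEGER li;
     li.QuadPart = size;
     if (!SetFilePointerEx(h, li, nullptr, FILE_BEGIN) ||
         !SetEndOfFile(h)) {
      return false;
     }
     // Declare everything up to 'size' as already-initialized. The
     // bytes will contain whatever was previously on the disk.
     return SetFileValidData(h, size) != FALSE;
    }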

    From the command line, you can use the fsutil file setvaliddata command to accomplish the same thing.

    Bonus chatter: The documentation for Set­End­Of­File says, "If the file is extended, the contents of the file between the old end of the file and the new end of the file are not defined." But I just said that it will be filled with zero on demand. Who is right?

    The formal definition of the Set­End­Of­File function is that the extended content is undefined. However, NTFS will ensure that you never see anybody else's leftover data, for security reasons. (Assuming you're not intentionally bypassing the security by using Set­File­Valid­Data.)

    Other file systems, however, may choose to behave differently.

    For example, in Windows 95, the extended content is not zeroed out. You will get random uninitialized junk that happens to be whatever was lying around on the disk at the time.

    If you know that the file system you are using is being hosted on a system running some version of Windows NT (and that the authors of the file system passed their Common Criteria security review), then you can assume that the extra bytes are zero. But if there's a chance that the file is on a computer running Windows for Workgroups or Windows 95, then you need to worry about those extra bytes. (And if the file system is hosted on a computer running a non-Windows operating system, then you'll have to check the documentation for that operating system to see whether it guarantees zeroes when files are extended.)

    [Raymond is currently away; this message was pre-recorded.]

  • The Old New Thing

    Why is my x64 process getting heap address above 4GB on Windows 8?

    • 43 Comments

    A customer noticed that when they ran their program on Windows 8, memory allocations were being returned above the 4GB boundary. They included a simple test program:

    #include <stdio.h>
    #include <stdlib.h>
    
    int main(int argc, char** argv)
    {
        void *testbuffer = malloc(256);
        printf("Allocated address = %p\n", testbuffer);
        return 0;
    }
    

    When run on Windows 7, the function prints addresses like 0000000000179B00, but on Windows 8, it prints addresses like 00000086E60EA410.

    The customer added that they care about this difference because pointers above 4GB will be corrupted when the value is truncated to a 32-bit value. As part of their experimentation, they found that they could force pointers above 4GB to occur even on Windows 7 by allocating very large chunks of memory, but on Windows 8, it's happening right off the bat.
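
    To illustrate the failure mode they described (hypothetical code, not the customer's): once an allocation lands above 4GB, a 32-bit round trip no longer recovers the original pointer.

    #include <stdio.h>
    #include <stdlib.h>
    #include <stdint.h>

    int main(void)
    {
     void *p = malloc(256);
     uint32_t truncated = (uint32_t)(uintptr_t)p; // drops the upper 32 bits
     void *q = (void *)(uintptr_t)truncated;      // q != p above 4GB
     printf("original %p, after round trip %p\n", p, q);
     free(p);
     return 0;
    }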

    The memory management team explained that this is expected for applications linked with the /HIGH­ENTROPY­VA flag, which the Visual Studio linker enables by default for 64-bit programs.

    High-entropy virtual address space is more commonly known as Address Space Layout Randomization (ASLR). ASLR is a feature that makes addresses in your program less predictable, which significantly improves its resilience to many categories of security attacks. Windows 8 expands the scope of ASLR beyond just the code pages in your process so that it also randomizes where the heap goes.

    The customer accepted that answer, and that was the end of the conversation, but there was something in this exchange that bothered me: The bit about truncating to a 32-bit value.

    Why are they truncating 64-bit pointers to 32-bit values? That's the bug right there. And they even admit that they can trigger the bug by forcing the program to allocate a lot of memory. They need to stop truncating pointers! Once they do that, all the problems will go away, and it won't matter where the memory gets allocated.

    If there is some fundamental reason that they have to truncate pointers to 32-bit values, then they should link with /LARGEADDRESSAWARE:NO so that the process will be given an address space of only 2GB, and then they can truncate their pointers all they want.

    (Of course, if you're going to do that, then you probably should just compile the program as a 32-bit program, since you're not really gaining much from being a 64-bit program any more.)
