March, 2013

  • The Old New Thing

    2013 Q1 link clearance: Microsoft blogger edition

    • 12 Comments

    It's that time again: Linking to other Microsoft bloggers.

  • The Old New Thing

    How do I convert a method name to a method index for the purpose of INTERFACEINFO?

    • 7 Comments

    The IMessage­Filter::Handle­Incoming­Call method describes the incoming call by means of an INTERFACE­INFO structure:

    typedef struct tagINTERFACEINFO { 
      LPUNKNOWN pUnk; 
      IID iid; 
      WORD wMethod; 
    } INTERFACEINFO, *LPINTERFACEINFO;
    

    The wMethod is a zero-based index of the method within the interface. For example, IUnknown::Query­Interface has index zero, IUnknown::Add­Ref has index one, and IUnknown::Release has index two.

    If you want to filter on a method in an interface, you need to know its index. One way of doing this would be to sit and count the methods, but this is error-prone, especially if the interface is still under active development and is not yet set in stone.

    C to the rescue.

    The IDL compiler spits out a C-compatible structure for the virtual function table, and you can use that structure to derive the method indices. For example:

    #if defined(__cplusplus) && !defined(CINTERFACE)
        ...
    #else   /* C style interface */
        typedef struct IPersistStreamVtbl
        {
            BEGIN_INTERFACE
    
            HRESULT ( STDMETHODCALLTYPE *QueryInterface )(
                __RPC__in IPersistStream * This,
                /* [in] */ __RPC__in REFIID riid,
                /* [annotation][iid_is][out] */
                _COM_Outptr_  void **ppvObject);
    
            ULONG ( STDMETHODCALLTYPE *AddRef )(
                __RPC__in IPersistStream * This);
    
            ULONG ( STDMETHODCALLTYPE *Release )(
                __RPC__in IPersistStream * This);
    
            HRESULT ( STDMETHODCALLTYPE *GetClassID )(
                __RPC__in IPersistStream * This,
                /* [out] */ __RPC__out CLSID *pClassID);
    
            HRESULT ( STDMETHODCALLTYPE *IsDirty )(
                __RPC__in IPersistStream * This);
    
            HRESULT ( STDMETHODCALLTYPE *Load )(
                __RPC__in IPersistStream * This,
                /* [unique][in] */ __RPC__in_opt IStream *pStm);
    
            HRESULT ( STDMETHODCALLTYPE *Save )(
                __RPC__in IPersistStream * This,
                /* [unique][in] */ __RPC__in_opt IStream *pStm,
                /* [in] */ BOOL fClearDirty);
    
            HRESULT ( STDMETHODCALLTYPE *GetSizeMax )(
                __RPC__in IPersistStream * This,
                /* [out] */ __RPC__out ULARGE_INTEGER *pcbSize);
    
            END_INTERFACE
        } IPersistStreamVtbl;
        ...
    #endif  /* C style interface */
    

    (You get roughly the same thing if you use the DECLARE_INTERFACE macros.)

    After we remove the distractions, the structure is just

        typedef struct IPersistStreamVtbl
        {
            BEGIN_INTERFACE
            HRESULT (*QueryInterface)(...);
            ULONG (*AddRef)(...);
            ULONG (*Release)(...);
            HRESULT (*GetClassID)(...);
            HRESULT (*IsDirty)(...);
            HRESULT (*Load)(...);
            HRESULT (*Save)(...);
            HRESULT (*GetSizeMax)(...);
            END_INTERFACE
        } IPersistStreamVtbl;
    

    From this, we can write a macro which extracts the method index:

    // If your compiler supports offsetof, then you can use that
    // instead of FIELD_OFFSET.
    #define METHOD_OFFSET(itf, method) FIELD_OFFSET(itf##Vtbl, method)
    
    #define METHOD_INDEX(itf, method) \
        ((METHOD_OFFSET(itf, method) - \
          METHOD_OFFSET(itf, QueryInterface)) / sizeof(FARPROC))
    

    The macro works by looking at the position of the method in the vtable and calculating its index relative to Query­Interface, which we know has index zero for all IUnknown-derived COM interfaces.

    These macros assume that the size of a pointer-to-function is the same regardless of the prototype, but this assumption is safe to make because it is required by the COM ABI.

    Observe that in order to get the C-style interfaces, you must define the CINTERFACE macro before including the header file. (And observe that the C-style interfaces are not available in C++; you must do this in C.)

    If the bulk of your program is in C++, you can slip in a single C file to extract the method indices and expose them to the C++ side either through global variables or short functions. Depending on how fancy your link-time code generator is, the global variable or function call might even become eliminated.
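
    As a portable illustration of the macro and of the "short function" approach, here is a minimal sketch. The IDemoVtbl structure, the GenericFn type, and the demo_index_of_IsDirty function are hypothetical stand-ins invented for this example; in a real program the vtable structure would come from the IDL-generated header, and offsetof is used here in place of FIELD_OFFSET so the sketch builds without the Windows headers.

```c
#include <stddef.h>  /* offsetof */

/* Hypothetical stand-in for an IDL-generated vtable structure.
   Every slot is a function pointer, so all slots have the same size. */
typedef void (*GenericFn)(void);

typedef struct IDemoVtbl {
    GenericFn QueryInterface;  /* index 0 */
    GenericFn AddRef;          /* index 1 */
    GenericFn Release;         /* index 2 */
    GenericFn GetClassID;      /* index 3 */
    GenericFn IsDirty;         /* index 4 */
} IDemoVtbl;

/* Same shape as the macros in the article, with offsetof
   substituted for FIELD_OFFSET. */
#define METHOD_OFFSET(itf, method) offsetof(itf##Vtbl, method)

#define METHOD_INDEX(itf, method) \
    ((METHOD_OFFSET(itf, method) - \
      METHOD_OFFSET(itf, QueryInterface)) / sizeof(GenericFn))

/* A short C function that a C++ caller could declare as extern "C"
   to obtain the index without seeing the C-style interface. */
unsigned demo_index_of_IsDirty(void)
{
    return (unsigned)METHOD_INDEX(IDemo, IsDirty);
}
```

    If a method is later inserted ahead of IsDirty in the structure, the value returned by demo_index_of_IsDirty changes with it, and nobody has to recount by hand.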

  • The Old New Thing

    The C language specification describes an abstract computer, not a real one

    • 21 Comments

    If a null pointer is zero, how do you access the memory whose address is zero? And if C allows you to take the address one past the end of an array, how do you make an array that ends at 0xFFFFFFFF, since adding one to that value would wrap around?

    First of all, who says that there is a byte zero? Or a byte 0xFFFFFFFF?

    The C language does not describe an actual computer. It describes a theoretical one. On this theoretical computer, it must be possible to do certain things, like generate the address of one item past the end of an array, and that address must compare greater than the address of any member of the array.

    But how the C language implementation chooses to map these theoretical operations to actual operations is at the discretion of the C language implementation.

    Now, most implementations will do the "obvious" thing and say, "Well, a pointer is represented as a numerical value which is equal to the low-level memory address." But they are not required to do so. For example, you might have an implementation that says, "You know what? I'm just going to mess with you, and every pointer is represented as a numerical value which is equal to the low-level memory address minus 4194304." In other words, if you try to dereference a pointer whose numeric value is 4096, you actually access the memory at 4194304 + 4096 = 4198400. On such a system, you could have an array that goes all the way to 0xFFFFFFFF, because the numeric value of the pointer to that address is 0xFFBFFFFF, and the pointer to one past the end of the array is therefore a perfectly happy 0xFFC00000.
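
    The arithmetic in that example can be checked with a small sketch. The function names and the BASE_OFFSET constant are made up for illustration; this models the numbers only and is not how any real implementation is written.

```c
#include <stdint.h>

/* The offset from the example: 4194304 is 0x00400000. */
#define BASE_OFFSET UINT32_C(4194304)

/* Translate a pointer's numeric value into the low-level
   memory address that a dereference would actually touch. */
uint32_t pointer_value_to_address(uint32_t value)
{
    return value + BASE_OFFSET;
}

/* Translate a low-level memory address into the numeric
   value of the pointer that refers to it. */
uint32_t address_to_pointer_value(uint32_t address)
{
    return address - BASE_OFFSET;
}
```

    With these definitions, pointer_value_to_address(4096) is 4198400, address_to_pointer_value(0xFFFFFFFF) is 0xFFBFFFFF, and adding one to that pointer value gives 0xFFC00000 with no wraparound.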

    Before you scoff and say "That's a stupid example because nobody would actually do that," think again. Win32s did exactly this. (The 4194304-byte offset was done in hardware by manipulating the base address of the flat selectors.) This technique was important because byte 0 was the start of the MS-DOS interrupt table, and corrupting that memory was a sure way to mess up your system pretty bad. By shifting all the pointers, it meant that a Win32s program which dereferenced a null pointer ended up accessing byte 4194304 rather than byte 0, and Win32s made sure that there was no memory mapped there, so that the program took an access violation rather than corrupting your system.

    But let's set aside implementations which play games with pointer representations and limit ourselves to implementations which map pointers to memory addresses directly.

    "A 32-bit processor allegedly can access up to 2³² memory locations. But if zero and 0xFFFFFFFF can't be used, then shouldn't we say that a 32-bit processor can access only 2³² − 2 memory locations? Is everybody getting ripped off by two bytes? (And if so, then who is pocketing all those lost bytes?)"

    A 32-bit processor can address 2³² memory locations. There are no "off-limits" addresses from the processor's point of view. The guy that made addresses zero and 0xFFFFFFFF off-limits was the C language specification, not the processor. That a language fails to expose the full capabilities of the underlying processor shouldn't be a surprise. For example, you probably would have difficulty accessing the byte at 0xFFFFFFFF from JavaScript.

    There is no rule in the C language specification that the language must permit you to access any byte of memory in the computer. Implementations typically leave certain portions of the address space intentionally unused so that they have wiggle room to do the things the C language specification requires them to do. For example, the implementation can arrange never to allocate an object at address zero, so that it can conform to the requirement that the address of an object never compares equal to the null pointer. It also can arrange never to allocate an object that goes all the way to 0xFFFFFFFF, so that it can safely generate a pointer one past the end of the object which behaves as required with respect to comparison.
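
    The two guarantees mentioned here can be observed from inside the language. The sketch below uses a hypothetical helper, one_past_end_is_greatest, which forms the one-past-the-end pointer and compares it against the address of every element; it only forms and compares that pointer and never dereferences it.

```c
#include <stddef.h>

/* Returns 1 if the one-past-the-end pointer compares greater than
   the address of every element of the array, 0 otherwise. */
int one_past_end_is_greatest(const int *arr, size_t count)
{
    const int *end = arr + count;  /* legal to form and compare */
    size_t i;

    for (i = 0; i < count; i++) {
        if (!(end > &arr[i])) {
            return 0;
        }
    }
    return 1;
}

/* Returns 1 if the address of an object differs from the null pointer. */
int object_address_is_not_null(const void *object)
{
    return object != NULL;
}
```

    On a conforming implementation both functions return 1 for any real array or object, and it is the implementation's job to place objects in memory so that this holds.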

    So you're not getting ripped off. Those bytes are still addressable in general. But you cannot get to them in C without leaving the C abstract machine.

    A related assertion turns this argument around. "It is impossible to write a conforming C compiler for MS-DOS because the C language demands that the address of a valid object cannot be zero, but in MS-DOS, the interrupt table has address zero."

    There is a step missing from this logical argument: It assumes that the interrupt table is a C object. But there is no requirement that the C language provide access to the interrupt table. (Indeed, there is no mention of the interrupt table anywhere in the C language specification.) All a conforming implementation needs to do is say, "The interrupt table is not part of the standard-conforming portion of this implementation."

    "Aha, so you admit that a conforming implementation cannot provide access to the interrupt table."

    Well, certainly a conforming implementation can provide language extensions which permit access to the interrupt table. It may even decide that dereferencing a null pointer grants you access to the interrupt table. This is permitted because dereferencing a null pointer invokes undefined behavior, and one legal interpretation of undefined behavior is "grants access to the interrupt table."

  • The Old New Thing

    "Adjust visual effects for best performance" should really be called "Adjust visual effects for crappiest appearance"

    • 59 Comments

    In the Performance Options control panel, on the tab labeled Visual Effects, there is a radio button called Adjust for best performance. If you select it, then all the visual effects are disabled.

    But the name of that radio button has been wrong for a long time. It doesn't actually adjust your visual effects for best performance. It just adjusts them for crappiest appearance.

    Starting in Windows Vista, a lot of visual effects were offloaded to the graphics card. Consequently, the impact on system performance for those visual effects is negligible, and sometimes turning off the effect actually makes your system run slower because you disabled hardware acceleration, forcing operations to be performed in software.

    For example, if desktop composition is enabled, then a backup copy of a window's entire contents is kept in video memory, even if the window is covered by other windows. Without desktop composition, the window manager uses the classic model which follows the principle don't save anything you can recalculate: The contents of an occluded window are not saved anywhere, and when the window becomes exposed, the window receives a WM_PAINT message to tell it to regenerate its contents.

    This means that, for example, when you remove a window from the screen and expose the window underneath, the desktop compositor can show the contents of the underlying window immediately because it saved a copy of the window in video memory and has been keeping it up to date. On the other hand, if you disable desktop composition, you will just stare at a blank window underneath, and then you have to sit and wait for that window to repaint itself.

    Congratulations: By disabling desktop composition, you made the act of uncovering a window run slower. (You will see the same effect when switching between maximized windows.)

    Okay, so if unchecking these visual effects has a negligible and sometimes even a negative effect on performance, why do we still present them in the Visual Effects control panel for people to disable?

    Because enthusiasts who think they are so awesomely clever looooove disabling anything that makes the computer look pretty, because they're convinced that effort making the computer look pretty is clearly done at the expense of performance. They observe that if all the visual effects are disabled, their machine runs fast, but that's not a controlled experiment because they failed to measure how fast the computer runs when the effects are enabled. (By similar logic, sticking a banana in your ear keeps the alligators away.)

    These are the people who think that a bare computer motherboard on a table with components hanging off it runs faster than a computer packed into an attractive case. And even if you demonstrate that it doesn't actually run faster, they will still keep their computer in its partially-disassembled state because it adds to their street cred.

    The Visual Effects settings page turned into a sort of unintended psychology experiment.

  • The Old New Thing

    There's no law that says two people can't have the same thing to eat

    • 37 Comments

    Some time ago, my group went out for a team lunch. It was to a restaurant we were not familiar with, so quite a bit of time was spent studying the menu. As everybody looked over the menu, discussion naturally turned to "So what are you going to have?"

    "I think I'll have the salmon sandwich."

    One of my colleagues replied, "Oh, rats. I was thinking of having that."

    I remarked, "There's no law that says two people can't order the same thing."

    My colleague disagreed.

    Not if you ask my wife. Whenever we go out to eat, she'll ask me what I'm having, and then she'll say "Oh, rats. I was thinking of having that. Now I'll have to order something else."

    I'll say, "You can order it too, that's okay. Or I'll change my order, no big deal."

    But she'll say, "No, that's okay. I'll just find something else."

    I've tried many times without success to convince her that it's okay for two people to have the same thing to eat. Now I just accept it.

    Update: A few months later, I received an update from my colleague.

    The other night, my wife and I went out to dinner, and my wife really wanted the same thing that I had already said that I was going to order. But instead of switching to something else, she ordered it anyway. I think this is the first time this has ever happened. And you know what? The world did not end.

  • The Old New Thing

    Dreaming about a rather unusual guitar rehearsal

    • 1 Comment

    I dreamed that I watched a long-time colleague of mine rehearse the guitar in preparation for the new "hot pants" competition of the Miss Universe pageant.

    The scary thing is that the pageant may actually do it.

  • The Old New Thing

    Using accessibility to monitor windows as they come and go

    • 9 Comments

    Today's Little Program monitors windows as they come and go. When people contemplate doing this, they come up with ideas like installing a WH_CBT hook or a WH_SHELL hook, but one of the major problems with those types of hooks is that they are injected hooks. Injection is bad for a number of reasons.

    • It forces the hook to be in a DLL so it can be injected.
    • Hook activities need to be marshaled back to the main program.
    • Your DLL will capture events only in processes of the same bitness, because you cannot load a 32-bit DLL into a 64-bit process or vice versa.
    • You can inject into an elevated process only if your process is also elevated. If your process is non-elevated, then you will not capture events for windows belonging to elevated processes.

    This is where accessibility comes in handy, because accessibility lets you specify whether you want your hook to be an injected or non-injected one. And if you're non-injected, then the programming model is much simpler because everything happens in your process (indeed, on a single thread).

    Take the scratch program and make the following changes:

    #include <strsafe.h>
    
    BOOL
    OnCreate(HWND hwnd, LPCREATESTRUCT lpcs)
    {
     g_hwndChild = CreateWindow(TEXT("listbox"), NULL,
         LBS_HASSTRINGS | WS_CHILD | WS_VISIBLE | WS_VSCROLL,
         0, 0, 0, 0, hwnd, NULL, g_hinst, 0);
     if (!g_hwndChild) return FALSE;
     return TRUE;
    }
    
    void CALLBACK WinEventProc(
        HWINEVENTHOOK hWinEventHook,
        DWORD event,
        HWND hwnd,
        LONG idObject,
        LONG idChild,
        DWORD dwEventThread,
        DWORD dwmsEventTime
    )
    {
     if (hwnd &&
         idObject == OBJID_WINDOW &&
         idChild == CHILDID_SELF)
     {
      PCTSTR pszAction = NULL;
      switch (event) {
      case EVENT_OBJECT_CREATE:
       pszAction = TEXT("created");
       break;
      case EVENT_OBJECT_DESTROY:
       pszAction = TEXT("destroyed");
       break;
      }
      if (pszAction) {
       TCHAR szClass[80];
       TCHAR szName[80];
       szClass[0] = TEXT('\0');
       szName[0] = TEXT('\0');
       if (IsWindow(hwnd)) {
        GetClassName(hwnd, szClass, ARRAYSIZE(szClass));
        GetWindowText(hwnd, szName, ARRAYSIZE(szName));
       }
       TCHAR szBuf[80];
       StringCchPrintf(szBuf, ARRAYSIZE(szBuf),
                       TEXT("%p %s \"%s\" (%s)"), hwnd, pszAction,
                       szName, szClass);
       ListBox_AddString(g_hwndChild, szBuf);
      }
     }
    }
    
    int WINAPI WinMain(HINSTANCE hinst, HINSTANCE hinstPrev,
                       LPSTR lpCmdLine, int nShowCmd)
    {
     ...
      ShowWindow(hwnd, nShowCmd);
    
     HWINEVENTHOOK hWinEventHook = SetWinEventHook(
         EVENT_OBJECT_CREATE, EVENT_OBJECT_DESTROY,
         NULL, WinEventProc, 0, 0,
         WINEVENT_OUTOFCONTEXT | WINEVENT_SKIPOWNPROCESS);
    
      while (GetMessage(&msg, NULL, 0, 0)) {
       TranslateMessage(&msg);
       DispatchMessage(&msg);
      }
    
      if (hWinEventHook) UnhookWinEvent(hWinEventHook);
    ...
    }
    

    This is a generalization of our earlier program which waits for a specific window to be destroyed, except that we now are watching all windows for creation and destruction.

    When you run this program, you see that there is a lot of window activity, but maybe you are interested only in windows when they are shown and hidden. No problem, that's a small change:

      switch (event) {
      case EVENT_OBJECT_SHOW:
       pszAction = TEXT("shown");
       break;
      case EVENT_OBJECT_HIDE:
       pszAction = TEXT("hidden");
       break;
      }
    ...
    
     HWINEVENTHOOK hWinEventHook = SetWinEventHook(
         EVENT_OBJECT_SHOW, EVENT_OBJECT_HIDE,
         NULL, WinEventProc, 0, 0,
         WINEVENT_OUTOFCONTEXT | WINEVENT_SKIPOWNPROCESS);
    

    Notice that these notifications are received for windows from both 32-bit and 64-bit processes, and that they are received even for windows belonging to elevated processes. You can't do that with an injected hook.

  • The Old New Thing

    When will GetMessage return -1?

    • 16 Comments

    A source of great consternation is the mysterious -1 return value from Get­Message:

    If there is an error, the return value is −1. For example, the function fails if hWnd is an invalid window handle or lpMsg is an invalid pointer.

    That paragraph has caused all sorts of havoc, because it throws into disarray the standard message pump:

    MSG msg;
    while (GetMessage(&msg, NULL, 0, 0)) {
     ...
    }
    

    But don't worry, the standard message pump is safe. If your parameters are exactly

    • a valid pointer to a valid MSG structure,
    • a null window handle,
    • no starting message range filter,
    • no ending message range filter,

    then Get­Message will not fail with -1.

    Originally, the Get­Message function did not have a failure mode. If you passed invalid parameters, then you invoked undefined behavior, and you probably crashed.

    Later, somebody said, "Oh, no, the Get­Message function needs to detect invalid parameters and instead of crashing, it needs to fail gracefully with some sort of error code." (This was before "Fail-Fast" came into fashion.)

    The problem is that Get­Message's return value of BOOL was already specified not as a success/failure code, but rather a "Has a WM_QUIT message been received?" code. So return FALSE wouldn't work.

    The solution (if that's what you want to call it) was to have Get­Message return the not-really-a-BOOL-but-we'll-pretend-it-is value -1 to signal an invalid parameter error.

    And that's what threw everybody into a tizzy, because now every message loop looks buggy.

    But you can calm down. The standard message loop is fine. All the parameters are hard-coded (and therefore valid by inspection), save for the &msg parameter, which is still valid by inspection. So that case is okay. It has to be, for compatibility.

    The people who need to worry are people who pass a variable as the window handle filter (because that window handle may no longer be valid), or pass dynamically-allocated memory as the lpMsg (because the pointer may no longer be valid), or who pass a nontrivial message filter (because the filter parameters may be invalid).

    In practice, the memory for the lpMsg is nearly always a stack variable (so the pointer is valid), and the message range filters are hard-coded (so valid by inspection). The one to watch out for is the window handle filter. But we saw earlier that a filtered Get­Message is a bad idea anyway, because your program will not respond to messages that don't meet the filter.
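
    For code that does pass a variable window handle or a computed filter, the return value has to be treated as three-way rather than as a true BOOL. The sketch below is portable and does not call GetMessage; the PumpAction type and both helper functions are hypothetical names invented here to show the decision, on the assumption that the caller passes in whatever value GetMessage returned.

```c
/* GetMessage is typed BOOL, but its documented result is three-way:
   nonzero for a retrieved message, 0 for WM_QUIT, -1 for an error. */
typedef enum {
    PUMP_DISPATCH,  /* a message was retrieved: translate and dispatch */
    PUMP_QUIT,      /* WM_QUIT was retrieved: leave the loop normally */
    PUMP_ERROR      /* invalid parameter: leave the loop, report failure */
} PumpAction;

/* Hypothetical helper: what a careful loop does with the result. */
PumpAction classify_get_message_result(int result)
{
    if (result == -1) {
        return PUMP_ERROR;
    }
    if (result == 0) {
        return PUMP_QUIT;
    }
    return PUMP_DISPATCH;
}

/* What the plain `while (GetMessage(...))` test does with the same
   value: anything nonzero, including -1, keeps the loop running. */
int plain_loop_keeps_running(int result)
{
    return result != 0;
}
```

    The difference shows up only for -1: the plain loop keeps running and would dispatch whatever is in the MSG structure, while the careful loop stops. With the hard-coded parameters of the standard pump, that case does not arise.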

  • The Old New Thing

    Does this operation work when file system redirection is disabled? The default answer is NO

    • 23 Comments

    A customer reported that when their program called SH­Get­File­Info to get the icon for a folder, the call failed. "It works on some machines but not others. We don't know what the difference is between the working and non-working machines." They included the offending function from their program, but everything in the function looked good. The problem was something outside the function itself.

    Eventually, the customer confessed that they had called the Wow64­Disable­Wow64­Fs­Redi­rection function to disable file system redirection, and the call to SH­Get­File­Info took place while redirection was disabled. "We found that if we re-enable file system redirection before calling SH­Get­File­Info, then everything works properly."

    That's right, because, like impersonation, nothing works when file system redirection is disabled unless it is specifically documented as supporting disabled redirection. This is even called out in the documentation for Wow64­Disable­Wow64­Fs­Redi­rection:

    Note  The Wow64­Disable­Wow64­Fs­Redi­rection function affects all file operations performed by the current thread, which can have unintended consequences if file system redirection is disabled for any length of time. For example, DLL loading depends on file system redirection, so disabling file system redirection will cause DLL loading to fail. Also, many feature implementations use delayed loading and will fail while redirection is disabled. The failure state of the initial delay-load operation is persisted, so any subsequent use of the delay-load function will fail even after file system redirection is re-enabled. To avoid these problems, disable file system redirection immediately before calls to specific file I/O functions (such as Create­File) that must not be redirected, and re-enable file system redirection immediately afterward using Wow64­Revert­Wow64­Fs­Redi­rection.

    Whenever you use one of these "global solutions to a local problem" types of solutions that change some fundamental behavior of the system, you have to make sure that everybody is on board with your decision.

    The local solution would be to use the C:\Windows\Sys­Native virtual directory for files you want to look up in the native system directory rather than the emulated system directory.

  • The Old New Thing

    The x86 architecture is the weirdo: Structured exception handling

    • 22 Comments

    If your reference architecture is x86, then you will think that everything it does is normal and the rest of the world is weird. Except it's the other way around: The x86 architecture is the weirdo.

    I was reminded of this when commenter 640k complained, on the subject of what I think is table-based structured exception handling, "It would be interesting to know why this 'invention' was introduced in 64-bit Windows when no other version of Windows requires it." (The original text was "when no other OS requires it", but I'm assuming that this was in the context of Windows-based OS, since unix doesn't have structured exception handling in the first place.)

    This has a very narrow definition of "no other OS", because it really means "No other non-x86-based version of Windows." In this world, the color of the sky is x86.

    In fact, x86 is the only architecture for which Windows uses stack-based exception chaining. All other architectures use table-based exception unwinding. The prologue and epilogue of each function must follow a particular format so that the actions performed therein can be unwound during exception handling. At the very introduction of Win32, it was only the x86 which used stack-based unwinding. The Alpha AXP, MIPS, and PowerPC all used table-based exception unwinding. And as new architectures were added by Windows, they all used table-based exception unwinding as well. Itanium? Table-based. Alpha AXP 64-bit? Table-based. ARM? Table-based.

    The use of table-based exception handling was not "introduced" with x64. It was introduced back in 1992, and has in fact been the exception unwinding mechanism for all architectures.

    Well, almost all. Not the x86, because the x86 is the weirdo.
