• The Old New Thing

    Adjustor thunks

    • 15 Comments

    Yesterday we learned about the layout of COM objects and I hinted at "adjustor thunks".

    If you find yourself debugging in disassembly, you'll sometimes find strange little functions called "adjustor thunks". Let's take another look at the object we laid out last time:

    class CSample : public IPersist, public IServiceProvider
    {
    public:
      // *** IUnknown ***
      STDMETHODIMP QueryInterface(REFIID riid, void** ppv);
      STDMETHODIMP_(ULONG) AddRef();
      STDMETHODIMP_(ULONG) Release();
    
      // *** IPersist ***
      STDMETHODIMP GetClassID(CLSID* pClassID);
    
      // *** IQueryService ***
      STDMETHODIMP QueryService(REFGUID guidService,
                      REFIID riid, void** ppv);
    private:
      LONG m_cRef;
      ...
    };
    
    p    lpVtbl    QueryInterface (1)
    q    lpVtbl    QueryInterface (2) AddRef (1)
    m_cRef AddRef (2) Release (1)
    ... Release (2) GetClassID (1)
    QueryService (2)

    In the diagram, p is the pointer returned when the IPersist interface is needed, and q is the pointer for the IQueryService interface.

    Now, there is only one QueryInterface method, but there are two entries, one for each vtable. Remember that each function in a vtable receives the corresponding interface pointer as its "this" parameter. That's just fine for QueryInterface (1); its interface pointer is the same as the object's interface pointer. But that's bad news for QueryInterface (2), since its interface pointer is q, not p.

    This is where the adjustor thunks come in.

    The entry for QueryInterface (2) is a stub function that changes q to p, and then lets QueryInterface (1) do the rest of the work. This stub function is the adjustor thunk.

    [thunk]:CSample::QueryInterface`adjustor{4}':
      sub     DWORD PTR [esp+4], 4 ; this -= sizeof(lpVtbl)
      jmp     CSample::QueryInterface
    

    The adjustor thunk takes the "this" pointer and subtracts 4, converting q into p, then it jumps to the QueryInterface (1) function to do the real work.

    Whenever you have multiple inheritance and a virtual function is implemented on multiple base classes, you will get an adjustor thunk for the second and subsequent base class methods in order to convert the "this" pointer into a common format.

  • The Old New Thing

    The layout of a COM object

    • 39 Comments

    The Win32 COM calling convention specifies the layout of the virtual method table (vtable) of an object. If a language/compiler wants to support COM, it must lay out its object in the specified manner so other components can use it.

    It is no coincidence that the Win32 COM object layout matches closely the C++ object layout. Even though COM was originally developed when C was the predominant programming language, the designers saw fit to "play friendly" with the up-and-coming new language C++.

    The layout of a COM object is made explicit in the header files for the various interfaces. For example, here's IPersist from objidl.h, after cleaning up some macros.

    typedef struct IPersistVtbl
    {
        HRESULT ( STDMETHODCALLTYPE *QueryInterface )(
            IPersist * This,
            /* [in] */ REFIID riid,
            /* [iid_is][out] */ void **ppvObject);
    
        ULONG ( STDMETHODCALLTYPE *AddRef )(
            IPersist * This);
    
        ULONG ( STDMETHODCALLTYPE *Release )(
            IPersist * This);
    
        HRESULT ( STDMETHODCALLTYPE *GetClassID )(
            IPersist * This,
            /* [out] */ CLSID *pClassID);
    
    } IPersistVtbl;
    
    struct IPersist
    {
        const struct IPersistVtbl *lpVtbl;
    };
    

    This corresponds to the following memory layout:

    p    lpVtbl    QueryInterface
    AddRef
    Release
    GetClassID

    What does this mean?

    A COM interface pointer is a pointer to a structure that consists of just a vtable. The vtable is a structure that contains a bunch of function pointers. Each function in the list takes that interface pointer (p) as its first parameter ("this").

    The magic to all this is that since your function gets p as its first parameter, you can "hang" additional stuff onto that vtable:

    p    lpVtbl    QueryInterface
    ...
    other stuff
    ...
    AddRef
    Release
    GetClassID

    The functions in the vtable can use offsets relative to the interface pointer to access its other stuff.

    If an object implements multiple interfaces but they are all descendants of each other, then a single vtable can be used for all of them. For example, the object above is already set to be used either as an IUnknown or as an IPersist, since IUnknown is a subset of IPersist.

    On the other hand, if an object implements multiple interfaces that are not descendants of each other, then you get multiple inheritance, in which case the object is typically laid out in memory like this:

    p    lpVtbl    QueryInterface (1)
    q    lpVtbl    QueryInterface (2) AddRef (1)
    ...
    other stuff
    ...
    AddRef (2) Release (1)
    Release (2) ...
    ...

    If you are using an interface that comes from the first vtable, then the interface pointer is p. But if you're using an interface that comes from the second vtable, then the interface pointer is q.

    Hang onto that diagram, because tomorrow we will learn about those mysterious "adjustor thunks".

  • The Old New Thing

    Answers to exercises - mismatching new/delete

    • 13 Comments
    Answers to yesterday's exercises:

    What happens if you allocate with scalar "new" and free with vector "delete[]"?

    The scalar "new" will allocate a single object with no hidden counter. The vector "delete[]" will look for the hidden counter, which isn't there, so it will either crash (accessing nonexistent memory) or grab a random number and attempt to destruct that many items. If the random number is greater than one, you will start corrupting memory after the object. If the random number is zero, you fail to destruct anything. If the random number is exactly one, then the one object is destructed.

    Next, the vector "delete[]" will attempt to free the memory block starting one size_t in front of the actual memory block. Depending on how the heap feels today, this may be detected as an invalid parameter and ignored, or this can result in heap corruption.

    Final result: not good.

    What happens if you allocate with vector "new[]" and free with scalar "delete"?

    The vector "new[]" allocates several objects and stores the "howmany" in the hidden counter. The scalar "delete" destructs the first object in the vector. If it was a vector of zero objects, you corrupted memory. If it was a vector of two or more objects, then objects 2 an onward will not be destructed. (Result: Memory or other leak.)

    Next, the scalar "delete" will free the memory block directly, which will fail because the memory block actually starts at the hidden size_t in front of the vector. This again corrupts the heap since you are freeing memory that is not a valid heap pointer.

    Final result: also not good.

    What optimizations can be performed if the destructor MyClass::~MyClass() is removed from the class definition?

    If the class does not have a destructor, then no special work needs to be done when the vector is freed aside from freeing the memory. In this case, no hidden counter is necessary; the block can be allocated directly with no overhead and freed with no overhead.

    More specifically, if the class has a trivial destructor (none of its base classes or sub-objects - if any - have a destructor), then the scalar and vector new/delete allocate and free the memory the same way, and mixing them does not generate a runtime error. You got lucky.

    Of course, somebody might add a destructor to your class tomorrow, and then you won't be so lucky any more.

    Note of course that all of this discussion assumes compiler behavior as described yesterday. That behavior is implementation-dependent so you should not rely on it. You may be lucky today, but the next version of the compiler may change the way it manages vectors and your luck will have run out.

  • The Old New Thing

    The Glass Engine and Ishkur's Guide to Electronic Music

    • 4 Comments
    The Glass Engine is an interactive guide to the music of Philip Glass, organized by... um... at least they're organized. By something.

    Bizarre yet oddly compelling.

    (Perhaps if we ask nicely, we can get Marc Miller to tell the story of the time he actually met Philip Glass...)

    In a similar vein, a friend of mine directed me to Ishkur's Guide to Electronic Music, an attitude-filled tour of the world of of electronic music.

    I'd like to say that I learned something, but that would be overstating it.
  • The Old New Thing

    Mismatching scalar and vector new and delete

    • 16 Comments
    In a previous entry I alluded to the problems that can occur if you mismatch scalar "new" with vector "delete[]" or vice versa.

    There is a nice description of C++ memory management in C++ Gotchas: Avoiding Common Problems in Coding and Design on www.informit.com, and I encourage you to read at least the section titled Failure to Distinguish Scalar and Array Allocation before continuing with this entry, because I'm going to use that information as a starting point.

    Here's how the Microsoft C++ compiler manages vector allocation. Note that this is internal implementation detail, so it's subject to change at any time, but knowing this may give a better insight into why mixing scalar and vector new/delete is a bad thing.

    The important thing to note is that when you do a scalar "delete p", you are telling the compiler, "p points to a single object." The compiler will call the destructor once, namely for the object you are destructing.

    When you do "delete[] p", you are saying, "p points to a bunch of objects, but I'm not telling you how many." In this case, the compiler needs to generate extra code to keep track of how many it needs to destruct. This extra information is kept in a "secret place" when the vector is allocated with "new[]".

    Let's watch this in action:

    
    class MyClass {
    public:
      MyClass(); // constructor
      ~MyClass(); // destructor
      int i;
    };
    
    MyClass *allocate_stuff(int howmany)
    {
        return new MyClass[howmany];
    }
    

    The generated code for the "allocate_stuff" function looks like this:

        push    esi
        mov     esi, [esp+8] ; howmany
       ;  eax = howmany * sizeof(MyClass) + sizeof(size_t)
        lea     eax, [esi*4+4]
        push    eax
        call    operator new
        test    eax, eax
        pop     ecx
        je      fail
        push    edi
        push    OFFSET MyClass::MyClass
        push    esi
        lea     edi, [eax+4] ; edi = eax + sizeof(size_t)
        push    4 ; sizeof(MyClass)
        push    edi
        mov     [eax], esi ; howmany
        call    `vector constructor iterator'
        mov     eax, edi
        pop     edi
        jmp     done
    fail:
        xor     eax, eax
    done:
        pop     esi
        retd    4
    

    Translated back into pseudo-C++, the code looks like this:

    MyClass* allocate_stuff(int howmany)
    {
      void *p = operator new(
         howmany * sizeof(MyClass) + sizeof(size_t));
      if (p) {
        size_t* a = reinterpret_cast<size_t*>(p);
        *a++ = howmany;
        vector constructor iterator(a,
          sizeof(MyClass), &MyClass::MyClass);
        return reinterpret_cast<MyClass*>(a);
      }
      return NULL;
    }
    

    In other words, the memory layout of the vector of MyClass objects looks like this:

    howmany
    MyClass[0]
    MyClass[1]
    ...
    MyClass[howmany-1]

    The pointer returned by the new[] operator is not the start of the allocated memory but rather points to MyClass[0]. The count of elements is hidden in front of the vector.

    The deletion of a vector performs this operation in reverse:

    void free_stuff(MyClass* p)
    {
        delete[] p;
    }
    
    generates
        mov     ecx, [esp+4] ; p
        test    ecx, ecx
        je      skip
        push    3
        call    MyClass::`vector deleting destructor`
    skip
        ret     4
    
    Translated back into pseudo-C++, the code looks like this:
    void free_stuff(MyClass* p)
    {
      if (p) p->vector deleting destructor(3);
    }
    
    The vector deleting destructor goes like this (pseudo-code):
    void MyClass::vector deleting destructor(int flags)
    {
      if (flags & 2) { // if vector destruct
        size_t* a = reinterpret_cast<size_t*>(this) - 1;
        size_t howmany = *a;
        vector destructor iterator(p, sizeof(MyClass),
          howmany, MyClass::~MyClass);
        if (flags & 1) { // if delete too
          operator delete(a);
        }
      } else { // else scalar destruct
        this->~MyClass(); // destruct one
        if (flags & 1) { // if delete too
          operator delete(this);
        }
      }
    }
    

    The vector deleting destructor takes some flags. If 2 is set, then a vector is being destructed; otherwise a single object is being destructed. If 1 is set, then the memory is also freed.

    In our case, the flags parameter is 3, so we will perform a vector destruct followed by a delete. Observe that this code sucks the original "howmany" value out of its secret hiding place and asks the vector destructor iterator to run the destructor that many times before freeing the memory.

    So now, armed with this information, you should be able to describe what happens if you allocate memory with scalar "new" and free it with vector "delete[]" or vice versa.

    Bonus exercise: What optimizations can be performed if the destructor MyClass::~MyClass() is removed from the class definition?

    Answers to come tomorrow.

  • The Old New Thing

    What goes wrong when you add "Copy To" to the context menu

    • 17 Comments
    Lockergnome tipped people off to this page which talks (among other things) about adding "Copy To" to the context menu. I considered adding this tweak to Tweak UI but ultimately decided against. Here's why:

    The "Copy to Folder" and "Move to Folder" options weren't designed to be on the context menu. They were only meant to be placed in Explorer's toolbar. (Right-click a blank space on your toolbar, select Customize, and pick "Move To" or "Copy To" from the list of available buttons.) If you add them to the context menu, you may notice that the "Copy To" and "Move To" dialogs start showing up when you really aren't expecting them, for example, whenever you double-click an attachment in Outlook.

    The reason is that these two items are a bit too eager. When you ask them, "So do you handle the <X> command?" they say, "Oh yes! That's me!" This is fine in a toolbar, where the only time they're asked "Do you handle the <X> command?" is when the user clicks on the button. But in a context menu, you are asked this much more frequently, and with varying values of X.

    So when Outlook launches an attachment, the shell loads up the context menu handlers and asks each one, "So do you handle the Open command?" The "Delete" option says, "Nope, sorry." So does "Cut" and "Send to" and "Sharing and Security". But "Copy To" happily says, "Oh yes! That's me!"

    And then the Copy To dialog shows up when you don't want it.

    Another example of what happens when you take an object and use it in a situation outside its design parameters.
  • The Old New Thing

    "Section 419" scammers arrested in Netherlands; Danish flag flies proudly

    • 9 Comments
    Dutch police have arrested 52 people suspected of defrauding gullible Internet users in one of the largest busts of the infamous "Nigerian e-mail" scam.
    Hooray for the Dutch police. Their next target: Web sites that illustrate a Dutch article with the Danish flag.

    (I must sheepishly admit that I too mistakenly identified the home of Ikea as Denmark rather than the Netherlands.)

  • The Old New Thing

    The format of string resources

    • 34 Comments
    Unlike the other resource formats, where the resource identifier is the same as the value listed in the *.rc file, string resources are packaged in "bundles". There is a rather terse description of this in Knowledge Base article Q196774. Today we're going to expand that terse description into actual code.

    The strings listed in the *.rc file are grouped together in bundles of sixteen. So the first bundle contains strings 0 through 15, the second bundle contains strings 16 through 31, and so on. In general, bundle N contains strings (N-1)*16 through (N-1)*16+15.

    The strings in each bundle are stored as counted UNICODE strings, not null-terminated strings. If there are gaps in the numbering, null strings are used. So for example if your string table had only strings 16 and 31, there would be one bundle (number 2), which consists of string 16, fourteen null strings, then string 31.

    (Note that this means there is no way to tell the difference between "string 20 is a string that has length zero" and "string 20 doesn't exist".)

    The LoadString function is rather limiting in a few ways:

    • You can't pass a language ID. If your resources are multilingual, you can't load strings from a nondefault language.
    • You can't query the length of a resource string.

    Let's write some functions that remove these limitations.

    LPCWSTR FindStringResourceEx(HINSTANCE hinst,
     UINT uId, UINT langId)
    {
     // Convert the string ID into a bundle number
     LPCWSTR pwsz = NULL;
     HRSRC hrsrc = FindResourceEx(hinst, RT_STRING,
                         MAKEINTRESOURCE(uId / 16 + 1),
                         langId);
     if (hrsrc) {
      HGLOBAL hglob = LoadResource(hinst, hrsrc);
      if (hglob) {
       pwsz = reinterpret_cast<LPCWSTR>
                  (LockResource(hglob));
       if (pwsz) {
        // okay now walk the string table
        for (int i = 0; i < uId & 15; i++) {
         pwsz += 1 + (UINT)*pwsz;
        }
        UnlockResource(pwsz);
       }
       FreeResource(hglob);
      }
     }
     return pwsz;
    }
    

    After converting the string ID into a bundle number, we find the bundle, load it, and lock it. (That's an awful lot of paperwork just to access a resource. It's a throwback to the Windows 3.1 way of managing resources; more on that in a future entry.)

    We then walk through the table skipping over the desired number of strings until we find the one we want. The first WCHAR in each string entry is the length of the string, so adding 1 skips over the count and adding the count skips over the string.

    When we finish walking, pwsz is left pointing to the counted string.

    With this basic function we can create fancier functions.

    The function FindStringResource is a simple wrapper that searches for the string in the default thread language.

    LPCWSTR FindStringResource(HINSTANCE hinst, UINT uId)
    {
     return FindStringResourceEx(hinst, uId,
         MAKELANGID(LANG_NEUTRAL, SUBLANG_NEUTRAL));
    }
    

    The function GetResourceStringLengthEx returns the length of the corresponding string, including the null terminator.

    UINT GetStringResourceLengthEx(HINSTANCE hinst,
     UINT uId, UINT langId)
    {
     LPCWSTR pwsz = FindStringResourceEx
                           (hinst, uId, langId);
     return 1 + (pwsz ? *pwsz : 0);
    }
    

    And the function AllocStringFromResourceEx loads the entire string resource into a heap-allocated memory block.

    LPWSTR AllocStringFromResourceEx(HINSTANCE hinst,
     UINT uId, UINT langId)
    {
     LPCWSTR pwszRes = FindStringResourceEx
                           (hinst, uId, langId);
     if (!pwszRes) pwszRes = L"";
     LPWSTR pwsz = new WCHAR[(UINT)*pwszRes+1];
     if (pwsz) {
       pwsz[(UINT)*pwszRes] = L'\0';
       CopyMemory(pwsz, pwszRes+1,
                  *pwszRes * sizeof(WCHAR));
     }
     return pwsz;
    }
    

    (Writing the non-Ex functions GetStringResourceLength and AllocStringFromResource is left as an exercise.)

    Note that we must explicitly null-terminate the string since the string in the resource is not null-terminated. Note also that the string returned by AllocStringFromResourceEx must be freed with delete[]. For example:

    LPWSTR pwsz = AllocStringFromResource(hinst, uId);
    if (pwsz) {
      ... use pwsz ...
      delete[] pwsz;
    }
    

    Mismatching vector "new[]" and scalar "delete" is an error I'll talk about in a future entry.

    Exercise: Discuss how the /n flag to rc.exe affects these functions.
  • The Old New Thing

    How do we decide what features make it into a product?

    • 0 Comments
    David Lemson has an excellent article titled How do we decide what features make it into Exchange?. Although he's talking about Exchange specifically, the general principles apply to many products.
  • The Old New Thing

    Integer overflow in the new[] operator

    • 21 Comments
    Integer overflows are becoming a new security attack vector. Mike Howard's article discusses some of the ways you can protect yourself against integer overflow attacks.

    One attack vector he neglects to mention is integer overflow in the new[] operator. This operator performs an implicit multiplication that is unchecked:

    int *allocate_integers(int howmany)
    {
        return new int[howmany];
    }
    

    If you study the code generation for this, it comes out to

      mov  eax, [esp+4] ; eax = howmany
      shl  eax, 2       ; eax = howmany * sizeof(int)
      push eax
      call operator new ; allocate that many bytes
      pop  ecx
      retd 4
    

    Notice that the multiplication by sizeof(int) is not checked for overflow. Somebody can trick you into under-allocating memory by passing a value like howmany = 0x40000001. For larger structures, multiplication overflow happens sooner.

    Let's look at a slightly longer example:

    class MyClass {
    public:
      MyClass(); // constructor
      int stuff[256];
    };
    
    MyClass *allocate_myclass(int howmany)
    {
      return new MyClass[howmany];
    }
    

    This class also contains a constructor, so allocating an array of them involves two steps: allocate the memory, then construct each object. The allocate_myclass function compiles to this:

      mov  eax, [esp+4] ; howmany
      shl  eax, 10      ; howmany * sizeof(MyClass)
      push esi
      push eax
      call operator new ; allocate that many bytes
      mov  esi, eax
      test esi, esi
      pop  ecx
      je   fail
      push OFFSET MyClass::MyClass
      push [esp+12]     ; howmany
      push 1024         ; sizeof(MyClass)
      push esi          ; memory block
      call `vector constructor iterator`
      mov  eax, esi
      jmp  loop
    fail:
      xor  eax, eax
    done:
      pop  esi
      retd 4
    

    This function does an unchecked multiplication of the size, then tries to allocate that many bytes, then tells the vector constructor iterator to call the constructor (MyClass::MyClass) that many times.

    If somebody tricks you into calling allocate_myclass(0x200001), the multiplication overflows and only 1024 bytes are allocated. This allocation succeeds, and then the vector constructor tries to initialize 0x200001 of those items, even though in reality only one of them got allocated. So you walk off the end of the memory block and start corrupting memory.

    That's a bad thing.

    To protect against this, you can wrap an integer overflow check around the array allocation.

    template<typename T>
    T* NewArray(size_t n)
    {
      if (n <= (size_t)-1 / sizeof(T))
        return new T[n];
    
      // n is too large - act as if we
      // ran out of memory
      return NULL;
    }
    

    Note: If you use a throwing "new", then replace the "return NULL" with an appropriate throw.

    You can now use this template to allocate arrays in an overflow-safe manner.

    MyClass *allocate_myclass(int howmany)
    {
      return NewArray<MyClass>(howmany);
    }
    

    This generates the following code:

      push edi
      mov  edi, [esp+8] ; howmany
      cmp  edi, 4194303 ; overflow?
      ja   overflow
    
      mov  eax, edi
      shl  eax, 10
      push esi
      push eax
      call operator new
      mov  esi, eax
      test esi, esi
      pop  ecx
      je   failed
      push OFFSET MyClass::MyClass
      push edi
      push 1024
      push esi
      call    
      call `vector constructor iterator`
      mov  eax, esi
      jmp  done
    
    failed:
      xor  eax, eax
    done:
      pop  esi
      jmp  exit
    
    overflow:
      xor     eax, eax
    exit:
      pop  edi
      retd 4
    

    Notice the new code that checks for a possible integer multiplication overflow.

    But how could you get tricked into an overflow situation?

    The most common way of doing this is by reading the value out of a file or some other storage location. For example, if your code is parsing a file that has a section whose format is "length followed by data", somebody could intentionally put an overflow-inducing value into the "length" field, then get somebody else to try to load the file.

    This is particularly dangerous if the filetype is something that is generally considered "not dangerous", like a JPG.

Page 415 of 438 (4,378 items) «413414415416417»