Larry Osterman's WebLog

Confessions of an Old Fogey
Blog - Title

What's wrong with this code, part lucky 13

What's wrong with this code, part lucky 13

  • Comments 35
Today's example is a smidge long, I've stripped out everything I can possibly imagine stripping out to reduce size.

This is a very real world example that we recently hit - only the names have been changed to protect the innocent.

I've used the built-in C++ decorations for interfaces, but that was just to get this stuff to compile in a single source file, it's not related to the bug.

extern CLSID CLSID_FooDerived;
[
    object,
    uuid("0A0DDEDC-C422-4BB3-9869-4FED020B66C5"),
]
__interface IFooBase : IUnknown
{
    HRESULT FooBase();
};

class CFooBase: public IFooBase
{
    LONG _refCount;
    virtual ~CFooBase()
    {
        ASSERT(_refCount == 0);
    };
public:
    CFooBase() : _refCount(1) {};
    virtual HRESULT STDMETHODCALLTYPE QueryInterface(const IID& iid, void** ppUnk)
    {
        HRESULT hr=S_OK;
        *ppUnk = NULL;
        if (iid == IID_FooBase)
        {
            AddRef();
            *ppUnk = reinterpret_cast<void *>(this);
        }
        else if (iid == IID_IUnknown)
        {
            AddRef();
            *ppUnk = reinterpret_cast<void *>(this);
        }
        else
        {
            hr = E_NOINTERFACE;
        }
        return hr;
    }
    virtual ULONG STDMETHODCALLTYPE AddRef(void)
    {
        return InterlockedIncrement(&_refCount);
    }
    virtual ULONG STDMETHODCALLTYPE Release(void)
    {
        LONG refCount;
        refCount = InterlockedDecrement(&_refCount);
        if (refCount == 0)
        {
            delete this;
        }
        return refCount;

    }
    STDMETHOD(FooBase)(void);
};
class ATL_NO_VTABLE CFooDerived :
    public CComObjectRootEx<CComMultiThreadModel>,
    public CComCoClass<CFooDerived, &CLSID_FooDerived>,
    public CFooBase
{
    virtual ~CFooDerived();
    public:
    CFooDerived();
    DECLARE_NO_REGISTRY()
    BEGIN_COM_MAP(CFooDerived)
        COM_INTERFACE_ENTRY(IFooBase)
    END_COM_MAP()
    DECLARE_PROTECT_FINAL_CONSTRUCT()

};

OBJECT_ENTRY_AUTO(CLSID_FooDerived, CFooDerived)

 

As always, tomorrow I'll post the answers along with kudos and mea culpas.

Edit: Fixed missing return value in Release() - without it it doesn't compile.  Also added the addrefs - my stupid mistake.  mirobin gets major props for those ones.

  • Your casts in the QI are broken. They work so long as IUnknown : IFooBase are at the beginning of the vtable, but ... I wouldn't want to be the one depending on a pointer from that code.

    The casts should be something more along the lines of "(void *)(reinterpret_cast<IFooBase *>(this))" and "(void *)reinterpret_cast<IUnknown *>(this);

    You are also not properly incrementing the refcount on the object before you return it.
  • Isn't there going to be a problem with doing this:

    > *ppUnk = reinterpret_cast<void *>(this);

    for a class which is in a multiple-inheritance tree?

    I think you really want:

    > *ppUnk = static_cast<IFooBase*>(this);

  • Also, you should technically be checking ppUnk for NULL before assinging to *ppUnk... (but I doubt this was the answer you were seeking in relation to the core problem).
  • Oh - another problem. According to MSDN the InterlockedIncrement() in the AddRef() isn't going to do what you want on Win95, WinNT3.51 and earlier.
  • mirobin: I'm pretty sure you want static_cast<>(), not reinterpret_cast<>().

    reinterpret_cast<>() is generally considered 'evil'. (Or at least should be)
  • The first thing that strikes me is that you are not AddRefing on your QueryInterface, but I don't think thats the bug you were looking for. Also it seems like a bad idea to implement all the IUnknown methods in CFooBase because ATL has its own internal implementation of IUnknown which you are trampling on. Also in ATL the ref count starts at 0 and its up to the ATL library to give you that initial AddRef. Here you are assuming an initial ref count of 1, which has always been problematic for me in the past.
  • I think Aaron hit it. With reinterpret_cast you are always going to be casting to ATL's IUnknown impl (which will happen to occupy that slot because of the way ATL is structured), so your IUnknown impl will never be called.
  • Aaron: reinterpret_cast<> can certainly let you do some fun & evil things, but in this case its usage should be semantically the same as static_cast<>. Granted, I've never really pinned down any functional distinction between the two casts other than errors thrown by the compiler when compiling ...

    Lonnie: if that is indeed happening in this case, I believe you'll see the same behavior with static_cast<>.

    You are probably correct about the initial refcount, though as the factory code isn't being shown here we can't really tell for certain if its a problem.
  • Lonnie, the reinterpret_cast isn't the problem (although mirobin's comment in the beginning is also valid - for this case it's not a problem, but in general it is a good idea (especially if you're dealing with multiple inheretance)).

    But you're on the right track...

    Aaron, I'm only concerned with operating systems that are currently supported (which doesn't include any of the Win9x operating systems).
  • Re: reinterpret_cast<>() vs static_cast<>()

    In general, for simple types (non-multiple inheritance) reinterpret_cast<>() and static_cast<>() are indeed semantically the same. However, for complex types (which we're dealing with here) the difference is that:
    reinterpret_cast<A*>(b)
    is the same as:
    ((A*)(void*)b)

    which throws away any compiler known type information. static_cast<>(), however, may need to offset the pointer when doing the cast because of the difference in vtbl and local variable offsets between the various types.

    Looking back at the code, however, I think that in this case you're right - since the reinterpret_cast<>() occurs at a class which has single inheritance there will be no difference.

    However - that makes them the same as reinterpret_cast<void*>(this).

    Maybe the problem is more with the ATL code? I've never used ATL, but according to the spec BEGIN_COM_MAP() is going to overload InternalQueryInterface(). Does the base class' QueryInterface() ever get called or do we lose the IUnknown query ability?
  • So a quick look at MSDN seems to indicate that we want the com map to actually be something like:

    BEGIN_COM_MAP(CFooDerived)
    COM_INTERFACE_ENTRY(IFooBase)
    COM_INTERFACE_ENTRY_CHAIN(IFooBase)
    END_COM_MAP()

    or we lose the handling of interfaces in IFooBase::QueryInterface().
  • Aaron, that's a good idea (I'd not even realized that the COM_INTERFACE_ENTRY_CHAIN macro existed - that's cool) but you'll note that IFooBase doesn't have a COM map.

    You're on the right path with your previous comment, the problem is specific to ATL. It's not the BEGIN_COM_MAP that's the issue though. It's somewhere else in the sample.
  • I just noticed that the destructor for CFooBase isn't a virtual function, but I doubt that is near the core of the problem either.

    Re: casting ... Aaron, that makes a ton of sense (why can't books describe it like that? :)). I've tried for a few years to try to interpret docs to figure out what that stupid operator really does under the hood, but
    nothing described it nowhere as clear as you did.

    Armed with my new knowledge, I'd agree that it would be general better practice for static_cast to be used, but (as has already been stated) in this case the usage is harmless.

    The problem that Larry is trying to focus on seems to be the interaction of a base class implementing IUnknown while using ATL. I know I've read about something like this before, but I'll be damned if I can remember what it was.

    We know from Lonnie's post that setting the initial refcount to 1 instead of zero likely results in an object that never gets destroyed ... or does it? Because ATL provides its own implementation of IUnknown...

    I don't suppose that you get different "implementations" of IUnknown depending on how you retrieve the IUnknown *? (meaning that you have two separate ref counts, leading to all sorts of fun about when the object gets destroyed).

    I'm grasping at straws at this point. Multiple inheritance isn't my strong suit...
  • Mirobin: That's a great point about the multiple reference counts. (This is what I get for always rolling my own instead of using ATL).

    Looking at the docs, CComObjectRootEx seems to imply that it has it's own refcount. That's not a problem as long as you never use the IFooBase reference counts.

    That would imply that you really want is something really nasty like:

    virtual ULONG STDMETHODCALLTYPE AddRef(void)
    {
    IUnknown* p;
    if(SUCCESS(QueryInterface(IID_IUnknown, &p)) && (p != this))
    {
    return p->AddRef();
    }
    else
    return InterlockedIncrement(&_refCount);
    }

    and something similarly disgusting for Release().

  • Aaron, you are SO close on this one, I can taste it.

    One more hint: I pared out every line I could possibly pare out of the example. There's something that everyone has overlooked so far that's critical to the bug.
Page 1 of 3 (35 items) 123